* [for-next][PATCH 00/14] tracing: More updates for 6.6
@ 2023-08-24 2:18 Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 01/14] tracing/filters: Dynamically allocate filter_pred.regex Steven Rostedt
` (13 more replies)
0 siblings, 14 replies; 15+ messages in thread
From: Steven Rostedt @ 2023-08-24 2:18 UTC
To: linux-kernel; +Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
trace/for-next
Head SHA1: 8c96b70171584f38940eb2ba65b84eee38b549ba
Eric Vaughn (1):
tracing/user_events: Optimize safe list traversals
Sishuai Gong (1):
tracefs: Avoid changing i_mode to a temp value
Steven Rostedt (Google) (1):
tracefs: Remove kerneldoc from struct eventfs_file
Valentin Schneider (9):
tracing/filters: Dynamically allocate filter_pred.regex
tracing/filters: Enable filtering a cpumask field by another cpumask
tracing/filters: Enable filtering a scalar field by a cpumask
tracing/filters: Enable filtering the CPU common field by a cpumask
tracing/filters: Optimise cpumask vs cpumask filtering when user mask is a single CPU
tracing/filters: Optimise scalar vs cpumask filtering when the user mask is a single CPU
tracing/filters: Optimise CPU vs cpumask filtering when the user mask is a single CPU
tracing/filters: Further optimise scalar vs cpumask comparison
tracing/filters: Document cpumask filtering
Yue Haibing (1):
tracing: Remove unused function declarations
Zhang Zekun (1):
ftrace: Remove empty declaration ftrace_enable_daemon() and ftrace_disable_daemon()
----
Documentation/trace/events.rst | 14 ++
fs/tracefs/event_inode.c | 14 +-
fs/tracefs/inode.c | 6 +-
include/linux/ftrace.h | 5 -
include/linux/trace_events.h | 1 +
kernel/trace/trace.h | 2 -
kernel/trace/trace_events_filter.c | 302 +++++++++++++++++++++++++++++++++----
kernel/trace/trace_events_user.c | 15 +-
8 files changed, 312 insertions(+), 47 deletions(-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [for-next][PATCH 01/14] tracing/filters: Dynamically allocate filter_pred.regex
2023-08-24 2:18 [for-next][PATCH 00/14] tracing: More updates for 6.6 Steven Rostedt
@ 2023-08-24 2:18 ` Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 02/14] tracing/filters: Enable filtering a cpumask field by another cpumask Steven Rostedt
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2023-08-24 2:18 UTC
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
Every predicate allocation includes a MAX_FILTER_STR_VAL (256) char array
in the regex field, even if the predicate function does not use the field.
A later commit will introduce a dynamically allocated cpumask to struct
filter_pred, which will require a dedicated freeing function. Bite the
bullet and make filter_pred.regex dynamically allocated.
While at it, reorder the fields of filter_pred to fill in the byte
holes. The struct now fits on a single cacheline.
No change in behaviour intended.
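As an illustration of the hole-filling argument above, the following userspace sketch compares the two field orders. The types regex_standin, pred_old and pred_new are hypothetical stand-ins that only approximate the kernel structures, the "not" member is renamed "negate" here, and the 64-byte cacheline is an assumption that holds on common targets but is not guaranteed by the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct regex: a 256-byte pattern plus metadata. */
struct regex_standin {
	char pattern[256];
	int len;
	int field_len;
	int (*match)(char *str, struct regex_standin *r, int len);
};

/* Old order: embedded regex, 4-byte members interleaved with 8-byte ones. */
struct pred_old {
	int fn_num;
	uint64_t val;
	uint64_t val2;
	struct regex_standin regex;
	unsigned short *ops;
	void *field;
	int offset;
	int negate;
	int op;
};

/* New order: pointers first, then 64-bit values, then the 4-byte members. */
struct pred_new {
	struct regex_standin *regex;
	unsigned short *ops;
	void *field;
	uint64_t val;
	uint64_t val2;
	int fn_num;
	int offset;
	int negate;
	int op;
};

/* Returns 1 when the reordered layout fits an assumed 64-byte cacheline. */
int pred_new_fits_cacheline(void)
{
	return sizeof(struct pred_new) <= 64;
}

/* The reordered, pointer-based layout must be smaller than the old one. */
void check_layout(void)
{
	assert(sizeof(struct pred_new) < sizeof(struct pred_old));
}
```

Most of the saving in this sketch comes from replacing the embedded array with a pointer; the reordering removes the remaining padding between members.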
The kfree()'s were patched via Coccinelle:
@@
struct filter_pred *pred;
@@
-kfree(pred);
+free_predicate(pred);
Link: https://lkml.kernel.org/r/20230707172155.70873-2-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_filter.c | 64 ++++++++++++++++++------------
1 file changed, 39 insertions(+), 25 deletions(-)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 1dad64267878..91fc9990107f 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -70,15 +70,15 @@ enum filter_pred_fn {
};
struct filter_pred {
- enum filter_pred_fn fn_num;
- u64 val;
- u64 val2;
- struct regex regex;
+ struct regex *regex;
unsigned short *ops;
struct ftrace_event_field *field;
- int offset;
+ u64 val;
+ u64 val2;
+ enum filter_pred_fn fn_num;
+ int offset;
int not;
- int op;
+ int op;
};
/*
@@ -186,6 +186,14 @@ enum {
PROCESS_OR = 4,
};
+static void free_predicate(struct filter_pred *pred)
+{
+ if (pred) {
+ kfree(pred->regex);
+ kfree(pred);
+ }
+}
+
/*
* Without going into a formal proof, this explains the method that is used in
* parsing the logical expressions.
@@ -623,7 +631,7 @@ predicate_parse(const char *str, int nr_parens, int nr_preds,
kfree(inverts);
if (prog_stack) {
for (i = 0; prog_stack[i].pred; i++)
- kfree(prog_stack[i].pred);
+ free_predicate(prog_stack[i].pred);
kfree(prog_stack);
}
return ERR_PTR(ret);
@@ -750,7 +758,7 @@ static int filter_pred_string(struct filter_pred *pred, void *event)
char *addr = (char *)(event + pred->offset);
int cmp, match;
- cmp = pred->regex.match(addr, &pred->regex, pred->regex.field_len);
+ cmp = pred->regex->match(addr, pred->regex, pred->regex->field_len);
match = cmp ^ pred->not;
@@ -763,7 +771,7 @@ static __always_inline int filter_pchar(struct filter_pred *pred, char *str)
int len;
len = strlen(str) + 1; /* including tailing '\0' */
- cmp = pred->regex.match(str, &pred->regex, len);
+ cmp = pred->regex->match(str, pred->regex, len);
match = cmp ^ pred->not;
@@ -813,7 +821,7 @@ static int filter_pred_strloc(struct filter_pred *pred, void *event)
char *addr = (char *)(event + str_loc);
int cmp, match;
- cmp = pred->regex.match(addr, &pred->regex, str_len);
+ cmp = pred->regex->match(addr, pred->regex, str_len);
match = cmp ^ pred->not;
@@ -836,7 +844,7 @@ static int filter_pred_strrelloc(struct filter_pred *pred, void *event)
char *addr = (char *)(&item[1]) + str_loc;
int cmp, match;
- cmp = pred->regex.match(addr, &pred->regex, str_len);
+ cmp = pred->regex->match(addr, pred->regex, str_len);
match = cmp ^ pred->not;
@@ -874,7 +882,7 @@ static int filter_pred_comm(struct filter_pred *pred, void *event)
{
int cmp;
- cmp = pred->regex.match(current->comm, &pred->regex,
+ cmp = pred->regex->match(current->comm, pred->regex,
TASK_COMM_LEN);
return cmp ^ pred->not;
}
@@ -1004,7 +1012,7 @@ enum regex_type filter_parse_regex(char *buff, int len, char **search, int *not)
static void filter_build_regex(struct filter_pred *pred)
{
- struct regex *r = &pred->regex;
+ struct regex *r = pred->regex;
char *search;
enum regex_type type = MATCH_FULL;
@@ -1169,7 +1177,7 @@ static void free_prog(struct event_filter *filter)
return;
for (i = 0; prog[i].pred; i++)
- kfree(prog[i].pred);
+ free_predicate(prog[i].pred);
kfree(prog);
}
@@ -1553,9 +1561,12 @@ static int parse_pred(const char *str, void *data,
goto err_free;
}
- pred->regex.len = len;
- strncpy(pred->regex.pattern, str + s, len);
- pred->regex.pattern[len] = 0;
+ pred->regex = kzalloc(sizeof(*pred->regex), GFP_KERNEL);
+ if (!pred->regex)
+ goto err_mem;
+ pred->regex->len = len;
+ strncpy(pred->regex->pattern, str + s, len);
+ pred->regex->pattern[len] = 0;
/* This is either a string, or an integer */
} else if (str[i] == '\'' || str[i] == '"') {
@@ -1597,9 +1608,12 @@ static int parse_pred(const char *str, void *data,
goto err_free;
}
- pred->regex.len = len;
- strncpy(pred->regex.pattern, str + s, len);
- pred->regex.pattern[len] = 0;
+ pred->regex = kzalloc(sizeof(*pred->regex), GFP_KERNEL);
+ if (!pred->regex)
+ goto err_mem;
+ pred->regex->len = len;
+ strncpy(pred->regex->pattern, str + s, len);
+ pred->regex->pattern[len] = 0;
filter_build_regex(pred);
@@ -1608,7 +1622,7 @@ static int parse_pred(const char *str, void *data,
} else if (field->filter_type == FILTER_STATIC_STRING) {
pred->fn_num = FILTER_PRED_FN_STRING;
- pred->regex.field_len = field->size;
+ pred->regex->field_len = field->size;
} else if (field->filter_type == FILTER_DYN_STRING) {
pred->fn_num = FILTER_PRED_FN_STRLOC;
@@ -1691,10 +1705,10 @@ static int parse_pred(const char *str, void *data,
return i;
err_free:
- kfree(pred);
+ free_predicate(pred);
return -EINVAL;
err_mem:
- kfree(pred);
+ free_predicate(pred);
return -ENOMEM;
}
@@ -2287,8 +2301,8 @@ static int ftrace_function_set_filter_pred(struct filter_pred *pred,
return ret;
return __ftrace_function_set_filter(pred->op == OP_EQ,
- pred->regex.pattern,
- pred->regex.len,
+ pred->regex->pattern,
+ pred->regex->len,
data);
}
--
2.40.1
* [for-next][PATCH 02/14] tracing/filters: Enable filtering a cpumask field by another cpumask
2023-08-24 2:18 [for-next][PATCH 00/14] tracing: More updates for 6.6 Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 01/14] tracing/filters: Dynamically allocate filter_pred.regex Steven Rostedt
@ 2023-08-24 2:18 ` Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 03/14] tracing/filters: Enable filtering a scalar field by a cpumask Steven Rostedt
` (11 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2023-08-24 2:18 UTC
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
The recently introduced ipi_send_cpumask trace event contains a cpumask
field, but it currently cannot be used in filter expressions.
Make event filtering aware of cpumask fields, and allow these to be
filtered by a user-provided cpumask.
The user-provided cpumask is to be given in cpulist format and wrapped as:
"CPUS{$cpulist}". The use of curly braces instead of parentheses is to
prevent predicate_parse() from parsing the contents of CPUS{...} as a
full-fledged predicate subexpression.
This enables e.g.:
$ trace-cmd record -e 'ipi_send_cpumask' -f 'cpumask & CPUS{2,4,6,8-32}'
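The brace handling described above can be sketched in userspace C as follows. This is an illustrative approximation of the checks parse_pred() performs, not the kernel code itself: extract_cpulist() and the cpus_err values are hypothetical names, and the caller-provided output buffer replaces the kernel's temporary allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes, loosely mirroring the new FILT_ERR_* entries. */
enum cpus_err {
	CPUS_OK = 0,
	CPUS_NO_PREFIX,
	CPUS_MISSING_BRACE_OPEN,
	CPUS_MISSING_BRACE_CLOSE,
	CPUS_INVALID_CPULIST,
	CPUS_TOO_LONG,
};

/* Copy the text between "CPUS{" and "}" into out, NUL-terminated. */
enum cpus_err extract_cpulist(const char *str, char *out, size_t outlen)
{
	size_t i = 0, start, len;

	assert(str && out);

	if (strncmp(str, "CPUS", 4) != 0)
		return CPUS_NO_PREFIX;
	i += 4;

	if (str[i++] != '{')
		return CPUS_MISSING_BRACE_OPEN;
	start = i;

	/* Walk the cpulist until the closing brace or end of string. */
	while (str[i] && str[i] != '}')
		i++;
	if (str[i] != '}')
		return CPUS_MISSING_BRACE_CLOSE;

	/* An empty "CPUS{}" is rejected as an invalid cpulist. */
	if (i == start)
		return CPUS_INVALID_CPULIST;

	len = i - start;
	if (len + 1 > outlen)
		return CPUS_TOO_LONG;

	memcpy(out, str + start, len);
	out[len] = '\0';
	return CPUS_OK;
}
```

For the example filter above, the extracted string would be "2,4,6,8-32", which the kernel then hands to cpulist_parse().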
Link: https://lkml.kernel.org/r/20230707172155.70873-3-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
include/linux/trace_events.h | 1 +
kernel/trace/trace_events_filter.c | 97 +++++++++++++++++++++++++++++-
2 files changed, 96 insertions(+), 2 deletions(-)
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index c17623c78029..1600aeb8e1a3 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -808,6 +808,7 @@ enum {
FILTER_RDYN_STRING,
FILTER_PTR_STRING,
FILTER_TRACE_FN,
+ FILTER_CPUMASK,
FILTER_COMM,
FILTER_CPU,
FILTER_STACKTRACE,
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 91fc9990107f..cb1863dfa280 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -64,6 +64,7 @@ enum filter_pred_fn {
FILTER_PRED_FN_PCHAR_USER,
FILTER_PRED_FN_PCHAR,
FILTER_PRED_FN_CPU,
+ FILTER_PRED_FN_CPUMASK,
FILTER_PRED_FN_FUNCTION,
FILTER_PRED_FN_,
FILTER_PRED_TEST_VISITED,
@@ -71,6 +72,7 @@ enum filter_pred_fn {
struct filter_pred {
struct regex *regex;
+ struct cpumask *mask;
unsigned short *ops;
struct ftrace_event_field *field;
u64 val;
@@ -94,6 +96,8 @@ struct filter_pred {
C(TOO_MANY_OPEN, "Too many '('"), \
C(TOO_MANY_CLOSE, "Too few '('"), \
C(MISSING_QUOTE, "Missing matching quote"), \
+ C(MISSING_BRACE_OPEN, "Missing '{'"), \
+ C(MISSING_BRACE_CLOSE, "Missing '}'"), \
C(OPERAND_TOO_LONG, "Operand too long"), \
C(EXPECT_STRING, "Expecting string field"), \
C(EXPECT_DIGIT, "Expecting numeric field"), \
@@ -103,6 +107,7 @@ struct filter_pred {
C(BAD_SUBSYS_FILTER, "Couldn't find or set field in one of a subsystem's events"), \
C(TOO_MANY_PREDS, "Too many terms in predicate expression"), \
C(INVALID_FILTER, "Meaningless filter expression"), \
+ C(INVALID_CPULIST, "Invalid cpulist"), \
C(IP_FIELD_ONLY, "Only 'ip' field is supported for function trace"), \
C(INVALID_VALUE, "Invalid value (did you forget quotes)?"), \
C(NO_FUNCTION, "Function not found"), \
@@ -190,6 +195,7 @@ static void free_predicate(struct filter_pred *pred)
{
if (pred) {
kfree(pred->regex);
+ kfree(pred->mask);
kfree(pred);
}
}
@@ -877,6 +883,26 @@ static int filter_pred_cpu(struct filter_pred *pred, void *event)
}
}
+/* Filter predicate for cpumask field vs user-provided cpumask */
+static int filter_pred_cpumask(struct filter_pred *pred, void *event)
+{
+ u32 item = *(u32 *)(event + pred->offset);
+ int loc = item & 0xffff;
+ const struct cpumask *mask = (event + loc);
+ const struct cpumask *cmp = pred->mask;
+
+ switch (pred->op) {
+ case OP_EQ:
+ return cpumask_equal(mask, cmp);
+ case OP_NE:
+ return !cpumask_equal(mask, cmp);
+ case OP_BAND:
+ return cpumask_intersects(mask, cmp);
+ default:
+ return 0;
+ }
+}
+
/* Filter predicate for COMM. */
static int filter_pred_comm(struct filter_pred *pred, void *event)
{
@@ -1244,8 +1270,12 @@ static void filter_free_subsystem_filters(struct trace_subsystem_dir *dir,
int filter_assign_type(const char *type)
{
- if (strstr(type, "__data_loc") && strstr(type, "char"))
- return FILTER_DYN_STRING;
+ if (strstr(type, "__data_loc")) {
+ if (strstr(type, "char"))
+ return FILTER_DYN_STRING;
+ if (strstr(type, "cpumask_t"))
+ return FILTER_CPUMASK;
+ }
if (strstr(type, "__rel_loc") && strstr(type, "char"))
return FILTER_RDYN_STRING;
@@ -1357,6 +1387,8 @@ static int filter_pred_fn_call(struct filter_pred *pred, void *event)
return filter_pred_pchar(pred, event);
case FILTER_PRED_FN_CPU:
return filter_pred_cpu(pred, event);
+ case FILTER_PRED_FN_CPUMASK:
+ return filter_pred_cpumask(pred, event);
case FILTER_PRED_FN_FUNCTION:
return filter_pred_function(pred, event);
case FILTER_PRED_TEST_VISITED:
@@ -1568,6 +1600,67 @@ static int parse_pred(const char *str, void *data,
strncpy(pred->regex->pattern, str + s, len);
pred->regex->pattern[len] = 0;
+ } else if (!strncmp(str + i, "CPUS", 4)) {
+ unsigned int maskstart;
+ char *tmp;
+
+ switch (field->filter_type) {
+ case FILTER_CPUMASK:
+ break;
+ default:
+ parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
+ goto err_free;
+ }
+
+ switch (op) {
+ case OP_EQ:
+ case OP_NE:
+ case OP_BAND:
+ break;
+ default:
+ parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
+ goto err_free;
+ }
+
+ /* Skip CPUS */
+ i += 4;
+ if (str[i++] != '{') {
+ parse_error(pe, FILT_ERR_MISSING_BRACE_OPEN, pos + i);
+ goto err_free;
+ }
+ maskstart = i;
+
+ /* Walk the cpulist until closing } */
+ for (; str[i] && str[i] != '}'; i++);
+ if (str[i] != '}') {
+ parse_error(pe, FILT_ERR_MISSING_BRACE_CLOSE, pos + i);
+ goto err_free;
+ }
+
+ if (maskstart == i) {
+ parse_error(pe, FILT_ERR_INVALID_CPULIST, pos + i);
+ goto err_free;
+ }
+
+ /* Copy the cpulist between { and } */
+ tmp = kmalloc((i - maskstart) + 1, GFP_KERNEL);
+ strscpy(tmp, str + maskstart, (i - maskstart) + 1);
+
+ pred->mask = kzalloc(cpumask_size(), GFP_KERNEL);
+ if (!pred->mask)
+ goto err_mem;
+
+ /* Now parse it */
+ if (cpulist_parse(tmp, pred->mask)) {
+ parse_error(pe, FILT_ERR_INVALID_CPULIST, pos + i);
+ goto err_free;
+ }
+
+ /* Move along */
+ i++;
+ if (field->filter_type == FILTER_CPUMASK)
+ pred->fn_num = FILTER_PRED_FN_CPUMASK;
+
/* This is either a string, or an integer */
} else if (str[i] == '\'' || str[i] == '"') {
char q = str[i];
--
2.40.1
* [for-next][PATCH 03/14] tracing/filters: Enable filtering a scalar field by a cpumask
2023-08-24 2:18 [for-next][PATCH 00/14] tracing: More updates for 6.6 Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 01/14] tracing/filters: Dynamically allocate filter_pred.regex Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 02/14] tracing/filters: Enable filtering a cpumask field by another cpumask Steven Rostedt
@ 2023-08-24 2:18 ` Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 04/14] tracing/filters: Enable filtering the CPU common " Steven Rostedt
` (10 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2023-08-24 2:18 UTC
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
Several events use a scalar field to denote a CPU:
o sched_wakeup.target_cpu
o sched_migrate_task.orig_cpu,dest_cpu
o sched_move_numa.src_cpu,dst_cpu
o ipi_send_cpu.cpu
o ...
Filtering these currently requires using arithmetic comparison functions,
which can be tedious when dealing with interleaved SMT or NUMA CPU ids.
Allow these to be filtered by a user-provided cpumask, which enables e.g.:
$ trace-cmd record -e 'sched_wakeup' -f 'target_cpu & CPUS{2,4,6,8-32}'
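The intended semantics of the three operators on a scalar CPU field can be sketched in userspace C. This is a simplified model under stated assumptions: a uint64_t stands in for struct cpumask (so at most 64 CPUs), and mask_range() and scalar_vs_mask() are hypothetical names rather than kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum filter_op { OP_EQ, OP_NE, OP_BAND };

/* Build a mask with CPUs lo..hi (inclusive) set, like a cpulist range. */
uint64_t mask_range(unsigned int lo, unsigned int hi)
{
	uint64_t mask = 0;

	assert(lo <= hi && hi < 64);
	for (unsigned int cpu = lo; cpu <= hi; cpu++)
		mask |= UINT64_C(1) << cpu;
	return mask;
}

/* Compare a scalar CPU id against a user-provided mask. */
int scalar_vs_mask(enum filter_op op, unsigned int cpu, uint64_t mask)
{
	bool in_mask, single;

	/* Out-of-range CPU ids never match, whatever the operator. */
	if (cpu >= 64)
		return 0;

	in_mask = (mask >> cpu) & 1;
	single = mask != 0 && (mask & (mask - 1)) == 0;

	switch (op) {
	case OP_EQ:	/* the mask is exactly { cpu } */
		return in_mask && single;
	case OP_NE:	/* the mask is anything other than { cpu } */
		return !in_mask || !single;
	case OP_BAND:	/* cpu is a member of the mask */
		return in_mask;
	}
	return 0;
}
```

With the mask from the example above, 'target_cpu & CPUS{2,4,6,8-32}' matches a wakeup targeting CPU 4 but not one targeting CPU 3.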
Link: https://lkml.kernel.org/r/20230707172155.70873-4-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_filter.c | 92 ++++++++++++++++++++++++++----
1 file changed, 81 insertions(+), 11 deletions(-)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index cb1863dfa280..1e14f801685a 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -46,15 +46,19 @@ static const char * ops[] = { OPS };
enum filter_pred_fn {
FILTER_PRED_FN_NOP,
FILTER_PRED_FN_64,
+ FILTER_PRED_FN_64_CPUMASK,
FILTER_PRED_FN_S64,
FILTER_PRED_FN_U64,
FILTER_PRED_FN_32,
+ FILTER_PRED_FN_32_CPUMASK,
FILTER_PRED_FN_S32,
FILTER_PRED_FN_U32,
FILTER_PRED_FN_16,
+ FILTER_PRED_FN_16_CPUMASK,
FILTER_PRED_FN_S16,
FILTER_PRED_FN_U16,
FILTER_PRED_FN_8,
+ FILTER_PRED_FN_8_CPUMASK,
FILTER_PRED_FN_S8,
FILTER_PRED_FN_U8,
FILTER_PRED_FN_COMM,
@@ -643,6 +647,39 @@ predicate_parse(const char *str, int nr_parens, int nr_preds,
return ERR_PTR(ret);
}
+static inline int
+do_filter_cpumask(int op, const struct cpumask *mask, const struct cpumask *cmp)
+{
+ switch (op) {
+ case OP_EQ:
+ return cpumask_equal(mask, cmp);
+ case OP_NE:
+ return !cpumask_equal(mask, cmp);
+ case OP_BAND:
+ return cpumask_intersects(mask, cmp);
+ default:
+ return 0;
+ }
+}
+
+/* Optimisation of do_filter_cpumask() for scalar fields */
+static inline int
+do_filter_scalar_cpumask(int op, unsigned int cpu, const struct cpumask *mask)
+{
+ switch (op) {
+ case OP_EQ:
+ return cpumask_test_cpu(cpu, mask) &&
+ cpumask_nth(1, mask) >= nr_cpu_ids;
+ case OP_NE:
+ return !cpumask_test_cpu(cpu, mask) ||
+ cpumask_nth(1, mask) < nr_cpu_ids;
+ case OP_BAND:
+ return cpumask_test_cpu(cpu, mask);
+ default:
+ return 0;
+ }
+}
+
enum pred_cmp_types {
PRED_CMP_TYPE_NOP,
PRED_CMP_TYPE_LT,
@@ -686,6 +723,18 @@ static int filter_pred_##type(struct filter_pred *pred, void *event) \
} \
}
+#define DEFINE_CPUMASK_COMPARISON_PRED(size) \
+static int filter_pred_##size##_cpumask(struct filter_pred *pred, void *event) \
+{ \
+ u##size *addr = (u##size *)(event + pred->offset); \
+ unsigned int cpu = *addr; \
+ \
+ if (cpu >= nr_cpu_ids) \
+ return 0; \
+ \
+ return do_filter_scalar_cpumask(pred->op, cpu, pred->mask); \
+}
+
#define DEFINE_EQUALITY_PRED(size) \
static int filter_pred_##size(struct filter_pred *pred, void *event) \
{ \
@@ -707,6 +756,11 @@ DEFINE_COMPARISON_PRED(u16);
DEFINE_COMPARISON_PRED(s8);
DEFINE_COMPARISON_PRED(u8);
+DEFINE_CPUMASK_COMPARISON_PRED(64);
+DEFINE_CPUMASK_COMPARISON_PRED(32);
+DEFINE_CPUMASK_COMPARISON_PRED(16);
+DEFINE_CPUMASK_COMPARISON_PRED(8);
+
DEFINE_EQUALITY_PRED(64);
DEFINE_EQUALITY_PRED(32);
DEFINE_EQUALITY_PRED(16);
@@ -891,16 +945,7 @@ static int filter_pred_cpumask(struct filter_pred *pred, void *event)
const struct cpumask *mask = (event + loc);
const struct cpumask *cmp = pred->mask;
- switch (pred->op) {
- case OP_EQ:
- return cpumask_equal(mask, cmp);
- case OP_NE:
- return !cpumask_equal(mask, cmp);
- case OP_BAND:
- return cpumask_intersects(mask, cmp);
- default:
- return 0;
- }
+ return do_filter_cpumask(pred->op, mask, cmp);
}
/* Filter predicate for COMM. */
@@ -1351,24 +1396,32 @@ static int filter_pred_fn_call(struct filter_pred *pred, void *event)
switch (pred->fn_num) {
case FILTER_PRED_FN_64:
return filter_pred_64(pred, event);
+ case FILTER_PRED_FN_64_CPUMASK:
+ return filter_pred_64_cpumask(pred, event);
case FILTER_PRED_FN_S64:
return filter_pred_s64(pred, event);
case FILTER_PRED_FN_U64:
return filter_pred_u64(pred, event);
case FILTER_PRED_FN_32:
return filter_pred_32(pred, event);
+ case FILTER_PRED_FN_32_CPUMASK:
+ return filter_pred_32_cpumask(pred, event);
case FILTER_PRED_FN_S32:
return filter_pred_s32(pred, event);
case FILTER_PRED_FN_U32:
return filter_pred_u32(pred, event);
case FILTER_PRED_FN_16:
return filter_pred_16(pred, event);
+ case FILTER_PRED_FN_16_CPUMASK:
+ return filter_pred_16_cpumask(pred, event);
case FILTER_PRED_FN_S16:
return filter_pred_s16(pred, event);
case FILTER_PRED_FN_U16:
return filter_pred_u16(pred, event);
case FILTER_PRED_FN_8:
return filter_pred_8(pred, event);
+ case FILTER_PRED_FN_8_CPUMASK:
+ return filter_pred_8_cpumask(pred, event);
case FILTER_PRED_FN_S8:
return filter_pred_s8(pred, event);
case FILTER_PRED_FN_U8:
@@ -1606,6 +1659,7 @@ static int parse_pred(const char *str, void *data,
switch (field->filter_type) {
case FILTER_CPUMASK:
+ case FILTER_OTHER:
break;
default:
parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
@@ -1658,8 +1712,24 @@ static int parse_pred(const char *str, void *data,
/* Move along */
i++;
- if (field->filter_type == FILTER_CPUMASK)
+ if (field->filter_type == FILTER_CPUMASK) {
pred->fn_num = FILTER_PRED_FN_CPUMASK;
+ } else {
+ switch (field->size) {
+ case 8:
+ pred->fn_num = FILTER_PRED_FN_64_CPUMASK;
+ break;
+ case 4:
+ pred->fn_num = FILTER_PRED_FN_32_CPUMASK;
+ break;
+ case 2:
+ pred->fn_num = FILTER_PRED_FN_16_CPUMASK;
+ break;
+ case 1:
+ pred->fn_num = FILTER_PRED_FN_8_CPUMASK;
+ break;
+ }
+ }
/* This is either a string, or an integer */
} else if (str[i] == '\'' || str[i] == '"') {
--
2.40.1
* [for-next][PATCH 04/14] tracing/filters: Enable filtering the CPU common field by a cpumask
2023-08-24 2:18 [for-next][PATCH 00/14] tracing: More updates for 6.6 Steven Rostedt
` (2 preceding siblings ...)
2023-08-24 2:18 ` [for-next][PATCH 03/14] tracing/filters: Enable filtering a scalar field by a cpumask Steven Rostedt
@ 2023-08-24 2:18 ` Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 05/14] tracing/filters: Optimise cpumask vs cpumask filtering when user mask is a single CPU Steven Rostedt
` (9 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2023-08-24 2:18 UTC
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
The tracing_cpumask lets us specify which CPUs are traced in a buffer
instance, but doesn't let us do this on a per-event basis (unless one
creates an instance per event).
A previous commit added filtering of scalar fields by a user-given cpumask;
make this work with the CPU common field as well.
This enables doing things like:
$ trace-cmd record -e 'sched_switch' -f 'CPU & CPUS{12-52}' \
-e 'sched_wakeup' -f 'target_cpu & CPUS{12-52}'
Link: https://lkml.kernel.org/r/20230707172155.70873-5-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_filter.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 1e14f801685a..3009d0c61b53 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -68,6 +68,7 @@ enum filter_pred_fn {
FILTER_PRED_FN_PCHAR_USER,
FILTER_PRED_FN_PCHAR,
FILTER_PRED_FN_CPU,
+ FILTER_PRED_FN_CPU_CPUMASK,
FILTER_PRED_FN_CPUMASK,
FILTER_PRED_FN_FUNCTION,
FILTER_PRED_FN_,
@@ -937,6 +938,14 @@ static int filter_pred_cpu(struct filter_pred *pred, void *event)
}
}
+/* Filter predicate for current CPU vs user-provided cpumask */
+static int filter_pred_cpu_cpumask(struct filter_pred *pred, void *event)
+{
+ int cpu = raw_smp_processor_id();
+
+ return do_filter_scalar_cpumask(pred->op, cpu, pred->mask);
+}
+
/* Filter predicate for cpumask field vs user-provided cpumask */
static int filter_pred_cpumask(struct filter_pred *pred, void *event)
{
@@ -1440,6 +1449,8 @@ static int filter_pred_fn_call(struct filter_pred *pred, void *event)
return filter_pred_pchar(pred, event);
case FILTER_PRED_FN_CPU:
return filter_pred_cpu(pred, event);
+ case FILTER_PRED_FN_CPU_CPUMASK:
+ return filter_pred_cpu_cpumask(pred, event);
case FILTER_PRED_FN_CPUMASK:
return filter_pred_cpumask(pred, event);
case FILTER_PRED_FN_FUNCTION:
@@ -1659,6 +1670,7 @@ static int parse_pred(const char *str, void *data,
switch (field->filter_type) {
case FILTER_CPUMASK:
+ case FILTER_CPU:
case FILTER_OTHER:
break;
default:
@@ -1714,6 +1726,8 @@ static int parse_pred(const char *str, void *data,
i++;
if (field->filter_type == FILTER_CPUMASK) {
pred->fn_num = FILTER_PRED_FN_CPUMASK;
+ } else if (field->filter_type == FILTER_CPU) {
+ pred->fn_num = FILTER_PRED_FN_CPU_CPUMASK;
} else {
switch (field->size) {
case 8:
--
2.40.1
* [for-next][PATCH 05/14] tracing/filters: Optimise cpumask vs cpumask filtering when user mask is a single CPU
2023-08-24 2:18 [for-next][PATCH 00/14] tracing: More updates for 6.6 Steven Rostedt
` (3 preceding siblings ...)
2023-08-24 2:18 ` [for-next][PATCH 04/14] tracing/filters: Enable filtering the CPU common " Steven Rostedt
@ 2023-08-24 2:18 ` Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 06/14] tracing/filters: Optimise scalar vs cpumask filtering when the " Steven Rostedt
` (8 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2023-08-24 2:18 UTC
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
Steven noted that when the user-provided cpumask contains a single CPU,
the filtering function can use a scalar as input instead of a
full-fledged cpumask.
Reuse do_filter_scalar_cpumask() when the input mask has a weight of one.
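The weight-of-one test can be sketched in userspace C. As a simplifying assumption, a uint64_t stands in for struct cpumask here, and mask_is_single(), mask_first() and mask_to_scalar() are hypothetical helpers that play the role of cpumask_weight() and cpumask_first() in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit is set, i.e. the mask has a weight of one. */
bool mask_is_single(uint64_t mask)
{
	return mask != 0 && (mask & (mask - 1)) == 0;
}

/* Index of the lowest set bit; the mask must not be empty. */
unsigned int mask_first(uint64_t mask)
{
	unsigned int cpu = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/*
 * If the mask holds a single CPU, store it in *cpu and return true so the
 * caller can take the scalar path; otherwise leave *cpu untouched.
 */
bool mask_to_scalar(uint64_t mask, unsigned int *cpu)
{
	if (!mask_is_single(mask))
		return false;

	*cpu = mask_first(mask);
	return true;
}
```

A caller would try mask_to_scalar() once at parse time and pick the cheaper predicate when it succeeds, which is the shape of the change in the diff below.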
Link: https://lkml.kernel.org/r/20230707172155.70873-6-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_filter.c | 35 +++++++++++++++++++++++++++++-
1 file changed, 34 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 3009d0c61b53..2fe65ddeb34e 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -70,6 +70,7 @@ enum filter_pred_fn {
FILTER_PRED_FN_CPU,
FILTER_PRED_FN_CPU_CPUMASK,
FILTER_PRED_FN_CPUMASK,
+ FILTER_PRED_FN_CPUMASK_CPU,
FILTER_PRED_FN_FUNCTION,
FILTER_PRED_FN_,
FILTER_PRED_TEST_VISITED,
@@ -957,6 +958,22 @@ static int filter_pred_cpumask(struct filter_pred *pred, void *event)
return do_filter_cpumask(pred->op, mask, cmp);
}
+/* Filter predicate for cpumask field vs user-provided scalar */
+static int filter_pred_cpumask_cpu(struct filter_pred *pred, void *event)
+{
+ u32 item = *(u32 *)(event + pred->offset);
+ int loc = item & 0xffff;
+ const struct cpumask *mask = (event + loc);
+ unsigned int cpu = pred->val;
+
+ /*
+ * This inverts the usual usage of the function (field is first element,
+ * user parameter is second), but that's fine because the (scalar, mask)
+ * operations used are symmetric.
+ */
+ return do_filter_scalar_cpumask(pred->op, cpu, mask);
+}
+
/* Filter predicate for COMM. */
static int filter_pred_comm(struct filter_pred *pred, void *event)
{
@@ -1453,6 +1470,8 @@ static int filter_pred_fn_call(struct filter_pred *pred, void *event)
return filter_pred_cpu_cpumask(pred, event);
case FILTER_PRED_FN_CPUMASK:
return filter_pred_cpumask(pred, event);
+ case FILTER_PRED_FN_CPUMASK_CPU:
+ return filter_pred_cpumask_cpu(pred, event);
case FILTER_PRED_FN_FUNCTION:
return filter_pred_function(pred, event);
case FILTER_PRED_TEST_VISITED:
@@ -1666,6 +1685,7 @@ static int parse_pred(const char *str, void *data,
} else if (!strncmp(str + i, "CPUS", 4)) {
unsigned int maskstart;
+ bool single;
char *tmp;
switch (field->filter_type) {
@@ -1724,8 +1744,21 @@ static int parse_pred(const char *str, void *data,
/* Move along */
i++;
+
+ /*
+ * Optimisation: if the user-provided mask has a weight of one
+ * then we can treat it as a scalar input.
+ */
+ single = cpumask_weight(pred->mask) == 1;
+ if (single && field->filter_type == FILTER_CPUMASK) {
+ pred->val = cpumask_first(pred->mask);
+ kfree(pred->mask);
+ }
+
if (field->filter_type == FILTER_CPUMASK) {
- pred->fn_num = FILTER_PRED_FN_CPUMASK;
+ pred->fn_num = single ?
+ FILTER_PRED_FN_CPUMASK_CPU :
+ FILTER_PRED_FN_CPUMASK;
} else if (field->filter_type == FILTER_CPU) {
pred->fn_num = FILTER_PRED_FN_CPU_CPUMASK;
} else {
--
2.40.1
* [for-next][PATCH 06/14] tracing/filters: Optimise scalar vs cpumask filtering when the user mask is a single CPU
2023-08-24 2:18 [for-next][PATCH 00/14] tracing: More updates for 6.6 Steven Rostedt
` (4 preceding siblings ...)
2023-08-24 2:18 ` [for-next][PATCH 05/14] tracing/filters: Optimise cpumask vs cpumask filtering when user mask is a single CPU Steven Rostedt
@ 2023-08-24 2:18 ` Steven Rostedt
2023-08-24 2:18 ` [for-next][PATCH 07/14] tracing/filters: Optimise CPU " Steven Rostedt
` (7 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2023-08-24 2:18 UTC
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
Steven noted that when the user-provided cpumask contains a single CPU,
the filtering function can use a scalar as input instead of a
full-fledged cpumask.
When the mask contains a single CPU, directly re-use the unsigned field
predicate functions. Transform '&' into '==' beforehand.
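The operator rewrite can be sketched in userspace C. This is a model of the idea rather than the kernel code: scalar_pred, rewrite_single_cpu() and eval_scalar_pred() are hypothetical names, the kernel's "not" member is called "negate" here, and the evaluation assumes an equality predicate whose result is XORed with the negate flag.

```c
#include <assert.h>
#include <stdint.h>

enum filter_op { OP_EQ, OP_NE, OP_BAND };

/* Hypothetical, reduced predicate: operator, comparison value, negate flag. */
struct scalar_pred {
	enum filter_op op;
	uint64_t val;
	int negate;
};

/* Rewrite "field <op> CPUS{cpu}" into a plain scalar comparison. */
struct scalar_pred rewrite_single_cpu(enum filter_op op, unsigned int cpu)
{
	struct scalar_pred pred;

	/* Against a one-CPU mask, membership ('&') is the same as equality. */
	pred.op = (op == OP_BAND) ? OP_EQ : op;
	pred.val = cpu;
	/* '!=' reuses the equality test with the result inverted. */
	pred.negate = (pred.op == OP_NE);
	return pred;
}

/* Equality predicate: match when the field equals val, then apply negate. */
int eval_scalar_pred(const struct scalar_pred *pred, uint64_t field)
{
	assert(pred);
	return (field == pred->val) ^ pred->negate;
}
```

Under these assumptions, 'target_cpu & CPUS{3}' and 'target_cpu == 3' evaluate identically, and 'target_cpu != CPUS{3}' matches every CPU except 3.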
Link: https://lkml.kernel.org/r/20230707172155.70873-7-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_filter.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 2fe65ddeb34e..54d642fabb7f 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1750,7 +1750,7 @@ static int parse_pred(const char *str, void *data,
* then we can treat it as a scalar input.
*/
single = cpumask_weight(pred->mask) == 1;
- if (single && field->filter_type == FILTER_CPUMASK) {
+ if (single && field->filter_type != FILTER_CPU) {
pred->val = cpumask_first(pred->mask);
kfree(pred->mask);
}
@@ -1761,6 +1761,11 @@ static int parse_pred(const char *str, void *data,
FILTER_PRED_FN_CPUMASK;
} else if (field->filter_type == FILTER_CPU) {
pred->fn_num = FILTER_PRED_FN_CPU_CPUMASK;
+ } else if (single) {
+ pred->op = pred->op == OP_BAND ? OP_EQ : pred->op;
+ pred->fn_num = select_comparison_fn(pred->op, field->size, false);
+ if (pred->op == OP_NE)
+ pred->not = 1;
} else {
switch (field->size) {
case 8:
--
2.40.1
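As an illustration of the weight-one rewrite described above (not part of the series), here is a minimal userspace sketch. It assumes at most 64 CPUs so that a uint64_t can stand in for struct cpumask; the enum and the function names scalar_vs_mask(), scalar_vs_single() and first_cpu() are hypothetical and do not exist in the kernel. It shows why '&' can be turned into '==' once the user mask holds exactly one CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's filter operators. */
enum op { OP_EQ, OP_NE, OP_BAND };

/* Assumption: a 64-bit word models struct cpumask (at most 64 CPUs). */
typedef uint64_t mask_t;

/* Reference semantics: compare a scalar CPU number against a full mask. */
static bool scalar_vs_mask(enum op op, unsigned int cpu, mask_t mask)
{
	mask_t bit;

	assert(cpu < 64);
	bit = (mask_t)1 << cpu;

	switch (op) {
	case OP_EQ:
		return mask == bit;
	case OP_NE:
		return mask != bit;
	case OP_BAND:
		return (mask & bit) != 0;
	}
	return false;
}

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static unsigned int first_cpu(mask_t mask)
{
	unsigned int cpu = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* Weight-one path: reduce the mask to its only CPU and compare scalars. */
static bool scalar_vs_single(enum op op, unsigned int cpu, mask_t mask)
{
	unsigned int val;

	/* The mask must have exactly one bit set. */
	assert(mask != 0 && (mask & (mask - 1)) == 0);
	val = first_cpu(mask);

	if (op == OP_BAND)
		op = OP_EQ;	/* '&' against one CPU is the same test as '==' */

	return op == OP_EQ ? cpu == val : cpu != val;
}
```

For every single-bit mask and every CPU number below 64, both functions return the same result for all three operators, which is the property the optimisation relies on.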
* [for-next][PATCH 07/14] tracing/filters: Optimise CPU vs cpumask filtering when the user mask is a single CPU
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
Steven noted that when the user-provided cpumask contains a single CPU,
then the filtering function can use a scalar as input instead of a
full-fledged cpumask.
In this case we can directly re-use filter_pred_cpu(); we just need to
transform '&' into '==' before executing it.
Link: https://lkml.kernel.org/r/20230707172155.70873-8-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_filter.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 54d642fabb7f..fd72dacc5d1b 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1750,7 +1750,7 @@ static int parse_pred(const char *str, void *data,
* then we can treat it as a scalar input.
*/
single = cpumask_weight(pred->mask) == 1;
- if (single && field->filter_type != FILTER_CPU) {
+ if (single) {
pred->val = cpumask_first(pred->mask);
kfree(pred->mask);
}
@@ -1760,7 +1760,12 @@ static int parse_pred(const char *str, void *data,
FILTER_PRED_FN_CPUMASK_CPU :
FILTER_PRED_FN_CPUMASK;
} else if (field->filter_type == FILTER_CPU) {
- pred->fn_num = FILTER_PRED_FN_CPU_CPUMASK;
+ if (single) {
+ pred->op = pred->op == OP_BAND ? OP_EQ : pred->op;
+ pred->fn_num = FILTER_PRED_FN_CPU;
+ } else {
+ pred->fn_num = FILTER_PRED_FN_CPU_CPUMASK;
+ }
} else if (single) {
pred->op = pred->op == OP_BAND ? OP_EQ : pred->op;
pred->fn_num = select_comparison_fn(pred->op, field->size, false);
--
2.40.1
* [for-next][PATCH 08/14] tracing/filters: Further optimise scalar vs cpumask comparison
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
Per the previous commits, we now only enter do_filter_scalar_cpumask() with
a mask of weight greater than one. Optimise the equality checks.
Link: https://lkml.kernel.org/r/20230707172155.70873-9-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_filter.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index fd72dacc5d1b..3a529214a21b 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -667,6 +667,25 @@ do_filter_cpumask(int op, const struct cpumask *mask, const struct cpumask *cmp)
/* Optimisation of do_filter_cpumask() for scalar fields */
static inline int
do_filter_scalar_cpumask(int op, unsigned int cpu, const struct cpumask *mask)
+{
+ /*
+ * Per the weight-of-one cpumask optimisations, the mask passed in this
+ * function has a weight >= 2, so it is never equal to a single scalar.
+ */
+ switch (op) {
+ case OP_EQ:
+ return false;
+ case OP_NE:
+ return true;
+ case OP_BAND:
+ return cpumask_test_cpu(cpu, mask);
+ default:
+ return 0;
+ }
+}
+
+static inline int
+do_filter_cpumask_scalar(int op, const struct cpumask *mask, unsigned int cpu)
{
switch (op) {
case OP_EQ:
@@ -966,12 +985,7 @@ static int filter_pred_cpumask_cpu(struct filter_pred *pred, void *event)
const struct cpumask *mask = (event + loc);
unsigned int cpu = pred->val;
- /*
- * This inverts the usual usage of the function (field is first element,
- * user parameter is second), but that's fine because the (scalar, mask)
- * operations used are symmetric.
- */
- return do_filter_scalar_cpumask(pred->op, cpu, mask);
+ return do_filter_cpumask_scalar(pred->op, mask, cpu);
}
/* Filter predicate for COMM. */
--
2.40.1
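To make the reasoning behind the patch above concrete, here is a small userspace sketch (not kernel code). It assumes a uint64_t stands in for struct cpumask; the names weight() and scalar_vs_multi() are hypothetical. Once single-CPU masks are handled elsewhere, a mask of weight two or more can never equal a single CPU, so only the intersection test needs to look at the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's filter operators. */
enum op { OP_EQ, OP_NE, OP_BAND };

/* Assumption: a 64-bit word models struct cpumask (at most 64 CPUs). */
typedef uint64_t mask_t;

/* Number of set bits in the mask. */
static unsigned int weight(mask_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Scalar vs mask comparison when the mask is known to hold >= 2 CPUs. */
static bool scalar_vs_multi(enum op op, unsigned int cpu, mask_t mask)
{
	assert(cpu < 64 && weight(mask) >= 2);

	switch (op) {
	case OP_EQ:
		return false;	/* two or more CPUs never equal one CPU */
	case OP_NE:
		return true;
	case OP_BAND:
		return ((mask >> cpu) & 1) != 0;
	}
	return false;
}
```

With a mask holding CPUs 1 and 3, OP_EQ is false and OP_NE is true for any CPU, while OP_BAND is true only for CPUs 1 and 3.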
* [for-next][PATCH 09/14] tracing/filters: Document cpumask filtering
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Jonathan Corbet,
Juri Lelli, Daniel Bristot de Oliveira, Marcelo Tosatti,
Leonardo Bras, Frederic Weisbecker, Valentin Schneider
From: Valentin Schneider <vschneid@redhat.com>
Cpumask, scalar and CPU fields can now be filtered by a user-provided
cpumask; document the syntax.
Link: https://lkml.kernel.org/r/20230707172155.70873-10-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Documentation/trace/events.rst | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index f5fcb8e1218f..34108d5a55b4 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -219,6 +219,20 @@ the function "security_prepare_creds" and less than the end of that function.
The ".function" postfix can only be attached to values of size long, and can only
be compared with "==" or "!=".
+Cpumask fields or scalar fields that encode a CPU number can be filtered using
+a user-provided cpumask in cpulist format. The format is as follows::
+
+ CPUS{$cpulist}
+
+Operators available to cpumask filtering are:
+
+& (intersection), ==, !=
+
+For example, this will filter events that have their .target_cpu field present
+in the given cpumask::
+
+ target_cpu & CPUS{17-42}
+
5.2 Setting filters
-------------------
--
2.40.1
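For readers unfamiliar with the cpulist format used inside CPUS{...}, here is a rough userspace sketch of how a list such as "17-42" or "0-3,8" maps to a set of CPUs and how the '&' operator then tests a scalar field against it. This is not the kernel's parser: parse_cpulist() and filter_band() are hypothetical names, and the 64-CPU limit is an assumption made so that a uint64_t can hold the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a cpulist such as "0-3,8" into a 64-bit mask.
 * Returns false on input it cannot parse or on CPU numbers >= 64.
 */
static bool parse_cpulist(const char *s, uint64_t *mask)
{
	*mask = 0;

	while (*s) {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return false;
		lo = hi = strtoul(s, &end, 10);

		/* An optional "-N" turns the single CPU into a range. */
		if (*end == '-') {
			s = end + 1;
			if (*s < '0' || *s > '9')
				return false;
			hi = strtoul(s, &end, 10);
		}

		if (lo > hi || hi >= 64)
			return false;
		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			*mask |= (uint64_t)1 << cpu;

		if (*end == ',')
			end++;
		else if (*end != '\0')
			return false;
		s = end;
	}
	return true;
}

/* "field & CPUS{...}": true when the field's CPU is present in the mask. */
static bool filter_band(unsigned int cpu, uint64_t mask)
{
	assert(cpu < 64);
	return ((mask >> cpu) & 1) != 0;
}
```

Under these assumptions, the documented filter "target_cpu & CPUS{17-42}" corresponds to calling parse_cpulist("17-42", &mask) once and then filter_band(target_cpu, mask) for each event.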
* [for-next][PATCH 10/14] tracing: Remove unused function declarations
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel; +Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Yue Haibing
From: Yue Haibing <yuehaibing@huawei.com>
Commit 9457158bbc0e ("tracing: Fix reset of time stamps during trace_clock changes")
left behind the tracing_reset_current() declaration.
Also, commit 6954e415264e ("tracing: Place trace_pid_list logic into abstract functions")
removed the trace_free_pid_list() implementation but left its declaration.
Link: https://lore.kernel.org/linux-trace-kernel/20230803144028.25492-1-yuehaibing@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 5b1f9e24764a..b6e44a39b4ce 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -604,7 +604,6 @@ trace_buffer_iter(struct trace_iterator *iter, int cpu)
int tracer_init(struct tracer *t, struct trace_array *tr);
int tracing_is_enabled(void);
void tracing_reset_online_cpus(struct array_buffer *buf);
-void tracing_reset_current(int cpu);
void tracing_reset_all_online_cpus(void);
void tracing_reset_all_online_cpus_unlocked(void);
int tracing_open_generic(struct inode *inode, struct file *filp);
@@ -705,7 +704,6 @@ void trace_filter_add_remove_task(struct trace_pid_list *pid_list,
void *trace_pid_next(struct trace_pid_list *pid_list, void *v, loff_t *pos);
void *trace_pid_start(struct trace_pid_list *pid_list, loff_t *pos);
int trace_pid_show(struct seq_file *m, void *v);
-void trace_free_pid_list(struct trace_pid_list *pid_list);
int trace_pid_write(struct trace_pid_list *filtered_pids,
struct trace_pid_list **new_pid_list,
const char __user *ubuf, size_t cnt);
--
2.40.1
* [for-next][PATCH 11/14] ftrace: Remove empty declaration ftrace_enable_daemon() and ftrace_disable_daemon()
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel; +Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Zhang Zekun
From: Zhang Zekun <zhangzekun11@huawei.com>
The definitions of ftrace_enable_daemon() and ftrace_disable_daemon() were
removed by commit cb7be3b2fc2c ("ftrace: remove daemon"), but the
declarations remain in the header file, so remove them.
Link: https://lore.kernel.org/linux-trace-kernel/20230804013636.115940-1-zhangzekun11@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <mark.rutland@arm.com>
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
include/linux/ftrace.h | 5 -----
1 file changed, 5 deletions(-)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index aad9cf8876b5..e8921871ef9a 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -862,13 +862,8 @@ extern int skip_trace(unsigned long ip);
extern void ftrace_module_init(struct module *mod);
extern void ftrace_module_enable(struct module *mod);
extern void ftrace_release_mod(struct module *mod);
-
-extern void ftrace_disable_daemon(void);
-extern void ftrace_enable_daemon(void);
#else /* CONFIG_DYNAMIC_FTRACE */
static inline int skip_trace(unsigned long ip) { return 0; }
-static inline void ftrace_disable_daemon(void) { }
-static inline void ftrace_enable_daemon(void) { }
static inline void ftrace_module_init(struct module *mod) { }
static inline void ftrace_module_enable(struct module *mod) { }
static inline void ftrace_release_mod(struct module *mod) { }
--
2.40.1
* [for-next][PATCH 12/14] tracing/user_events: Optimize safe list traversals
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Beau Belgrave,
Eric Vaughn
From: Eric Vaughn <ervaughn@linux.microsoft.com>
Several of the list traversals in the user_events facility use safe list
traversals where they could be using the unsafe versions instead.
Replace these safe traversals with their unsafe counterparts in the
interest of optimization.
Link: https://lore.kernel.org/linux-trace-kernel/20230810194337.695983-1-ervaughn@linux.microsoft.com
Suggested-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Eric Vaughn <ervaughn@linux.microsoft.com>
Acked-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
kernel/trace/trace_events_user.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 33cb6af31f39..6f046650e527 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1328,14 +1328,14 @@ static int user_field_set_string(struct ftrace_event_field *field,
static int user_event_set_print_fmt(struct user_event *user, char *buf, int len)
{
- struct ftrace_event_field *field, *next;
+ struct ftrace_event_field *field;
struct list_head *head = &user->fields;
int pos = 0, depth = 0;
const char *str_func;
pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
- list_for_each_entry_safe_reverse(field, next, head, link) {
+ list_for_each_entry_reverse(field, head, link) {
if (depth != 0)
pos += snprintf(buf + pos, LEN_OR_ZERO, " ");
@@ -1347,7 +1347,7 @@ static int user_event_set_print_fmt(struct user_event *user, char *buf, int len)
pos += snprintf(buf + pos, LEN_OR_ZERO, "\"");
- list_for_each_entry_safe_reverse(field, next, head, link) {
+ list_for_each_entry_reverse(field, head, link) {
if (user_field_is_dyn_string(field->type, &str_func))
pos += snprintf(buf + pos, LEN_OR_ZERO,
", %s(%s)", str_func, field->name);
@@ -1732,7 +1732,7 @@ static int user_event_create(const char *raw_command)
static int user_event_show(struct seq_file *m, struct dyn_event *ev)
{
struct user_event *user = container_of(ev, struct user_event, devent);
- struct ftrace_event_field *field, *next;
+ struct ftrace_event_field *field;
struct list_head *head;
int depth = 0;
@@ -1740,7 +1740,7 @@ static int user_event_show(struct seq_file *m, struct dyn_event *ev)
head = trace_get_fields(&user->call);
- list_for_each_entry_safe_reverse(field, next, head, link) {
+ list_for_each_entry_reverse(field, head, link) {
if (depth == 0)
seq_puts(m, " ");
else
@@ -1816,13 +1816,14 @@ static bool user_field_match(struct ftrace_event_field *field, int argc,
static bool user_fields_match(struct user_event *user, int argc,
const char **argv)
{
- struct ftrace_event_field *field, *next;
+ struct ftrace_event_field *field;
struct list_head *head = &user->fields;
int i = 0;
- list_for_each_entry_safe_reverse(field, next, head, link)
+ list_for_each_entry_reverse(field, head, link) {
if (!user_field_match(field, argc, argv, &i))
return false;
+ }
if (i != argc)
return false;
--
2.40.1
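As background for the patch above, here is a self-contained userspace sketch of the difference between a plain and a "safe" reverse list walk. It does not use the kernel's list.h; struct node and all helper names are hypothetical. The point it illustrates is that the extra cached pointer is only needed when the loop body may unlink the current entry, which a read-only walk never does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly-linked list; head is a sentinel node. */
struct node {
	struct node *prev, *next;
	int val;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and poison its pointers, so a stale walk would crash. */
static void list_del(struct node *n)
{
	assert(n->prev && n->next);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
}

/* Read-only reverse walk: nothing is unlinked, so the plain form is enough. */
static int sum_reverse(struct node *head)
{
	int sum = 0;

	for (struct node *pos = head->prev; pos != head; pos = pos->prev)
		sum += pos->val;
	return sum;
}

/* Removing while walking: cache the predecessor before pos is unlinked. */
static int remove_odd_reverse(struct node *head)
{
	struct node *pos, *tmp;
	int removed = 0;

	for (pos = head->prev, tmp = pos->prev; pos != head;
	     pos = tmp, tmp = pos->prev) {
		if (pos->val % 2) {
			list_del(pos);
			removed++;
		}
	}
	return removed;
}
```

sum_reverse() corresponds to the kind of loop the patch converts: it only reads each entry, so it can follow pos->prev directly. remove_odd_reverse() shows the case where the cached pointer is still required.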
* [for-next][PATCH 13/14] tracefs: Avoid changing i_mode to a temp value
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel; +Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Sishuai Gong
From: Sishuai Gong <sishuai.system@gmail.com>
Right now inode->i_mode is updated twice to reach the desired value
in tracefs_apply_options(). Because there is no lock protecting the two
writes, other threads might read the intermediate value of inode->i_mode.
Thread-1                            Thread-2
// tracefs_apply_options()          // e.g., acl_permission_check
inode->i_mode &= ~S_IALLUGO;
                                    unsigned int mode = inode->i_mode;
inode->i_mode |= opts->mode;
I think there is no need to introduce a lock but it is better to
only update inode->i_mode ONCE, so the readers will either see the old
or latest value, rather than an intermediate/temporary value.
Note, the race is not a security concern as the intermediate value is more
locked down than either the start or end version. This is more just to do
the conversion cleanly.
Link: https://lore.kernel.org/linux-trace-kernel/AB5B0A1C-75D9-4E82-A7F0-CF7D0715587B@gmail.com
Signed-off-by: Sishuai Gong <sishuai.system@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
fs/tracefs/inode.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index bb6de89eb446..c7a10f965602 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -310,6 +310,7 @@ static int tracefs_apply_options(struct super_block *sb, bool remount)
struct tracefs_fs_info *fsi = sb->s_fs_info;
struct inode *inode = d_inode(sb->s_root);
struct tracefs_mount_opts *opts = &fsi->mount_opts;
+ umode_t tmp_mode;
/*
* On remount, only reset mode/uid/gid if they were provided as mount
@@ -317,8 +318,9 @@ static int tracefs_apply_options(struct super_block *sb, bool remount)
*/
if (!remount || opts->opts & BIT(Opt_mode)) {
- inode->i_mode &= ~S_IALLUGO;
- inode->i_mode |= opts->mode;
+ tmp_mode = READ_ONCE(inode->i_mode) & ~S_IALLUGO;
+ tmp_mode |= opts->mode;
+ WRITE_ONCE(inode->i_mode, tmp_mode);
}
if (!remount || opts->opts & BIT(Opt_uid))
--
2.40.1
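The single-store pattern from the patch above can be tried out in userspace with the following sketch. It is only a model: the READ_ONCE()/WRITE_ONCE() macros here are simplified stand-ins built on volatile accesses (an assumption, not the kernel's definitions), and struct fake_inode, PERM_BITS and apply_mode() are hypothetical names.

```c
#include <assert.h>

/* Assumption: permission, setuid, setgid and sticky bits, as an octal mask. */
#define PERM_BITS 07777u

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)     (*(const volatile unsigned short *)&(x))
#define WRITE_ONCE(x, v) (*(volatile unsigned short *)&(x) = (v))

/* Hypothetical miniature inode carrying only the mode field. */
struct fake_inode {
	unsigned short i_mode;
};

/*
 * Replace the permission bits of i_mode with new_perm using one load and
 * one store, so a concurrent reader sees either the old or the new mode,
 * never a value with the permission bits cleared.
 */
static void apply_mode(struct fake_inode *inode, unsigned short new_perm)
{
	unsigned short tmp;

	assert((new_perm & ~PERM_BITS) == 0);

	tmp = READ_ONCE(inode->i_mode) & ~PERM_BITS;	/* keep the file type */
	tmp |= new_perm;
	WRITE_ONCE(inode->i_mode, tmp);			/* single visible update */
}
```

Starting from a directory mode of 040755, apply_mode(&inode, 0700) leaves 040700 in i_mode; the file-type bits are preserved and only the final value is ever written to the field.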
* [for-next][PATCH 14/14] tracefs: Remove kerneldoc from struct eventfs_file
From: Steven Rostedt @ 2023-08-24 2:18 UTC (permalink / raw)
To: linux-kernel
Cc: Masami Hiramatsu, Mark Rutland, Andrew Morton, Matthew Wilcox (Oracle)
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
The struct eventfs_file is a local structure and should not be parsed by
kerneldoc. It also does not fully follow the kerneldoc format and is
causing kerneldoc to spit out errors. Replace the /** with /* so that
kerneldoc no longer processes this structure.
Also reformat the comments on the structure's deletion union so that they
read a bit better.
Link: https://lore.kernel.org/linux-trace-kernel/20230818201414.2729745-1-willy@infradead.org/
Link: https://lore.kernel.org/linux-trace-kernel/20230822053313.77aa3397@rorschach.local.home
Cc: Mark Rutland <mark.rutland@arm.com>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
fs/tracefs/event_inode.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index da8d2e73cc47..237c6f370ad9 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -27,7 +27,7 @@ struct eventfs_inode {
struct list_head e_top_files;
};
-/**
+/*
* struct eventfs_file - hold the properties of the eventfs files and
* directories.
* @name: the name of the file or directory to create
@@ -48,10 +48,16 @@ struct eventfs_file {
struct eventfs_inode *ei;
const struct file_operations *fop;
const struct inode_operations *iop;
+ /*
+ * Union - used for deletion
+ * @del_list: list of eventfs_file to delete
+ * @rcu: eventfs_file to delete in RCU
+ * @is_freed: node is freed if one of the above is set
+ */
union {
- struct list_head del_list; /* list of eventfs_file to delete */
- struct rcu_head rcu; /* eventfs_file to delete */
- unsigned long is_freed; /* Freed if one of the above is set */
+ struct list_head del_list;
+ struct rcu_head rcu;
+ unsigned long is_freed;
};
void *data;
umode_t mode;
--
2.40.1