linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Burton <paulburton@google.com>
To: linux-kernel@vger.kernel.org
Cc: Paul Burton <paulburton@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Joel Fernandes <joelaf@google.com>,
	stable@vger.kernel.org
Subject: [PATCH v2 2/2] tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
Date: Thu,  1 Jul 2021 10:24:07 -0700	[thread overview]
Message-ID: <20210701172407.889626-2-paulburton@google.com> (raw)
In-Reply-To: <20210701172407.889626-1-paulburton@google.com>

Currently tgid_map is sized at PID_MAX_DEFAULT entries, which means that
on systems where pid_max is configured higher than PID_MAX_DEFAULT the
ftrace record-tgid option doesn't work so well. Any tasks with PIDs
higher than PID_MAX_DEFAULT are simply not recorded in tgid_map, and
don't show up in the saved_tgids file.

In particular since systemd v243 & above configure pid_max to its
highest possible 1<<22 value by default on 64 bit systems this renders
the record-tgids option of little use.

Increase the size of tgid_map to the configured pid_max instead,
allowing it to cover the full range of PIDs up to the maximum value of
PID_MAX_LIMIT if the system is configured that way.

On 64 bit systems with pid_max == PID_MAX_LIMIT this will increase the
size of tgid_map from 256KiB to 16MiB. Whilst this 64x increase in
memory overhead sounds significant 64 bit systems are presumably best
placed to accommodate it, and since tgid_map is only allocated when the
record-tgid option is actually used presumably the user would rather it
spends sufficient memory to actually record the tgids they expect.

The size of tgid_map could also increase for CONFIG_BASE_SMALL=y
configurations, but these seem unlikely to be systems upon which people
are both configuring a large pid_max and running ftrace with record-tgid
anyway.

Of note is that we only allocate tgid_map once, the first time that the
record-tgid option is enabled. Therefore its size is only set once, to
the value of pid_max at the time the record-tgid option is first
enabled. If a user increases pid_max after that point, the saved_tgids
file will not contain entries for any tasks with pids beyond the earlier
value of pid_max.

Signed-off-by: Paul Burton <paulburton@google.com>
Fixes: d914ba37d714 ("tracing: Add support for recording tgid of tasks")
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: <stable@vger.kernel.org>
---
Changes in v2:
- Use the configured value of pid_max at the time record-tgid is enabled
  rather than unconditionally using PID_MAX_LIMIT, to avoid added memory
  overhead for systems that don't configure such a high pid_max.
---
 kernel/trace/trace.c | 60 ++++++++++++++++++++++++++++++--------------
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 7a37c9e36b88..3c4b3b207c06 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2184,8 +2184,13 @@ void tracing_reset_all_online_cpus(void)
 	}
 }
 
+// The tgid_map array maps from pid to tgid; i.e. the value stored at index i
+// is the tgid last observed corresponding to pid=i.
 static int *tgid_map;
 
+// The maximum valid index into tgid_map.
+static size_t tgid_map_max;
+
 #define SAVED_CMDLINES_DEFAULT 128
 #define NO_CMDLINE_MAP UINT_MAX
 static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED;
@@ -2458,24 +2463,39 @@ void trace_find_cmdline(int pid, char comm[])
 	preempt_enable();
 }
 
+static int *trace_find_tgid_ptr(int pid)
+{
+	// Pairs with the smp_store_release in set_tracer_flag() to ensure that
+	// if we observe a non-NULL tgid_map then we also observe the correct
+	// tgid_map_max.
+	int *map = smp_load_acquire(&tgid_map);
+
+	if (unlikely(!map || pid > tgid_map_max))
+		return NULL;
+
+	return &map[pid];
+}
+
 int trace_find_tgid(int pid)
 {
-	if (unlikely(!tgid_map || !pid || pid > PID_MAX_DEFAULT))
-		return 0;
+	int *ptr = trace_find_tgid_ptr(pid);
 
-	return tgid_map[pid];
+	return ptr ? *ptr : 0;
 }
 
 static int trace_save_tgid(struct task_struct *tsk)
 {
+	int *ptr;
+
 	/* treat recording of idle task as a success */
 	if (!tsk->pid)
 		return 1;
 
-	if (unlikely(!tgid_map || tsk->pid > PID_MAX_DEFAULT))
+	ptr = trace_find_tgid_ptr(tsk->pid);
+	if (!ptr)
 		return 0;
 
-	tgid_map[tsk->pid] = tsk->tgid;
+	*ptr = tsk->tgid;
 	return 1;
 }
 
@@ -5171,6 +5191,8 @@ int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set)
 
 int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
 {
+	int *map;
+
 	if ((mask == TRACE_ITER_RECORD_TGID) ||
 	    (mask == TRACE_ITER_RECORD_CMD))
 		lockdep_assert_held(&event_mutex);
@@ -5193,10 +5215,17 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
 		trace_event_enable_cmd_record(enabled);
 
 	if (mask == TRACE_ITER_RECORD_TGID) {
-		if (!tgid_map)
-			tgid_map = kvcalloc(PID_MAX_DEFAULT + 1,
-					   sizeof(*tgid_map),
-					   GFP_KERNEL);
+		if (!tgid_map) {
+			tgid_map_max = pid_max;
+			map = kvcalloc(tgid_map_max + 1, sizeof(*tgid_map),
+				       GFP_KERNEL);
+
+			// Pairs with smp_load_acquire() in
+			// trace_find_tgid_ptr() to ensure that if it observes
+			// the tgid_map we just allocated then it also observes
+			// the corresponding tgid_map_max value.
+			smp_store_release(&tgid_map, map);
+		}
 		if (!tgid_map) {
 			tr->trace_flags &= ~TRACE_ITER_RECORD_TGID;
 			return -ENOMEM;
@@ -5610,21 +5639,14 @@ static void *saved_tgids_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	int pid = ++(*pos);
 
-	if (pid > PID_MAX_DEFAULT)
-		return NULL;
-
-	// We already know that tgid_map is non-NULL here because the v
-	// argument is by definition a non-NULL pointer into tgid_map returned
-	// by saved_tgids_start() or an earlier call to saved_tgids_next().
-	return &tgid_map[pid];
+	return trace_find_tgid_ptr(pid);
 }
 
 static void *saved_tgids_start(struct seq_file *m, loff_t *pos)
 {
-	if (!tgid_map || *pos > PID_MAX_DEFAULT)
-		return NULL;
+	int pid = *pos;
 
-	return &tgid_map[*pos];
+	return trace_find_tgid_ptr(pid);
 }
 
 static void saved_tgids_stop(struct seq_file *m, void *v)
-- 
2.32.0.93.g670b81a890-goog


  reply	other threads:[~2021-07-01 17:24 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-30  0:34 [PATCH 1/2] tracing: Simplify & fix saved_tgids logic Paul Burton
2021-06-30  0:34 ` [PATCH 2/2] tracing: Resize tgid_map to PID_MAX_LIMIT, not PID_MAX_DEFAULT Paul Burton
2021-06-30 12:35   ` Steven Rostedt
2021-06-30 21:09     ` Paul Burton
2021-06-30 21:34       ` Steven Rostedt
2021-06-30 22:34         ` Joel Fernandes
2021-06-30 23:11           ` Steven Rostedt
2021-07-01 13:55             ` Steven Rostedt
2021-07-01 17:24               ` [PATCH v2 1/2] tracing: Simplify & fix saved_tgids logic Paul Burton
2021-07-01 17:24                 ` Paul Burton [this message]
2021-07-01 18:12                   ` [PATCH v2 2/2] tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT Steven Rostedt
2021-07-01 18:15                     ` Paul Burton
2021-07-01 18:27                       ` Steven Rostedt
2021-07-01 18:07                 ` [PATCH v2 1/2] tracing: Simplify & fix saved_tgids logic Joel Fernandes
2021-06-30 12:31 ` [PATCH " Steven Rostedt
2021-06-30 16:43   ` Joel Fernandes
2021-06-30 22:29 ` Joel Fernandes
2021-07-01 17:31   ` Paul Burton
2021-07-01 18:05     ` Joel Fernandes
2021-07-01 18:07     ` Steven Rostedt
2021-07-01 18:09       ` Joel Fernandes
2021-07-01 18:12       ` Paul Burton
2021-07-01 18:26         ` Steven Rostedt
2021-07-01 19:35           ` Joe Perches
2021-07-01 19:51             ` Steven Rostedt
2021-07-01 21:07               ` Joe Perches
2021-07-01 23:49                 ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210701172407.889626-2-paulburton@google.com \
    --to=paulburton@google.com \
    --cc=joelaf@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).