From: Paul Burton <paulburton@google.com>
To: linux-kernel@vger.kernel.org
Cc: Paul Burton <paulburton@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Joel Fernandes <joelaf@google.com>,
	stable@vger.kernel.org
Subject: [PATCH v2 2/2] tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
Date: Thu,  1 Jul 2021 10:24:07 -0700
Message-ID: <20210701172407.889626-2-paulburton@google.com>
In-Reply-To: <20210701172407.889626-1-paulburton@google.com>

Currently tgid_map is sized at PID_MAX_DEFAULT entries, which means that
on systems where pid_max is configured higher than PID_MAX_DEFAULT the
ftrace record-tgid option does not work reliably: any task with a PID
higher than PID_MAX_DEFAULT is simply not recorded in tgid_map, and
does not show up in the saved_tgids file.

In particular, since systemd v243 and above configure pid_max to its
highest possible value of 1<<22 by default on 64-bit systems, this
renders the record-tgid option of little use there.

Increase the size of tgid_map to the configured pid_max instead,
allowing it to cover the full range of PIDs up to the maximum value of
PID_MAX_LIMIT if the system is configured that way.

On 64-bit systems with pid_max == PID_MAX_LIMIT this will increase the
size of tgid_map from about 128KiB (PID_MAX_DEFAULT + 1 four-byte
entries) to about 16MiB. Whilst this 128x increase in memory overhead
sounds significant, 64-bit systems are presumably best placed to
accommodate it, and since tgid_map is only allocated when the
record-tgid option is actually used, presumably the user would rather
spend sufficient memory to actually record the tgids they expect.
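
The sizing arithmetic can be checked with a small userspace sketch. It
assumes the values of PID_MAX_DEFAULT (0x8000) and PID_MAX_LIMIT (4M) for a
64-bit, !CONFIG_BASE_SMALL build and a 4-byte int; the SKETCH_* macros and
the helper names are illustrative and not part of the patch. Under these
assumptions the default-sized table comes to about 128KiB and the largest
one to about 16MiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed values for a 64-bit, !CONFIG_BASE_SMALL build, restated here
 * for illustration only (the kernel defines the real macros itself).
 */
#define SKETCH_PID_MAX_DEFAULT 0x8000UL
#define SKETCH_PID_MAX_LIMIT   (4UL * 1024 * 1024)

/* Bytes needed for an int table indexed 0..max_pid inclusive. */
static size_t tgid_map_bytes(size_t max_pid)
{
	/* Guard against overflow of the multiplication below. */
	assert(max_pid < SIZE_MAX / sizeof(int));
	return (max_pid + 1) * sizeof(int);
}

/* Same size expressed in whole KiB, rounded down. */
static size_t tgid_map_kib(size_t max_pid)
{
	return tgid_map_bytes(max_pid) / 1024;
}
```

Calling tgid_map_kib() with each of the two limits gives the before and
after footprints that the paragraph above compares.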

The size of tgid_map could also increase for CONFIG_BASE_SMALL=y
configurations, but these seem unlikely to be systems upon which people
are both configuring a large pid_max and running ftrace with record-tgid
anyway.

Of note is that we only allocate tgid_map once, the first time that the
record-tgid option is enabled. Therefore its size is only set once, to
the value of pid_max at the time the record-tgid option is first
enabled. If a user increases pid_max after that point, the saved_tgids
file will not contain entries for any tasks with pids beyond the earlier
value of pid_max.
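
The allocate-once behaviour described above can be modelled in userspace
roughly as follows. This is a simplified sketch, not the kernel code:
sketch_pid_max stands in for the pid_max sysctl, calloc() stands in for
kvcalloc(), and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int *sketch_map;                  /* models tgid_map */
static size_t sketch_map_max;            /* models tgid_map_max */
static size_t sketch_pid_max = 32768;    /* models the pid_max sysctl */

/* Models enabling record-tgid: the table is sized on the first call only. */
static int sketch_enable_record_tgid(void)
{
	if (!sketch_map) {
		sketch_map_max = sketch_pid_max;
		sketch_map = calloc(sketch_map_max + 1, sizeof(*sketch_map));
	}
	return sketch_map ? 0 : -1;
}

/* Returns 1 if the pid fits in the table and was recorded, else 0. */
static int sketch_save_tgid(int pid, int tgid)
{
	if (!sketch_map || pid < 0 || (size_t)pid > sketch_map_max)
		return 0;
	sketch_map[pid] = tgid;
	return 1;
}
```

In this model, raising sketch_pid_max after the first call to
sketch_enable_record_tgid() has no effect on the table, so
sketch_save_tgid() keeps rejecting pids above the original bound.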

Signed-off-by: Paul Burton <paulburton@google.com>
Fixes: d914ba37d714 ("tracing: Add support for recording tgid of tasks")
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: <stable@vger.kernel.org>
---
Changes in v2:
- Use the configured value of pid_max at the time record-tgid is enabled
  rather than unconditionally using PID_MAX_LIMIT, to avoid added memory
  overhead for systems that don't configure such a high pid_max.
---
 kernel/trace/trace.c | 60 ++++++++++++++++++++++++++++++--------------
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 7a37c9e36b88..3c4b3b207c06 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2184,8 +2184,13 @@ void tracing_reset_all_online_cpus(void)
 	}
 }
 
+// The tgid_map array maps from pid to tgid; i.e. the value stored at index i
+// is the tgid last observed corresponding to pid=i.
 static int *tgid_map;
 
+// The maximum valid index into tgid_map.
+static size_t tgid_map_max;
+
 #define SAVED_CMDLINES_DEFAULT 128
 #define NO_CMDLINE_MAP UINT_MAX
 static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED;
@@ -2458,24 +2463,39 @@ void trace_find_cmdline(int pid, char comm[])
 	preempt_enable();
 }
 
+static int *trace_find_tgid_ptr(int pid)
+{
+	// Pairs with the smp_store_release in set_tracer_flag() to ensure that
+	// if we observe a non-NULL tgid_map then we also observe the correct
+	// tgid_map_max.
+	int *map = smp_load_acquire(&tgid_map);
+
+	if (unlikely(!map || pid > tgid_map_max))
+		return NULL;
+
+	return &map[pid];
+}
+
 int trace_find_tgid(int pid)
 {
-	if (unlikely(!tgid_map || !pid || pid > PID_MAX_DEFAULT))
-		return 0;
+	int *ptr = trace_find_tgid_ptr(pid);
 
-	return tgid_map[pid];
+	return ptr ? *ptr : 0;
 }
 
 static int trace_save_tgid(struct task_struct *tsk)
 {
+	int *ptr;
+
 	/* treat recording of idle task as a success */
 	if (!tsk->pid)
 		return 1;
 
-	if (unlikely(!tgid_map || tsk->pid > PID_MAX_DEFAULT))
+	ptr = trace_find_tgid_ptr(tsk->pid);
+	if (!ptr)
 		return 0;
 
-	tgid_map[tsk->pid] = tsk->tgid;
+	*ptr = tsk->tgid;
 	return 1;
 }
 
@@ -5171,6 +5191,8 @@ int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set)
 
 int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
 {
+	int *map;
+
 	if ((mask == TRACE_ITER_RECORD_TGID) ||
 	    (mask == TRACE_ITER_RECORD_CMD))
 		lockdep_assert_held(&event_mutex);
@@ -5193,10 +5215,17 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
 		trace_event_enable_cmd_record(enabled);
 
 	if (mask == TRACE_ITER_RECORD_TGID) {
-		if (!tgid_map)
-			tgid_map = kvcalloc(PID_MAX_DEFAULT + 1,
-					   sizeof(*tgid_map),
-					   GFP_KERNEL);
+		if (!tgid_map) {
+			tgid_map_max = pid_max;
+			map = kvcalloc(tgid_map_max + 1, sizeof(*tgid_map),
+				       GFP_KERNEL);
+
+			// Pairs with smp_load_acquire() in
+			// trace_find_tgid_ptr() to ensure that if it observes
+			// the tgid_map we just allocated then it also observes
+			// the corresponding tgid_map_max value.
+			smp_store_release(&tgid_map, map);
+		}
 		if (!tgid_map) {
 			tr->trace_flags &= ~TRACE_ITER_RECORD_TGID;
 			return -ENOMEM;
@@ -5610,21 +5639,14 @@ static void *saved_tgids_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	int pid = ++(*pos);
 
-	if (pid > PID_MAX_DEFAULT)
-		return NULL;
-
-	// We already know that tgid_map is non-NULL here because the v
-	// argument is by definition a non-NULL pointer into tgid_map returned
-	// by saved_tgids_start() or an earlier call to saved_tgids_next().
-	return &tgid_map[pid];
+	return trace_find_tgid_ptr(pid);
 }
 
 static void *saved_tgids_start(struct seq_file *m, loff_t *pos)
 {
-	if (!tgid_map || *pos > PID_MAX_DEFAULT)
-		return NULL;
+	int pid = *pos;
 
-	return &tgid_map[*pos];
+	return trace_find_tgid_ptr(pid);
 }
 
 static void saved_tgids_stop(struct seq_file *m, void *v)
-- 
2.32.0.93.g670b81a890-goog
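
The pairing described in the comments of trace_find_tgid_ptr() and
set_tracer_flag() can be illustrated in userspace with C11 atomics. This is
a sketch under the assumption that memory_order_release and
memory_order_acquire are a reasonable stand-in for smp_store_release() and
smp_load_acquire(); the sketch_* names are hypothetical and calloc()
replaces kvcalloc().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

static _Atomic(int *) sketch_map;    /* models tgid_map */
static size_t sketch_map_max;        /* models tgid_map_max */

/* Writer: store the bound first, then publish the pointer with release. */
static int sketch_publish(size_t max_pid)
{
	int *map = calloc(max_pid + 1, sizeof(*map));

	if (!map)
		return -1;

	sketch_map_max = max_pid;
	/* Everything written above is visible to an acquire-load that sees map. */
	atomic_store_explicit(&sketch_map, map, memory_order_release);
	return 0;
}

/* Reader: a non-NULL pointer implies the matching bound is visible too. */
static int *sketch_find_ptr(int pid)
{
	int *map = atomic_load_explicit(&sketch_map, memory_order_acquire);

	if (!map || pid < 0 || (size_t)pid > sketch_map_max)
		return NULL;

	return &map[pid];
}
```

A reader that runs before sketch_publish() sees a NULL pointer and bails
out; one that observes the published pointer is guaranteed to also observe
the bound stored before it.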

