[for-linus][PATCH 0/4] tracing: Some fixes for 5.13

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [for-linus][PATCH 0/4] tracing: Some fixes for 5.13
@ 2021-06-18 13:20 Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 1/4] recordmcount: Correct st_shndx handling Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Steven Rostedt @ 2021-06-18 13:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton


Tracing fixes for 5.13:

 - Have recordmcount check for valid st_shndx otherwise some archs may have
   invalid references for the mcount location.

 - Two fixes done for mapping pids to task names. Traces were not showing
   the names of tasks when they should have.

 - Fix to trace_clock_global() to prevent it from going backwards

Peter Zijlstra (1):
      recordmcount: Correct st_shndx handling

Steven Rostedt (VMware) (3):
      tracing: Do not stop recording cmdlines when tracing is off
      tracing: Do not stop recording comms if the trace file is being read
      tracing: Do no increment trace_clock_global() by one

----
 kernel/trace/trace.c       | 11 -----------
 kernel/trace/trace_clock.c |  6 +++---
 scripts/recordmcount.h     | 15 ++++++++++-----
 3 files changed, 13 insertions(+), 19 deletions(-)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [for-linus][PATCH 1/4] recordmcount: Correct st_shndx handling
  2021-06-18 13:20 [for-linus][PATCH 0/4] tracing: Some fixes for 5.13 Steven Rostedt
@ 2021-06-18 13:20 ` Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 2/4] tracing: Do not stop recording cmdlines when tracing is off Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2021-06-18 13:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Mark-PK Tsai, Peter Zijlstra (Intel)

From: Peter Zijlstra <peterz@infradead.org>

One should only use st_shndx when >SHN_UNDEF and <SHN_LORESERVE. When
SHN_XINDEX, then use .symtab_shndx. Otherwise use 0.

This handles the case: st_shndx >= SHN_LORESERVE && st_shndx != SHN_XINDEX.

Link: https://lore.kernel.org/lkml/20210607023839.26387-1-mark-pk.tsai@mediatek.com/
Link: https://lkml.kernel.org/r/20210616154126.2794-1-mark-pk.tsai@mediatek.com

Reported-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[handle endianness of sym->st_shndx]
Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 scripts/recordmcount.h | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index f9b19524da11..1e9baa5c4fc6 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -192,15 +192,20 @@ static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab,
 				 Elf32_Word const *symtab_shndx)
 {
 	unsigned long offset;
+	unsigned short shndx = w2(sym->st_shndx);
 	int index;
 
-	if (sym->st_shndx != SHN_XINDEX)
-		return w2(sym->st_shndx);
+	if (shndx > SHN_UNDEF && shndx < SHN_LORESERVE)
+		return shndx;
 
-	offset = (unsigned long)sym - (unsigned long)symtab;
-	index = offset / sizeof(*sym);
+	if (shndx == SHN_XINDEX) {
+		offset = (unsigned long)sym - (unsigned long)symtab;
+		index = offset / sizeof(*sym);
 
-	return w(symtab_shndx[index]);
+		return w(symtab_shndx[index]);
+	}
+
+	return 0;
 }
 
 static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0)
-- 
2.30.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [for-linus][PATCH 2/4] tracing: Do not stop recording cmdlines when tracing is off
  2021-06-18 13:20 [for-linus][PATCH 0/4] tracing: Some fixes for 5.13 Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 1/4] recordmcount: Correct st_shndx handling Steven Rostedt
@ 2021-06-18 13:20 ` Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 3/4] tracing: Do not stop recording comms if the trace file is being read Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 4/4] tracing: Do no increment trace_clock_global() by one Steven Rostedt
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2021-06-18 13:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, stable

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The saved_cmdlines is used to map pids to the task name, such that the
output of the tracing does not just show pids, but also gives a human
readable name for the task.

If the name is not mapped, the output looks like this:

    <...>-1316          [005] ...2   132.044039: ...

Instead of this:

    gnome-shell-1316    [005] ...2   132.044039: ...

The names are updated when tracing is running, but are skipped if tracing
is stopped. Unfortunately, this stops the recording of the names if the
top level tracer is stopped, and not if there's other tracers active.

The recording of a name only happens when a new event is written into a
ring buffer, so there is no need to test if tracing is on or not. If
tracing is off, then no event is written and no need to test if tracing is
off or not.

Remove the check, as it hides the names of tasks for events in the
instance buffers.

Cc: stable@vger.kernel.org
Fixes: 7ffbd48d5cab2 ("tracing: Cache comms only after an event occurred")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9299057feb56..e220b37e29c6 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2486,8 +2486,6 @@ static bool tracing_record_taskinfo_skip(int flags)
 {
 	if (unlikely(!(flags & (TRACE_RECORD_CMDLINE | TRACE_RECORD_TGID))))
 		return true;
-	if (atomic_read(&trace_record_taskinfo_disabled) || !tracing_is_on())
-		return true;
 	if (!__this_cpu_read(trace_taskinfo_save))
 		return true;
 	return false;
-- 
2.30.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [for-linus][PATCH 3/4] tracing: Do not stop recording comms if the trace file is being read
  2021-06-18 13:20 [for-linus][PATCH 0/4] tracing: Some fixes for 5.13 Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 1/4] recordmcount: Correct st_shndx handling Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 2/4] tracing: Do not stop recording cmdlines when tracing is off Steven Rostedt
@ 2021-06-18 13:20 ` Steven Rostedt
  2021-06-18 13:20 ` [for-linus][PATCH 4/4] tracing: Do no increment trace_clock_global() by one Steven Rostedt
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2021-06-18 13:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, stable

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

A while ago, when the "trace" file was opened, tracing was stopped, and
code was added to stop recording the comms to saved_cmdlines, for mapping
of the pids to the task name.

Code has been added that only records the comm if a trace event occurred,
and there's no reason to not trace it if the trace file is opened.

Cc: stable@vger.kernel.org
Fixes: 7ffbd48d5cab2 ("tracing: Cache comms only after an event occurred")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index e220b37e29c6..d23a09d3eb37 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2198,9 +2198,6 @@ struct saved_cmdlines_buffer {
 };
 static struct saved_cmdlines_buffer *savedcmd;
 
-/* temporary disable recording */
-static atomic_t trace_record_taskinfo_disabled __read_mostly;
-
 static inline char *get_saved_cmdlines(int idx)
 {
 	return &savedcmd->saved_cmdlines[idx * TASK_COMM_LEN];
@@ -3996,9 +3993,6 @@ static void *s_start(struct seq_file *m, loff_t *pos)
 		return ERR_PTR(-EBUSY);
 #endif
 
-	if (!iter->snapshot)
-		atomic_inc(&trace_record_taskinfo_disabled);
-
 	if (*pos != iter->pos) {
 		iter->ent = NULL;
 		iter->cpu = 0;
@@ -4041,9 +4035,6 @@ static void s_stop(struct seq_file *m, void *p)
 		return;
 #endif
 
-	if (!iter->snapshot)
-		atomic_dec(&trace_record_taskinfo_disabled);
-
 	trace_access_unlock(iter->cpu_file);
 	trace_event_read_unlock();
 }
-- 
2.30.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [for-linus][PATCH 4/4] tracing: Do no increment trace_clock_global() by one
  2021-06-18 13:20 [for-linus][PATCH 0/4] tracing: Some fixes for 5.13 Steven Rostedt
                   ` (2 preceding siblings ...)
  2021-06-18 13:20 ` [for-linus][PATCH 3/4] tracing: Do not stop recording comms if the trace file is being read Steven Rostedt
@ 2021-06-18 13:20 ` Steven Rostedt
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2021-06-18 13:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, stable

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The trace_clock_global() tries to make sure the events between CPUs is
somewhat in order. A global value is used and updated by the latest read
of a clock. If one CPU is ahead by a little, and is read by another CPU, a
lock is taken, and if the timestamp of the other CPU is behind, it will
simply use the other CPUs timestamp.

The lock is also only taken with a "trylock" due to tracing, and strange
recursions can happen. The lock is not taken at all in NMI context.

In the case where the lock is not able to be taken, the non synced
timestamp is returned. But it will not be less than the saved global
timestamp.

The problem arises because when the time goes "backwards" the time
returned is the saved timestamp plus 1. If the lock is not taken, and the
plus one to the timestamp is returned, there's a small race that can cause
the time to go backwards!

	CPU0				CPU1
	----				----
				trace_clock_global() {
				    ts = clock() [ 1000 ]
				    trylock(clock_lock) [ success ]
				    global_ts = ts; [ 1000 ]

				    <interrupted by NMI>
 trace_clock_global() {
    ts = clock() [ 999 ]
    if (ts < global_ts)
	ts = global_ts + 1 [ 1001 ]

    trylock(clock_lock) [ fail ]

    return ts [ 1001]
 }
				    unlock(clock_lock);
				    return ts; [ 1000 ]
				}

 trace_clock_global() {
    ts = clock() [ 1000 ]
    if (ts < global_ts) [ false 1000 == 1000 ]

    trylock(clock_lock) [ success ]
    global_ts = ts; [ 1000 ]
    unlock(clock_lock)

    return ts; [ 1000 ]
 }

The above case shows to reads of trace_clock_global() on the same CPU, but
the second read returns one less than the first read. That is, time when
backwards, and this is not what is allowed by trace_clock_global().

This was triggered by heavy tracing and the ring buffer checker that tests
for the clock going backwards:

 Ring buffer clock went backwards: 20613921464 -> 20613921463
 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 0 at kernel/trace/ring_buffer.c:3412 check_buffer+0x1b9/0x1c0
 Modules linked in:
 [..]
 [CPU: 2]TIME DOES NOT MATCH expected:20620711698 actual:20620711697 delta:6790234 before:20613921463 after:20613921463
   [20613915818] PAGE TIME STAMP
   [20613915818] delta:0
   [20613915819] delta:1
   [20613916035] delta:216
   [20613916465] delta:430
   [20613916575] delta:110
   [20613916749] delta:174
   [20613917248] delta:499
   [20613917333] delta:85
   [20613917775] delta:442
   [20613917921] delta:146
   [20613918321] delta:400
   [20613918568] delta:247
   [20613918768] delta:200
   [20613919306] delta:538
   [20613919353] delta:47
   [20613919980] delta:627
   [20613920296] delta:316
   [20613920571] delta:275
   [20613920862] delta:291
   [20613921152] delta:290
   [20613921464] delta:312
   [20613921464] delta:0 TIME EXTEND
   [20613921464] delta:0

This happened more than once, and always for an off by one result. It also
started happening after commit aafe104aa9096 was added.

Cc: stable@vger.kernel.org
Fixes: aafe104aa9096 ("tracing: Restructure trace_clock_global() to never block")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_clock.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index c1637f90c8a3..4702efb00ff2 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -115,9 +115,9 @@ u64 notrace trace_clock_global(void)
 	prev_time = READ_ONCE(trace_clock_struct.prev_time);
 	now = sched_clock_cpu(this_cpu);
 
-	/* Make sure that now is always greater than prev_time */
+	/* Make sure that now is always greater than or equal to prev_time */
 	if ((s64)(now - prev_time) < 0)
-		now = prev_time + 1;
+		now = prev_time;
 
 	/*
 	 * If in an NMI context then dont risk lockups and simply return
@@ -131,7 +131,7 @@ u64 notrace trace_clock_global(void)
 		/* Reread prev_time in case it was already updated */
 		prev_time = READ_ONCE(trace_clock_struct.prev_time);
 		if ((s64)(now - prev_time) < 0)
-			now = prev_time + 1;
+			now = prev_time;
 
 		trace_clock_struct.prev_time = now;
 
-- 
2.30.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-06-18 13:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-18 13:20 [for-linus][PATCH 0/4] tracing: Some fixes for 5.13 Steven Rostedt
2021-06-18 13:20 ` [for-linus][PATCH 1/4] recordmcount: Correct st_shndx handling Steven Rostedt
2021-06-18 13:20 ` [for-linus][PATCH 2/4] tracing: Do not stop recording cmdlines when tracing is off Steven Rostedt
2021-06-18 13:20 ` [for-linus][PATCH 3/4] tracing: Do not stop recording comms if the trace file is being read Steven Rostedt
2021-06-18 13:20 ` [for-linus][PATCH 4/4] tracing: Do no increment trace_clock_global() by one Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).