linux-kernel.vger.kernel.org archive mirror
* [for-linus][PATCH 0/3] tracing: Fixes for v6.8
@ 2024-03-06 18:42 Steven Rostedt
  2024-03-06 18:42 ` [for-linus][PATCH 1/3] tracing: Remove precision vsnprintf() check from print event Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Steven Rostedt @ 2024-03-06 18:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton


Tracing fixes for v6.8-rc7:

- The size of a string written into trace_marker was determined by
  the size of the sub-buffer in the ring buffer. That size is
  dependent on the PAGE_SIZE of the architecture as it can be mapped
  into user space. But on PowerPC, where PAGE_SIZE is 64K, that made
  the limit of a string written into trace_marker 64K.

  One of the selftests looks at the size of the ring buffer sub-buffers
  and writes that plus more into the trace_marker. The write will take
  what it can and report back what it consumed so that the user space
  application (like echo) will write the rest of the string. The string
  is stored in the ring buffer and can be read via the "trace" or
  "trace_pipe" files.

  Reading the ring buffer uses vsnprintf() with a precision ("%.*s")
  to make sure it only reads what is stored in the buffer, as a bug
  could leave the string without a terminating nul byte.

  With the combination of the precision change and the PAGE_SIZE of 64K
  allowing huge strings to be added into the ring buffer, plus the test
  that would actually stress that limit, a bug was reported that
  the precision used was too big for "%.*s" as the string was close to
  64K in size and the max precision of vsnprintf is 32K.

  Linus suggested not to have that precision as it could hide a bug
  if the string was again stored without a nul byte.

  Another issue that was brought up is that the trace_seq buffer is
  also based on PAGE_SIZE even though it is not tied to the architecture
  limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
  simply too big and wastes memory on systems with 64K page sizes.
  It is now hardcoded to 8K, which is what all architectures with a
  4K PAGE_SIZE already have.

  Finally, the write to trace_marker is now limited to 4K as there is no
  reason to write larger strings into trace_marker.
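
As a rough illustration of the short-write behaviour described above, the
sketch below shows how a user space writer might keep calling write() until
the whole string has been consumed. The helper name write_all() is
hypothetical and is not taken from the kernel or the selftest; it assumes
only standard POSIX write() semantics.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all of buf to fd, retrying after short
 * writes. Returns 0 on success, -1 on error (errno is set).
 */
int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0) {
			errno = EIO;	/* no progress, give up */
			return -1;
		}
		buf += n;	/* skip the part that was consumed */
		len -= (size_t)n;
	}
	return 0;
}
```

With the 4K limit mentioned above, a 64K string pushed through such a loop
would presumably end up in the ring buffer as several separate events rather
than one.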

Steven Rostedt (Google) (3):
      tracing: Remove precision vsnprintf() check from print event
      tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE
      tracing: Limit trace_marker writes to just 4K

----
 include/linux/trace_seq.h   |  8 +++++++-
 kernel/trace/trace.c        | 10 +++++-----
 kernel/trace/trace_output.c |  6 ++----
 3 files changed, 14 insertions(+), 10 deletions(-)


* [for-linus][PATCH 1/3] tracing: Remove precision vsnprintf() check from print event
  2024-03-06 18:42 [for-linus][PATCH 0/3] tracing: Fixes for v6.8 Steven Rostedt
@ 2024-03-06 18:42 ` Steven Rostedt
  2024-03-06 18:42 ` [for-linus][PATCH 2/3] tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2024-03-06 18:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Linus Torvalds, Sachin Sant

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

This reverts 60be76eeabb3d ("tracing: Add size check when printing
trace_marker output"). The only reason the precision check was added
was a bug that miscalculated the write size of the string into
the ring buffer and truncated it, removing the terminating nul byte. On
reading the trace it crashed the kernel. But this was due to a bug in
the code that happened during development and should never happen in
practice. If anything, the precision can hide bugs where the string in the
ring buffer isn't nul terminated, because the missing nul will not be noticed.
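
To make the "precision can hide bugs" point concrete, here is a small user
space sketch using the C library's snprintf() rather than the kernel's
vsnprintf(). The function names format_bounded() and format_trusting() are
made up for this illustration. With "%.*s" a source buffer that lacks its nul
byte still formats cleanly, so the corruption goes unnoticed; with "%s" the
caller has to guarantee a terminated string.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Like the reverted code: "%.*s" reads at most max bytes of src, so
 * src does not need a nul terminator within those bytes.
 * Returns the length snprintf() wanted to write.
 */
int format_bounded(char *dst, size_t dstlen, const char *src, int max)
{
	return snprintf(dst, dstlen, ": %.*s", max, src);
}

/*
 * Like the patched code: "%s" relies on the nul terminator, so src
 * must be a valid C string.
 */
int format_trusting(char *dst, size_t dstlen, const char *src)
{
	return snprintf(dst, dstlen, ": %s", src);
}
```

Both calls give the same output for a well-formed string; they only differ
when the stored string is broken, which is exactly the case the revert wants
to stop papering over.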

Link: https://lore.kernel.org/all/C7E7AF1A-D30F-4D18-B8E5-AF1EF58004F5@linux.ibm.com/
Link: https://lore.kernel.org/linux-trace-kernel/20240227125706.04279ac2@gandalf.local.home
Link: https://lore.kernel.org/all/20240302111244.3a1674be@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240304174341.2a561d9f@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 60be76eeabb3d ("tracing: Add size check when printing trace_marker output")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_output.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 3e7fa44dc2b2..d8b302d01083 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -1587,12 +1587,11 @@ static enum print_line_t trace_print_print(struct trace_iterator *iter,
 {
 	struct print_entry *field;
 	struct trace_seq *s = &iter->seq;
-	int max = iter->ent_size - offsetof(struct print_entry, buf);
 
 	trace_assign_type(field, iter->ent);
 
 	seq_print_ip_sym(s, field->ip, flags);
-	trace_seq_printf(s, ": %.*s", max, field->buf);
+	trace_seq_printf(s, ": %s", field->buf);
 
 	return trace_handle_return(s);
 }
@@ -1601,11 +1600,10 @@ static enum print_line_t trace_print_raw(struct trace_iterator *iter, int flags,
 					 struct trace_event *event)
 {
 	struct print_entry *field;
-	int max = iter->ent_size - offsetof(struct print_entry, buf);
 
 	trace_assign_type(field, iter->ent);
 
-	trace_seq_printf(&iter->seq, "# %lx %.*s", field->ip, max, field->buf);
+	trace_seq_printf(&iter->seq, "# %lx %s", field->ip, field->buf);
 
 	return trace_handle_return(&iter->seq);
 }
-- 
2.43.0




* [for-linus][PATCH 2/3] tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE
  2024-03-06 18:42 [for-linus][PATCH 0/3] tracing: Fixes for v6.8 Steven Rostedt
  2024-03-06 18:42 ` [for-linus][PATCH 1/3] tracing: Remove precision vsnprintf() check from print event Steven Rostedt
@ 2024-03-06 18:42 ` Steven Rostedt
  2024-03-06 18:42 ` [for-linus][PATCH 3/3] tracing: Limit trace_marker writes to just 4K Steven Rostedt
  2024-03-10 18:16 ` [for-linus][PATCH 0/3] tracing: Fixes for v6.8 David Laight
  3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2024-03-06 18:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Linus Torvalds, Sachin Sant

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The trace_seq buffer is used to print out entire events. It's typically
set to PAGE_SIZE * 2 as there are some events that can be quite large.

As a side effect, writes to trace_marker are limited by both the size of the
trace_seq buffer and the ring buffer's sub-buffer size (which is a
power-of-two multiple of PAGE_SIZE). Limiting the trace_seq size therefore
also limits the size of the largest string written to trace_marker.

trace_seq does not need to be dependent on PAGE_SIZE like the ring buffer
sub-buffers need to be. Hard code it to 8K which is PAGE_SIZE * 2 on most
architectures. This will also limit the size of trace_marker on those
architectures with greater than 4K PAGE_SIZE.
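
The arithmetic behind this change can be sketched in a few lines of user
space C. The function names below are hypothetical and only mirror the two
sizing rules described above; the small per-struct overhead that the real
TRACE_SEQ_BUFFER_SIZE subtracts is left out for simplicity.

```c
#include <stddef.h>

#define TRACE_SEQ_SIZE 8192	/* fixed size introduced by this patch */

/* Old rule: two pages, so the size scales with the architecture. */
size_t old_trace_seq_size(size_t page_size)
{
	return page_size * 2;
}

/* New rule: a constant, whatever the page size is. */
size_t new_trace_seq_size(size_t page_size)
{
	(void)page_size;	/* intentionally unused */
	return TRACE_SEQ_SIZE;
}
```

For a 4K page both rules give 8192 bytes, while for a 64K page the old rule
gives 131072 bytes and the new one stays at 8192.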

Link: https://lore.kernel.org/all/20240302111244.3a1674be@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240304191342.56fb1087@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/linux/trace_seq.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/trace_seq.h b/include/linux/trace_seq.h
index 9ec229dfddaa..1ef95c0287f0 100644
--- a/include/linux/trace_seq.h
+++ b/include/linux/trace_seq.h
@@ -9,9 +9,15 @@
 /*
  * Trace sequences are used to allow a function to call several other functions
  * to create a string of data to use.
+ *
+ * Have the trace seq to be 8K which is typically PAGE_SIZE * 2 on
+ * most architectures. The TRACE_SEQ_BUFFER_SIZE (which is
+ * TRACE_SEQ_SIZE minus the other fields of trace_seq), is the
+ * max size the output of a trace event may be.
  */
 
-#define TRACE_SEQ_BUFFER_SIZE	(PAGE_SIZE * 2 - \
+#define TRACE_SEQ_SIZE		8192
+#define TRACE_SEQ_BUFFER_SIZE	(TRACE_SEQ_SIZE - \
 	(sizeof(struct seq_buf) + sizeof(size_t) + sizeof(int)))
 
 struct trace_seq {
-- 
2.43.0




* [for-linus][PATCH 3/3] tracing: Limit trace_marker writes to just 4K
  2024-03-06 18:42 [for-linus][PATCH 0/3] tracing: Fixes for v6.8 Steven Rostedt
  2024-03-06 18:42 ` [for-linus][PATCH 1/3] tracing: Remove precision vsnprintf() check from print event Steven Rostedt
  2024-03-06 18:42 ` [for-linus][PATCH 2/3] tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE Steven Rostedt
@ 2024-03-06 18:42 ` Steven Rostedt
  2024-03-10 18:16 ` [for-linus][PATCH 0/3] tracing: Fixes for v6.8 David Laight
  3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2024-03-06 18:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	Linus Torvalds

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Limit the max print event of trace_marker to just 4K string size. This must
also be less than the amount that can be held by a trace_seq along with
the text that is before the output (like the task name, PID, CPU, state,
etc). As trace_seq is made to handle large events (some greater than 4K),
make the max size of a trace_marker write event 4K, which is guaranteed
to fit in the trace_seq buffer.
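
A minimal user space sketch of the clamping idea follows. It is not the
kernel code: clamp_marker_write() and marker_writes_needed() are hypothetical
names, and the model assumes that each write consumes exactly the clamped
length and that the writer retries with the remainder.

```c
#include <stddef.h>

#define TRACE_MARKER_MAX_SIZE 4096	/* cap introduced by this patch */

/* Clamp a requested write length to the cap. */
size_t clamp_marker_write(size_t cnt)
{
	return cnt > TRACE_MARKER_MAX_SIZE ? TRACE_MARKER_MAX_SIZE : cnt;
}

/* Number of write() calls a retrying writer needs for total bytes. */
size_t marker_writes_needed(size_t total)
{
	size_t calls = 0;

	while (total > 0) {
		total -= clamp_marker_write(total);	/* consumed this round */
		calls++;
	}
	return calls;
}
```

Under these assumptions a 64K string would take 16 writes, each producing an
event no larger than 4K.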

Link: https://lore.kernel.org/linux-trace-kernel/20240304223433.4ba47dff@gandalf.local.home

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8198bfc54b58..d16b95ca58a7 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7293,6 +7293,8 @@ tracing_free_buffer_release(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+#define TRACE_MARKER_MAX_SIZE		4096
+
 static ssize_t
 tracing_mark_write(struct file *filp, const char __user *ubuf,
 					size_t cnt, loff_t *fpos)
@@ -7320,6 +7322,9 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
 	if ((ssize_t)cnt < 0)
 		return -EINVAL;
 
+	if (cnt > TRACE_MARKER_MAX_SIZE)
+		cnt = TRACE_MARKER_MAX_SIZE;
+
 	meta_size = sizeof(*entry) + 2;  /* add '\0' and possible '\n' */
  again:
 	size = cnt + meta_size;
@@ -7328,11 +7333,6 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
 	if (cnt < FAULTED_SIZE)
 		size += FAULTED_SIZE - cnt;
 
-	if (size > TRACE_SEQ_BUFFER_SIZE) {
-		cnt -= size - TRACE_SEQ_BUFFER_SIZE;
-		goto again;
-	}
-
 	buffer = tr->array_buffer.buffer;
 	event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
 					    tracing_gen_ctx());
-- 
2.43.0




* RE: [for-linus][PATCH 0/3] tracing: Fixes for v6.8
  2024-03-06 18:42 [for-linus][PATCH 0/3] tracing: Fixes for v6.8 Steven Rostedt
                   ` (2 preceding siblings ...)
  2024-03-06 18:42 ` [for-linus][PATCH 3/3] tracing: Limit trace_marker writes to just 4K Steven Rostedt
@ 2024-03-10 18:16 ` David Laight
  2024-03-10 18:36   ` Steven Rostedt
  2024-03-10 19:12   ` Geert Uytterhoeven
  3 siblings, 2 replies; 7+ messages in thread
From: David Laight @ 2024-03-10 18:16 UTC (permalink / raw)
  To: 'Steven Rostedt', linux-kernel
  Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton

...
>   Another issue that was brought up is that the trace_seq buffer is
>   also based on PAGE_SIZE even though it is not tied to the architecture
>   limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
>   simply just too big and wasting memory on systems with 64K page sizes.
>   It is now hardcoded to 8K which is what all other architectures with
>   4K PAGE_SIZE has.

Does Linux use a 2k PAGE_SIZE on any architectures?
IIRC m68k hardware has a 2k page, but Linux might always pair them.
A 2k page might (or might not) cause grief.

	David




* Re: [for-linus][PATCH 0/3] tracing: Fixes for v6.8
  2024-03-10 18:16 ` [for-linus][PATCH 0/3] tracing: Fixes for v6.8 David Laight
@ 2024-03-10 18:36   ` Steven Rostedt
  2024-03-10 19:12   ` Geert Uytterhoeven
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2024-03-10 18:36 UTC (permalink / raw)
  To: David Laight
  Cc: linux-kernel, Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers,
	Andrew Morton

On Sun, 10 Mar 2024 18:16:06 +0000
David Laight <David.Laight@ACULAB.COM> wrote:

> ...
> >   Another issue that was brought up is that the trace_seq buffer is
> >   also based on PAGE_SIZE even though it is not tied to the architecture
> >   limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
> >   simply just too big and wasting memory on systems with 64K page sizes.
> >   It is now hardcoded to 8K which is what all other architectures with
> >   4K PAGE_SIZE has.  
> 
> Does Linux use a 2k PAGE_SIZE on any architectures?
> IIRC m68k hardware has a 2k page, but Linux might always pair them.
> A 2k page might (or might not) cause grief.
> 

The trace_seq is just a buffer to build up the event output string. The
ring buffer sub-buffer is set to page size. For trace_marker, it is
still limited to the size of the ring buffer sub-buffer. If the
sub-buffer is only 2K, the trace_marker write will be broken up into
chunks of less than 2K.

The problem that is being fixed here had nothing to do with the limited
size of the resources. The issue was actually the opposite. On PowerPC,
the PAGE_SIZE being 64K allowed the strings to be that big too. And
what broke was that it was passed to a vsprintf(s, "%.*s", len, str);
where the len was greater than 32K and that caused a warning as the
precision of "%.*s" has a max of signed short.
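
The limit described above can be expressed as a one-line check. This is only
a sketch that takes the "signed short" statement at face value; the helper
name precision_fits_s16() is hypothetical and does not exist in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if len can be used as a "%.*s" precision that is stored
 * in a signed 16-bit field, 0 otherwise.
 */
int precision_fits_s16(size_t len)
{
	return len <= (size_t)INT16_MAX;
}
```

A length near 64K fails this check, while the 4K cap from patch 3 passes it
comfortably.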

2K PAGE_SIZE will still just "work".

-- Steve


* Re: [for-linus][PATCH 0/3] tracing: Fixes for v6.8
  2024-03-10 18:16 ` [for-linus][PATCH 0/3] tracing: Fixes for v6.8 David Laight
  2024-03-10 18:36   ` Steven Rostedt
@ 2024-03-10 19:12   ` Geert Uytterhoeven
  1 sibling, 0 replies; 7+ messages in thread
From: Geert Uytterhoeven @ 2024-03-10 19:12 UTC (permalink / raw)
  To: David Laight
  Cc: Steven Rostedt, linux-kernel, Masami Hiramatsu, Mark Rutland,
	Mathieu Desnoyers, Andrew Morton

Hi David,

On Sun, Mar 10, 2024 at 7:16 PM David Laight <David.Laight@aculab.com> wrote:
> >   Another issue that was brought up is that the trace_seq buffer is
> >   also based on PAGE_SIZE even though it is not tied to the architecture
> >   limit like the ring buffer sub-buffer is. Having it be 64K * 2 is
> >   simply just too big and wasting memory on systems with 64K page sizes.
> >   It is now hardcoded to 8K which is what all other architectures with
> >   4K PAGE_SIZE has.
>
> Does Linux use a 2k PAGE_SIZE on any architectures?
> IIRC m68k hardware has a 2k page, but Linux might always pair them.
> A 2k page might (or might not) cause grief.

Linux/m68k supports only 4 or 8 KiB page sizes, depending on the
MMU hardware, cfr. [1].  While the MC68851 MMU also supports page sizes
of 256 and 512 bytes, and 1, 2, 8, 16, and 32 KiB, that is not yet
supported by Linux.

I really doubt Linux will ever support pages smaller than 4 KiB...

[1] https://lore.kernel.org/all/20240306141453.3900574-4-arnd@kernel.org/#Z2e.:20240306141453.3900574-4-arnd::40kernel.org:1arch:m68k:Kconfig

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

