From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [for-next][PATCH 07/24] tracing: Add better comments for the filtering temp buffer use case
Date: Sat, 26 Jun 2021 09:04:11 -0400
Message-ID: <20210626130535.766916537@goodmis.org>
In-Reply-To: 20210626130404.033700863@goodmis.org

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

When filtering is enabled, the event is copied into a temp buffer instead
of being written into the ring buffer directly, because the discarding of
events from the ring buffer is very expensive, and doing the extra copy is
much faster than having to discard most of the time.

As that logic is subtle, add comments to explain in more detail what is
going on and how it works.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
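Editorial note, not part of the commit: the temp buffer logic described
above can be sketched in userspace roughly as follows. This is only a
simplified model under stated assumptions, not the kernel code. The names
fake_event, temp_words, temp_users, temp_reserve() and temp_release() are
hypothetical, TEMP_BUF_SIZE stands in for PAGE_SIZE, a plain int stands in
for the per-CPU counter, and the decrement on the fallback path is an
assumption of the sketch. Nesting (an interrupt or NMI arriving after the
increment) is modelled by simply calling temp_reserve() a second time.

```c
#include <stddef.h>
#include <stdint.h>

#define TEMP_BUF_SIZE 4096	/* stand-in for PAGE_SIZE; an assumption */

/* Simplified stand-in for struct ring_buffer_event (no bitfields). */
struct fake_event {
	uint32_t type_len;	/* 0: the length is stored in array[0] */
	uint32_t time_delta;
	uint32_t array[];	/* array[0] = length, payload follows */
};

/* One temp buffer and one nesting counter, as if for a single CPU. */
static uint32_t temp_words[TEMP_BUF_SIZE / sizeof(uint32_t)];
static int temp_users;

/*
 * Try to claim the temp buffer for an event of len bytes.
 * Returns NULL when the caller has to fall back to reserving
 * directly on the ring buffer (nested user, or len too big).
 */
static struct fake_event *temp_reserve(size_t len)
{
	struct fake_event *entry = (struct fake_event *)temp_words;
	/* Same idea as PAGE_SIZE - struct_size(entry, array, 1). */
	size_t max_len = TEMP_BUF_SIZE -
			 (sizeof(*entry) + sizeof(entry->array[0]));
	int val = ++temp_users;

	if (val == 1 && len <= max_len) {
		entry->type_len = 0;
		entry->array[0] = (uint32_t)len;
		return entry;
	}

	/* Assumed here: undo the increment before falling back. */
	--temp_users;
	return NULL;
}

/* Drop the claim once the event was copied out or filtered away. */
static void temp_release(void)
{
	--temp_users;
}
```

In this model a caller that gets NULL back would reserve on the ring buffer
directly and pay for a discard if the filter rejects the event, while a
caller that gets the temp event copies it into the ring buffer only on a
filter match and then calls temp_release().
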
 kernel/trace/trace.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a0a84ff46ecd..a0d66a056e59 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2734,10 +2734,44 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb,
 	if (!tr->no_filter_buffering_ref &&
 	    (trace_file->flags & (EVENT_FILE_FL_SOFT_DISABLED | EVENT_FILE_FL_FILTERED)) &&
 	    (entry = this_cpu_read(trace_buffered_event))) {
-		/* Try to use the per cpu buffer first */
+		/*
+		 * Filtering is on, so try to use the per cpu buffer first.
+		 * This buffer will simulate a ring_buffer_event,
+		 * where the type_len is zero and the array[0] will
+		 * hold the full length.
+		 * (see include/linux/ring-buffer.h for details on
+		 *  how the ring_buffer_event is structured).
+		 *
+		 * Using a temp buffer during filtering and copying it
+		 * on a matched filter is quicker than writing directly
+		 * into the ring buffer and then discarding it when
+		 * it doesn't match. That is because the discard
+		 * requires several atomic operations to get right.
+		 * Copying on match and doing nothing on a failed match
+		 * is still quicker than no copy on match, but having
+		 * to discard out of the ring buffer on a failed match.
+		 */
 		int max_len = PAGE_SIZE - struct_size(entry, array, 1);
 
 		val = this_cpu_inc_return(trace_buffered_event_cnt);
+
+		/*
+		 * Preemption is disabled, but interrupts and NMIs
+		 * can still come in now. If that happens after
+		 * the above increment, then it will have to go
+		 * back to the old method of allocating the event
+		 * on the ring buffer, and if the filter fails, it
+		 * will have to call ring_buffer_discard_commit()
+		 * to remove it.
+		 *
+		 * Need to also check the unlikely case that the
+		 * length is bigger than the temp buffer size.
+		 * If that happens, then the reserve is pretty much
+		 * guaranteed to fail, as the ring buffer currently
+		 * only allows events less than a page. But that may
+		 * change in the future, so let the ring buffer reserve
+		 * handle the failure in that case.
+		 */
 		if (val == 1 && likely(len <= max_len)) {
 			trace_event_setup(entry, type, trace_ctx);
 			entry->array[0] = len;
-- 
2.30.2

Thread overview: 26+ messages
2021-06-26 13:04 [for-next][PATCH 00/24] tracing: Last minute updates for 5.14 Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 01/24] bootconfig: Change array value to use child node Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 02/24] bootconfig: Support mixing a value and subkeys under a key Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 03/24] tools/bootconfig: Support mixed value and subkey test cases Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 04/24] docs: bootconfig: Update for mixing value and subkeys Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 05/24] bootconfig: Share the checksum function with tools Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 06/24] tracing: Simplify the max length test when using the filtering temp buffer Steven Rostedt
2021-06-26 13:04 ` Steven Rostedt [this message]
2021-06-26 13:04 ` [for-next][PATCH 08/24] tracing: Add tp_printk_stop_on_boot option Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 09/24] tracing: Have ftrace_dump_on_oops kernel parameter take numbers Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 10/24] bootconfig/tracing/ktest: Add ktest examples of testing bootconfig Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 11/24] trace/hwlat: Fix Clarks email Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 12/24] trace/hwlat: Implement the mode config option Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 13/24] trace/hwlat: Switch disable_migrate to mode none Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 14/24] trace/hwlat: Implement the per-cpu mode Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 15/24] trace: Add a generic function to read/write u64 values from tracefs Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 16/24] trace/hwlat: Use trace_min_max_param for width and window params Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 17/24] trace/hwlat: Remove printk from sampling loop Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 18/24] trace: Add __print_ns_to_secs() and __print_ns_without_secs() helpers Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 19/24] tracing: Add LATENCY_FS_NOTIFY to define if latency_fsnotify() is defined Steven Rostedt
2021-07-20 19:19   ` Arnd Bergmann
2021-06-26 13:04 ` [for-next][PATCH 20/24] trace: Add osnoise tracer Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 21/24] trace: Add timerlat tracer Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 22/24] trace/hwlat: Protect kdata->kthread with get/put_online_cpus Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 23/24] trace/hwlat: Support hotplug operations Steven Rostedt
2021-06-26 13:04 ` [for-next][PATCH 24/24] trace/osnoise: " Steven Rostedt
