From: Daniel Wagner <wagi@monom.org>
To: Tom Zanussi <tom.zanussi@linux.intel.com>, rostedt@goodmis.org
Cc: masami.hiramatsu.pt@hitachi.com, namhyung@kernel.org,
	andi@firstfloor.org, alexei.starovoitov@gmail.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/7] tracing: 'hist' triggers
Date: Mon, 20 Apr 2015 14:52:20 +0200
Message-ID: <5534F684.9050303@monom.org>
In-Reply-To: <cover.1428678702.git.tom.zanussi@linux.intel.com>

Hi Tom,

On 04/10/2015 06:05 PM, Tom Zanussi wrote:
> This is v4 of the 'hist triggers' patchset, following feedback from
> v3.
> 
> This version fixes the race in tracing_map_insert() noted in v3, where
> map.val.key could be checked even if map.val wasn't yet set.  The
> simple fix for that in tracing_map_insert() introduces the possibility
> of duplicates in the map, which though rare, need to be accounted for
> in the output.  To address that, duplicate-merging code was added to
> the map-printing code.

Note: I might be abusing your patch, so this could be a completely
bogus question or bogus feedback.

The most important bit here is that I placed additional tracepoints at
the state transitions. I needed to disable the rcu_dereference
consistency checks to get this working (I am not claiming that this is
the right thing to do):

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index c728513..f194324 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -125,7 +125,7 @@ extern void syscall_unregfunc(void);
                        return;                                         \
                prercu;                                                 \
                rcu_read_lock_sched_notrace();                          \
-               it_func_ptr = rcu_dereference_sched((tp)->funcs);       \
+               it_func_ptr = rcu_dereference_raw_notrace((tp)->funcs); \
                if (it_func_ptr) {                                      \
                        do {                                            \
                                it_func = (it_func_ptr)->func;          \
@@ -175,7 +175,7 @@ extern void syscall_unregfunc(void);
                                TP_CONDITION(cond),,);                  \
                if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {             \
                        rcu_read_lock_sched_notrace();                  \
-                       rcu_dereference_sched(__tracepoint_##name.funcs);\
+                       rcu_dereference_raw_notrace(__tracepoint_##name.funcs);\
                        rcu_read_unlock_sched_notrace();                \
                }                                                       \
        }                                                               \


diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 8523ea3..03f42b9 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -14,6 +14,8 @@
 #include <linux/module.h>
 #include <linux/ftrace.h>
 
+#include <trace/events/sched.h>
+
 #include "trace.h"
 
 static struct trace_array              *irqsoff_trace __read_mostly;
@@ -433,11 +435,13 @@ void start_critical_timings(void)
 {
        if (preempt_trace() || irq_trace())
                start_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
+       trace_sched_latency_critical_timing(1);
 }
 EXPORT_SYMBOL_GPL(start_critical_timings);
 
 void stop_critical_timings(void)
 {
+       trace_sched_latency_critical_timing(0);
        if (preempt_trace() || irq_trace())
                stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
 }
@@ -447,6 +451,7 @@ EXPORT_SYMBOL_GPL(stop_critical_timings);
 #ifdef CONFIG_PROVE_LOCKING
 void time_hardirqs_on(unsigned long a0, unsigned long a1)
 {
+       trace_sched_latency_irqs(0);
        if (!preempt_trace() && irq_trace())
                stop_critical_timing(a0, a1);
 }
@@ -455,6 +460,7 @@ void time_hardirqs_off(unsigned long a0, unsigned long a1)
 {
        if (!preempt_trace() && irq_trace())
                start_critical_timing(a0, a1);
+       trace_sched_latency_irqs(1);
 }

When I now activate the hist trigger with

  echo 'hist:key=common_pid:val=enabled' > \
    /sys/kernel/debug/tracing/events/sched/sched_latency_preempt/trigger

the system crashes reliably after a very short time. The tracepoints
themselves work fine; I do see them in /sys/kernel/debug/tracing/trace.

After some investigation I found out that event_hist_trigger() gets
called with rec=NULL. rec is handed over to hist_field_s32(), and that
function dereferences its event argument, which is NULL here.
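
As far as I can tell from the patch, hist_field_s32() is roughly of
this shape (paraphrased from memory, member names may be off), which
would explain the invalid read near NULL:

static u64 hist_field_s32(struct hist_field *hist_field, void *event)
{
	s32 *addr = (s32 *)(event + hist_field->field->offset);

	return (u64)*addr;	/* event == NULL -> read from the bare field offset */
}

Backtrace: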

#19 event_hist_trigger (data=0xffff88007934c780, rec=0x0 <irq_stack_union>) at kernel/trace/trace_events_hist.c:885
#20 0xffffffff8113771b in event_triggers_call (file=<optimized out>, rec=0x0 <irq_stack_union>) at kernel/trace/trace_events_trigger.c:78
#21 0xffffffff81079def in ftrace_trigger_soft_disabled (file=<optimized out>) at include/linux/ftrace_event.h:453
#22 ftrace_raw_event_sched_latency_template (__data=0xffff88007c895aa8, enabled=0) at include/trace/events/sched.h:557
#23 0xffffffff8112cbec in trace_sched_latency_preempt (enabled=<optimized out>) at include/trace/events/sched.h:587
#24 trace_preempt_off (a0=18446744071579483412, a1=18446744071579485792) at kernel/trace/trace_irqsoff.c:532
#25 0xffffffff8107f449 in preempt_count_add (val=1) at kernel/sched/core.c:2554
#26 0xffffffff8109bd14 in get_lock_stats (class=0xffffffff82e235d0 <lock_classes+273296>) at kernel/locking/lockdep.c:249
#27 0xffffffff8109c660 in lock_release_holdtime (hlock=0xffff880079c8ee50) at kernel/locking/lockdep.c:267
#28 0xffffffff810a2cc9 in lock_release_holdtime (hlock=<optimized out>) at kernel/locking/lockdep.c:3464
#29 lock_release_nested (ip=<optimized out>, lock=<optimized out>, curr=<optimized out>) at kernel/locking/lockdep.c:3481
#30 __lock_release (ip=<optimized out>, nested=<optimized out>, lock=<optimized out>) at kernel/locking/lockdep.c:3507
#31 lock_release (lock=0xffffffff8226e5f0 <kernfs_mutex+112>, nested=<optimized out>, ip=18446744071590106190) at kernel/locking/lockdep.c:3628
#32 0xffffffff81abd306 in __mutex_unlock_common_slowpath (nested=<optimized out>, lock=<optimized out>) at kernel/locking/mutex.c:744
#33 __mutex_unlock_slowpath (lock_count=0xffff88007934c780) at kernel/locking/mutex.c:769
#34 0xffffffff81abd44e in mutex_unlock (lock=<optimized out>) at kernel/locking/mutex.c:446
#35 0xffffffff81238e84 in kernfs_dop_revalidate (dentry=0xffff88007acede00, flags=<optimized out>) at fs/kernfs/dir.c:470
#36 0xffffffff811c9813 in d_revalidate (flags=<optimized out>, dentry=<optimized out>) at fs/namei.c:607
#37 lookup_fast (nd=0xffff8800790e3e48, path=0xffff8800790e3ca8, inode=0xffff8800790e3cb8) at fs/namei.c:1465
#38 0xffffffff811cc05f in walk_component (follow=<optimized out>, path=<optimized out>, nd=<optimized out>) at fs/namei.c:1577
#39 link_path_walk (name=0xffff880079fa602d "virtual/block/loop6/subsystem", nd=0xffff8800790e3e48) at fs/namei.c:1836
#40 0xffffffff811cd327 in path_init (dfd=<optimized out>, name=0xffff880079fa6020 "/sys/devices/virtual/block/loop6/subsystem", flags=2051493056, nd=0xffff8800790e3e48)
    at fs/namei.c:1952
#41 0xffffffff811cda90 in path_lookupat (dfd=<optimized out>, name=<optimized out>, flags=16448, nd=0xffff8800790e3e48) at fs/namei.c:1995
#42 0xffffffff811cec1a in filename_lookup (dfd=-100, name=0xffff880079fa6000, flags=16384, nd=0xffff8800790e3e48) at fs/namei.c:2030
#43 0xffffffff811d0d34 in user_path_at_empty (dfd=-100, name=<optimized out>, flags=16384, path=0xffff8800790e3f38, empty=<optimized out>) at fs/namei.c:2197
#44 0xffffffff811c285c in SYSC_readlinkat (bufsiz=<optimized out>, buf=<optimized out>, pathname=<optimized out>, dfd=<optimized out>) at fs/stat.c:327
#45 SyS_readlinkat (bufsiz=<optimized out>, buf=<optimized out>, pathname=<optimized out>, dfd=<optimized out>) at fs/stat.c:315
#46 SYSC_readlink (bufsiz=<optimized out>, buf=<optimized out>, path=<optimized out>) at fs/stat.c:352
#47 SyS_readlink (path=-131939361831040, buf=0, bufsiz=<optimized out>) at fs/stat.c:349

So I am wondering whether the path from ftrace_trigger_soft_disabled()
to event_triggers_call() with rec=NULL is supposed to never happen. The
comment on ftrace_trigger_soft_disabled() indicates that it can happen
during normal operation:

/**
 * ftrace_trigger_soft_disabled - do triggers and test if soft disabled
 * @file: The file pointer of the event to test
 *
 * If any triggers without filters are attached to this event, they
 * will be called here. If the event is soft disabled and has no
 * triggers that require testing the fields, it will return true,
 * otherwise false.
 */
static inline bool
ftrace_trigger_soft_disabled(struct ftrace_event_file *file)
{
	unsigned long eflags = file->flags;

	if (!(eflags & FTRACE_EVENT_FL_TRIGGER_COND)) {
		if (eflags & FTRACE_EVENT_FL_TRIGGER_MODE)
			event_triggers_call(file, NULL);
		if (eflags & FTRACE_EVENT_FL_SOFT_DISABLED)
			return true;
	}
	return false;
}
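
If that path is legitimate and the hist trigger simply has to cope with
it, I suppose a guard along these lines would be needed (just a sketch,
with the trigger_ops.func() signature as I read it from the patch; I
have no idea whether silently bailing out is acceptable or whether the
trigger should never be called without a record in the first place):

static void event_hist_trigger(struct event_trigger_data *data, void *rec)
{
	/* rec is NULL when called via ftrace_trigger_soft_disabled() */
	if (!rec)
		return;

	/* ... key/val extraction and tracing_map_insert() as in the patch ... */
}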

cheers,
daniel
