linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] trace: Add special x86 irq entry/exit tracepoints
@ 2011-04-22 22:56 Vaibhav Nagarnaik
  2011-04-25 23:41 ` Vaibhav Nagarnaik
  2011-04-29  0:14 ` [PATCH] trace: Add special x86 irq " Thomas Gleixner
  0 siblings, 2 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-04-22 22:56 UTC (permalink / raw)
  To: Steven Rostedt, Thomas Gleixner, Ingo Molnar
  Cc: Michael Rubin, David Sharp, linux-kernel, x86, Jiaying Zhang,
	Vaibhav Nagarnaik

From: Jiaying Zhang <jiayingz@google.com>

With this patch, following 'special' interrupts are logged by ftrace
(when enabled):

Interrupt as in /proc/interrupts        Name as it appears in ftrace log
NMI: Non-maskable interrupts           : NMI_VECTOR
LOC: Local timer interrupts            : LOCAL_TIMER_VECTOR
SPU: Spurious interrupts               : SPURIOUS_APIC_VECTOR
PMI: Performance monitoring interrupts : <not added>
PND: Performance pending work          : LOCAL_PENDING_VECTOR
RES: Rescheduling interrupts           : RESCHEDULE_VECTOR
CAL: Function call interrupts          : CALL_FUNCTION_VECTOR or
                                         CALL_FUNCTION_SINGLE_VECTOR
TLB: TLB shootdowns                    : INVALIDATE_TLB_VECTOR_START to
                                         INVALIDATE_TLB_VECTOR_END
TRM: Thermal event interrupts          : THERMAL_APIC_VECTOR
THR: Threshold APIC interrupts         : THRESHOLD_APIC_VECTOR
MCE: Machine check exceptions          : <not added>
MCP: Machine check polls               : <not added>
ERR:                                   : ERROR_APIC_VECTOR
MIS:                                   : <not added>
PLT: Platform interrupts               : X86_PLATFORM_IPI_VECTOR

Enabled the added tracepoints and verified the output:
$ echo 1 > debug/tracing/events/irq/special_irq_entry/enable
$ echo 1 > debug/tracing/events/irq/special_irq_exit/enable
$ cat debug/tracing/trace | more
<idle>-0     [000]   263.699530: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [003]   263.699700: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [004]   263.699749: special_irq_entry: irq=237
name=INVALIDATE_TLB_VECTOR tlb_vector=30
<...>-9483  [000]   263.700498: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [004]   263.700498: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [003]   263.700519: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1

A custom script compares the two:
$ sh ~/ftrace_test.sh
tracing...
  cpu0: overrun: 0
  cpu1: overrun: 0
  cpu2: overrun: 0
  cpu3: overrun: 0
IRQ ftrace record counts (ignoring numeric  irq=(0|1|4|5|7|9|10|24) )
      2 irq=242 name=INVALIDATE_TLB_VECTOR_2
      3 irq=252 name=CALL_FUNCTION_VECTOR
      6 irq=241 name=INVALIDATE_TLB_VECTOR_1
      6 irq=2 name=NMI_VECTOR
     22 irq=240 name=INVALIDATE_TLB_VECTOR_0
     40 irq=251 name=CALL_FUNCTION_SINGLE_VECTOR
    587 irq=253 name=RESCHEDULE_VECTOR
  15013 irq=239 name=LOCAL_TIMER_VECTOR
/proc/interrupts stats difference
      6 NMI (Non-maskable interrupts)
  14945 LOC (Local timer interrupts)
      6 PMI (Performance monitoring interrupts)
    578 RES (Rescheduling interrupts)
     42 CAL (Function call interrupts)
     30 TLB (TLB shootdowns)
      4 MCP (Machine check polls)

In above output:
NMI (Non-maskable interrupts) == NMI_VECTOR (irq 2)
LOC (Local timer interrupts) == LOCAL_TIMER_VECTOR (irq 239)
RES (Rescheduling interrupts) == RESCHEDULE_VECTOR (irq=253)
CAL (Function call interrupts) == CALL_FUNCTION_VECTOR (irq 252) +
                                  CALL_FUNCTION_SINGLE_VECTOR (irq 251)
TLB (TLB shootdowns) == INVALIDATE_TLB_VECTOR_0 (irq=240) +
                        INVALIDATE_TLB_VECTOR_1 (irq=241) +
                        INVALIDATE_TLB_VECTOR_2 (irq=242)

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
---
 arch/x86/kernel/apic/apic.c              |    7 +++
 arch/x86/kernel/cpu/mcheck/therm_throt.c |    3 +
 arch/x86/kernel/cpu/mcheck/threshold.c   |    3 +
 arch/x86/kernel/irq.c                    |    4 +-
 arch/x86/kernel/irq_work.c               |    4 ++
 arch/x86/kernel/smp.c                    |    7 +++
 arch/x86/kernel/time.c                   |    6 +++
 arch/x86/kernel/traps.c                  |    3 +
 arch/x86/mm/tlb.c                        |    3 +
 include/trace/events/irq.h               |   68 ++++++++++++++++++++++++++++++
 10 files changed, 107 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index fabf01e..e601c27 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -33,6 +33,7 @@
 #include <linux/dmi.h>
 #include <linux/smp.h>
 #include <linux/mm.h>
+#include <trace/events/irq.h>
 
 #include <asm/perf_event.h>
 #include <asm/x86_init.h>
@@ -857,7 +858,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
 	 */
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(LOCAL_TIMER_VECTOR);
 	local_apic_timer_interrupt();
+	trace_special_irq_exit(LOCAL_TIMER_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
@@ -1790,6 +1793,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(SPURIOUS_APIC_VECTOR);
 	/*
 	 * Check if this really is a spurious interrupt and ACK it
 	 * if it is a vectored one.  Just in case...
@@ -1804,6 +1808,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
 	pr_info("spurious APIC interrupt on CPU#%d, "
 		"should never happen.\n", smp_processor_id());
+	trace_special_irq_exit(SPURIOUS_APIC_VECTOR);
 	irq_exit();
 }
 
@@ -1816,6 +1821,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(ERROR_APIC_VECTOR);
 	/* First tickle the hardware, only then report what went on. -- REW */
 	v = apic_read(APIC_ESR);
 	apic_write(APIC_ESR, 0);
@@ -1836,6 +1842,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 	 */
 	pr_debug("APIC error on CPU%d: %02x(%02x)\n",
 		smp_processor_id(), v , v1);
+	trace_special_irq_exit(ERROR_APIC_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 6f8c5e9..abdec08 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/smp.h>
 #include <linux/cpu.h>
+#include <trace/events/irq.h>
 
 #include <asm/processor.h>
 #include <asm/system.h>
@@ -402,8 +403,10 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
 {
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(THERMAL_APIC_VECTOR);
 	inc_irq_stat(irq_thermal_count);
 	smp_thermal_vector();
+	trace_special_irq_exit(THERMAL_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index d746df2..1943d9e 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -3,6 +3,7 @@
  */
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
+#include <trace/events/irq.h>
 
 #include <asm/irq_vectors.h>
 #include <asm/apic.h>
@@ -21,8 +22,10 @@ asmlinkage void smp_threshold_interrupt(void)
 {
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(THRESHOLD_APIC_VECTOR);
 	inc_irq_stat(irq_threshold_count);
 	mce_threshold_vector();
+	trace_special_irq_exit(THRESHOLD_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 1cb0b9f..6f857c5 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -8,6 +8,7 @@
 #include <linux/seq_file.h>
 #include <linux/smp.h>
 #include <linux/ftrace.h>
+#include <trace/events/irq.h>
 #include <linux/delay.h>
 
 #include <asm/apic.h>
@@ -211,12 +212,13 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
 	exit_idle();
 
 	irq_enter();
-
+	trace_special_irq_entry(X86_PLATFORM_IPI_VECTOR);
 	inc_irq_stat(x86_platform_ipis);
 
 	if (x86_platform_ipi_callback)
 		x86_platform_ipi_callback();
 
+	trace_special_irq_exit(X86_PLATFORM_IPI_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index ca8f703..9ee00ba 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -8,13 +8,17 @@
 #include <linux/irq_work.h>
 #include <linux/hardirq.h>
 #include <asm/apic.h>
+#include <asm/irq_vectors.h>
+#include <trace/events/irq.h>
 
 void smp_irq_work_interrupt(struct pt_regs *regs)
 {
 	irq_enter();
 	ack_APIC_irq();
+	trace_special_irq_entry(IRQ_WORK_VECTOR);
 	inc_irq_stat(apic_irq_work_irqs);
 	irq_work_run();
+	trace_special_irq_exit(IRQ_WORK_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 513deac..51ec1f8 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -22,6 +22,7 @@
 #include <linux/interrupt.h>
 #include <linux/cpu.h>
 #include <linux/gfp.h>
+#include <trace/events/irq.h>
 
 #include <asm/mtrr.h>
 #include <asm/tlbflush.h>
@@ -201,7 +202,9 @@ static void native_stop_other_cpus(int wait)
 void smp_reschedule_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
+	trace_special_irq_entry(RESCHEDULE_VECTOR);
 	inc_irq_stat(irq_resched_count);
+	trace_special_irq_exit(RESCHEDULE_VECTOR);
 	/*
 	 * KVM uses this interrupt to force a cpu out of guest mode
 	 */
@@ -211,8 +214,10 @@ void smp_call_function_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_special_irq_entry(CALL_FUNCTION_VECTOR);
 	generic_smp_call_function_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_special_irq_exit(CALL_FUNCTION_VECTOR);
 	irq_exit();
 }
 
@@ -220,8 +225,10 @@ void smp_call_function_single_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_special_irq_entry(CALL_FUNCTION_SINGLE_VECTOR);
 	generic_smp_call_function_single_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_special_irq_exit(CALL_FUNCTION_SINGLE_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 25a28a2..687b087 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -13,6 +13,7 @@
 #include <linux/interrupt.h>
 #include <linux/time.h>
 #include <linux/mca.h>
+#include <trace/events/irq.h>
 
 #include <asm/vsyscall.h>
 #include <asm/x86_init.h>
@@ -26,6 +27,8 @@
 volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
 #endif
 
+static struct irqaction *irq0_action;
+
 unsigned long profile_pc(struct pt_regs *regs)
 {
 	unsigned long pc = instruction_pointer(regs);
@@ -59,7 +62,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	/* Keep nmi watchdog up to date */
 	inc_irq_stat(irq0_irqs);
 
+	trace_irq_handler_entry(0, irq0_action);
 	global_clock_event->event_handler(global_clock_event);
+	trace_irq_handler_exit(0, irq0_action, 1);
 
 	/* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
 	if (MCA_bus)
@@ -76,6 +81,7 @@ static struct irqaction irq0  = {
 
 void __init setup_default_timer_irq(void)
 {
+	irq0_action = &irq0;
 	setup_irq(0, &irq0);
 }
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..4857ff6 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -31,6 +31,7 @@
 #include <linux/mm.h>
 #include <linux/smp.h>
 #include <linux/io.h>
+#include <trace/events/irq.h>
 
 #ifdef CONFIG_EISA
 #include <linux/ioport.h>
@@ -434,12 +435,14 @@ dotraplinkage notrace __kprobes void
 do_nmi(struct pt_regs *regs, long error_code)
 {
 	nmi_enter();
+	trace_special_irq_entry(NMI_VECTOR);
 
 	inc_irq_stat(__nmi_count);
 
 	if (!ignore_nmis)
 		default_do_nmi(regs);
 
+	trace_special_irq_exit(NMI_VECTOR);
 	nmi_exit();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index d6c0418..f119306 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -5,6 +5,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <trace/events/irq.h>
 #include <linux/cpu.h>
 
 #include <asm/tlbflush.h>
@@ -141,6 +142,7 @@ void smp_invalidate_interrupt(struct pt_regs *regs)
 	sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START;
 	f = &flush_state[sender];
 
+	trace_special_irq_entry(INVALIDATE_TLB_VECTOR_START + sender);
 	if (!cpumask_test_cpu(cpu, to_cpumask(f->flush_cpumask)))
 		goto out;
 		/*
@@ -167,6 +169,7 @@ out:
 	cpumask_clear_cpu(cpu, to_cpumask(f->flush_cpumask));
 	smp_mb__after_clear_bit();
 	inc_irq_stat(irq_tlb_count);
+	trace_special_irq_exit(INVALIDATE_TLB_VECTOR_START + sender);
 }
 
 static void flush_tlb_others_ipi(const struct cpumask *cpumask,
diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
index 4b4c7d9..6c0af78 100644
--- a/include/trace/events/irq.h
+++ b/include/trace/events/irq.h
@@ -139,6 +139,74 @@ DEFINE_EVENT(softirq, softirq_raise,
 	TP_ARGS(vec_nr)
 );
 
+#ifdef CONFIG_X86
+#include <asm/irq_vectors.h>
+
+#define special_irq_name(sirq) { sirq, #sirq }
+#define show_special_irq_name(val)					\
+	__print_symbolic(val,						\
+			 special_irq_name(NMI_VECTOR),			\
+			 special_irq_name(LOCAL_TIMER_VECTOR),		\
+			 special_irq_name(ERROR_APIC_VECTOR),		\
+			 special_irq_name(RESCHEDULE_VECTOR),		\
+			 special_irq_name(CALL_FUNCTION_VECTOR),	\
+			 special_irq_name(CALL_FUNCTION_SINGLE_VECTOR),	\
+			 special_irq_name(THERMAL_APIC_VECTOR),		\
+			 special_irq_name(THRESHOLD_APIC_VECTOR),	\
+			 special_irq_name(REBOOT_VECTOR),		\
+			 special_irq_name(SPURIOUS_APIC_VECTOR),	\
+			 special_irq_name(IRQ_WORK_VECTOR),		\
+			 special_irq_name(X86_PLATFORM_IPI_VECTOR)	\
+			)
+
+#define IS_INVALIDATE_TLB_VECTOR(__irq) (\
+		__irq >= INVALIDATE_TLB_VECTOR_START && \
+		__irq <= INVALIDATE_TLB_VECTOR_END)
+
+DECLARE_EVENT_CLASS(special_irq,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq),
+
+	TP_STRUCT__entry(
+		__field(	int,	irq	)
+	),
+
+	TP_fast_assign(
+		__entry->irq = irq;
+	),
+
+	TP_printk("irq=%d name=%s tlb_vector=%d", __entry->irq,
+		IS_INVALIDATE_TLB_VECTOR(__entry->irq) ?
+		"INVALIDATE_TLB_VECTOR" : show_special_irq_name(__entry->irq),
+		IS_INVALIDATE_TLB_VECTOR(__entry->irq) ?
+		__entry->irq - INVALIDATE_TLB_VECTOR_START : -1)
+);
+
+/*
+ * special_irq_entry - called before enterring a special interrupt
+ * such as apic timer, spurious interrupt, and etc.
+ */
+DEFINE_EVENT(special_irq, special_irq_entry,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
+/*
+ * special_irq_exit - called immediately after the special interrupt
+ * handler returns
+ */
+DEFINE_EVENT(special_irq, special_irq_exit,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+#endif /* CONFIG_X86 */
+
 #endif /*  _TRACE_IRQ_H */
 
 /* This part must be outside protection */
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH] trace: Add special x86 irq entry/exit tracepoints
  2011-04-22 22:56 [PATCH] trace: Add special x86 irq entry/exit tracepoints Vaibhav Nagarnaik
@ 2011-04-25 23:41 ` Vaibhav Nagarnaik
  2011-04-28 23:16   ` Vaibhav Nagarnaik
                     ` (2 more replies)
  2011-04-29  0:14 ` [PATCH] trace: Add special x86 irq " Thomas Gleixner
  1 sibling, 3 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-04-25 23:41 UTC (permalink / raw)
  To: Steven Rostedt, Thomas Gleixner, Ingo Molnar
  Cc: Michael Rubin, David Sharp, linux-kernel, x86, Jiaying Zhang,
	Vaibhav Nagarnaik

From: Jiaying Zhang <jiayingz@google.com>

Apart from the normal interrupts, we often also need to trace
architecture special interrupts. The following patch adds the event
definition and trace instrumentation for some x86 special interrupts. We
hope the similar approach can also be used to trace special interrupts
on other architectures. However, the patch only supports x86 at this
point.

With this patch, following 'special' interrupts are logged by ftrace
(when enabled):

Interrupt as in /proc/interrupts        Name as it appears in ftrace log
NMI: Non-maskable interrupts           : NMI_VECTOR
LOC: Local timer interrupts            : LOCAL_TIMER_VECTOR
SPU: Spurious interrupts               : SPURIOUS_APIC_VECTOR
PMI: Performance monitoring interrupts : <not added>
PND: Performance pending work          : LOCAL_PENDING_VECTOR
RES: Rescheduling interrupts           : RESCHEDULE_VECTOR
CAL: Function call interrupts          : CALL_FUNCTION_VECTOR or
                                         CALL_FUNCTION_SINGLE_VECTOR
TLB: TLB shootdowns                    : INVALIDATE_TLB_VECTOR_START to
                                         INVALIDATE_TLB_VECTOR_END
TRM: Thermal event interrupts          : THERMAL_APIC_VECTOR
THR: Threshold APIC interrupts         : THRESHOLD_APIC_VECTOR
MCE: Machine check exceptions          : <not added>
MCP: Machine check polls               : <not added>
ERR:                                   : ERROR_APIC_VECTOR
MIS:                                   : <not added>
PLT: Platform interrupts               : X86_PLATFORM_IPI_VECTOR

Enabled the added tracepoints and verified the output:
$ echo 1 > debug/tracing/events/irq/special_irq_entry/enable
$ echo 1 > debug/tracing/events/irq/special_irq_exit/enable
$ cat debug/tracing/trace | more
<idle>-0     [000]   263.699530: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [003]   263.699700: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [004]   263.699749: special_irq_entry: irq=237
name=INVALIDATE_TLB_VECTOR tlb_vector=30
<...>-9483  [000]   263.700498: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [004]   263.700498: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1
<idle>-0     [003]   263.700519: special_irq_entry: irq=239
name=LOCAL_TIMER_VECTOR tlb_vector=-1

A custom script compares the two:
$ sh ~/ftrace_test.sh
tracing...
  cpu0: overrun: 0
  cpu1: overrun: 0
  cpu2: overrun: 0
  cpu3: overrun: 0
IRQ ftrace record counts (ignoring numeric  irq=(0|1|4|5|7|9|10|24) )
      2 irq=242 name=INVALIDATE_TLB_VECTOR_2
      3 irq=252 name=CALL_FUNCTION_VECTOR
      6 irq=241 name=INVALIDATE_TLB_VECTOR_1
      6 irq=2 name=NMI_VECTOR
     22 irq=240 name=INVALIDATE_TLB_VECTOR_0
     40 irq=251 name=CALL_FUNCTION_SINGLE_VECTOR
    587 irq=253 name=RESCHEDULE_VECTOR
  15013 irq=239 name=LOCAL_TIMER_VECTOR
/proc/interrupts stats difference
      6 NMI (Non-maskable interrupts)
  14945 LOC (Local timer interrupts)
      6 PMI (Performance monitoring interrupts)
    578 RES (Rescheduling interrupts)
     42 CAL (Function call interrupts)
     30 TLB (TLB shootdowns)
      4 MCP (Machine check polls)

In above output:
NMI (Non-maskable interrupts) == NMI_VECTOR (irq 2)
LOC (Local timer interrupts) == LOCAL_TIMER_VECTOR (irq 239)
RES (Rescheduling interrupts) == RESCHEDULE_VECTOR (irq=253)
CAL (Function call interrupts) == CALL_FUNCTION_VECTOR (irq 252) +
                                  CALL_FUNCTION_SINGLE_VECTOR (irq 251)
TLB (TLB shootdowns) == INVALIDATE_TLB_VECTOR_0 (irq=240) +
                        INVALIDATE_TLB_VECTOR_1 (irq=241) +
                        INVALIDATE_TLB_VECTOR_2 (irq=242)

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
---
 arch/x86/kernel/apic/apic.c              |    7 +++
 arch/x86/kernel/cpu/mcheck/therm_throt.c |    3 +
 arch/x86/kernel/cpu/mcheck/threshold.c   |    3 +
 arch/x86/kernel/irq.c                    |    4 +-
 arch/x86/kernel/irq_work.c               |    4 ++
 arch/x86/kernel/smp.c                    |    7 +++
 arch/x86/kernel/time.c                   |    6 +++
 arch/x86/kernel/traps.c                  |    3 +
 arch/x86/mm/tlb.c                        |    3 +
 include/trace/events/irq.h               |   68 ++++++++++++++++++++++++++++++
 10 files changed, 107 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index fabf01e..e601c27 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -33,6 +33,7 @@
 #include <linux/dmi.h>
 #include <linux/smp.h>
 #include <linux/mm.h>
+#include <trace/events/irq.h>
 
 #include <asm/perf_event.h>
 #include <asm/x86_init.h>
@@ -857,7 +858,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
 	 */
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(LOCAL_TIMER_VECTOR);
 	local_apic_timer_interrupt();
+	trace_special_irq_exit(LOCAL_TIMER_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
@@ -1790,6 +1793,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(SPURIOUS_APIC_VECTOR);
 	/*
 	 * Check if this really is a spurious interrupt and ACK it
 	 * if it is a vectored one.  Just in case...
@@ -1804,6 +1808,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
 	pr_info("spurious APIC interrupt on CPU#%d, "
 		"should never happen.\n", smp_processor_id());
+	trace_special_irq_exit(SPURIOUS_APIC_VECTOR);
 	irq_exit();
 }
 
@@ -1816,6 +1821,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(ERROR_APIC_VECTOR);
 	/* First tickle the hardware, only then report what went on. -- REW */
 	v = apic_read(APIC_ESR);
 	apic_write(APIC_ESR, 0);
@@ -1836,6 +1842,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 	 */
 	pr_debug("APIC error on CPU%d: %02x(%02x)\n",
 		smp_processor_id(), v , v1);
+	trace_special_irq_exit(ERROR_APIC_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 6f8c5e9..abdec08 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/smp.h>
 #include <linux/cpu.h>
+#include <trace/events/irq.h>
 
 #include <asm/processor.h>
 #include <asm/system.h>
@@ -402,8 +403,10 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
 {
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(THERMAL_APIC_VECTOR);
 	inc_irq_stat(irq_thermal_count);
 	smp_thermal_vector();
+	trace_special_irq_exit(THERMAL_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index d746df2..1943d9e 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -3,6 +3,7 @@
  */
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
+#include <trace/events/irq.h>
 
 #include <asm/irq_vectors.h>
 #include <asm/apic.h>
@@ -21,8 +22,10 @@ asmlinkage void smp_threshold_interrupt(void)
 {
 	exit_idle();
 	irq_enter();
+	trace_special_irq_entry(THRESHOLD_APIC_VECTOR);
 	inc_irq_stat(irq_threshold_count);
 	mce_threshold_vector();
+	trace_special_irq_exit(THRESHOLD_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 1cb0b9f..6f857c5 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -8,6 +8,7 @@
 #include <linux/seq_file.h>
 #include <linux/smp.h>
 #include <linux/ftrace.h>
+#include <trace/events/irq.h>
 #include <linux/delay.h>
 
 #include <asm/apic.h>
@@ -211,12 +212,13 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
 	exit_idle();
 
 	irq_enter();
-
+	trace_special_irq_entry(X86_PLATFORM_IPI_VECTOR);
 	inc_irq_stat(x86_platform_ipis);
 
 	if (x86_platform_ipi_callback)
 		x86_platform_ipi_callback();
 
+	trace_special_irq_exit(X86_PLATFORM_IPI_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index ca8f703..9ee00ba 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -8,13 +8,17 @@
 #include <linux/irq_work.h>
 #include <linux/hardirq.h>
 #include <asm/apic.h>
+#include <asm/irq_vectors.h>
+#include <trace/events/irq.h>
 
 void smp_irq_work_interrupt(struct pt_regs *regs)
 {
 	irq_enter();
 	ack_APIC_irq();
+	trace_special_irq_entry(IRQ_WORK_VECTOR);
 	inc_irq_stat(apic_irq_work_irqs);
 	irq_work_run();
+	trace_special_irq_exit(IRQ_WORK_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 513deac..51ec1f8 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -22,6 +22,7 @@
 #include <linux/interrupt.h>
 #include <linux/cpu.h>
 #include <linux/gfp.h>
+#include <trace/events/irq.h>
 
 #include <asm/mtrr.h>
 #include <asm/tlbflush.h>
@@ -201,7 +202,9 @@ static void native_stop_other_cpus(int wait)
 void smp_reschedule_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
+	trace_special_irq_entry(RESCHEDULE_VECTOR);
 	inc_irq_stat(irq_resched_count);
+	trace_special_irq_exit(RESCHEDULE_VECTOR);
 	/*
 	 * KVM uses this interrupt to force a cpu out of guest mode
 	 */
@@ -211,8 +214,10 @@ void smp_call_function_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_special_irq_entry(CALL_FUNCTION_VECTOR);
 	generic_smp_call_function_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_special_irq_exit(CALL_FUNCTION_VECTOR);
 	irq_exit();
 }
 
@@ -220,8 +225,10 @@ void smp_call_function_single_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_special_irq_entry(CALL_FUNCTION_SINGLE_VECTOR);
 	generic_smp_call_function_single_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_special_irq_exit(CALL_FUNCTION_SINGLE_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 25a28a2..687b087 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -13,6 +13,7 @@
 #include <linux/interrupt.h>
 #include <linux/time.h>
 #include <linux/mca.h>
+#include <trace/events/irq.h>
 
 #include <asm/vsyscall.h>
 #include <asm/x86_init.h>
@@ -26,6 +27,8 @@
 volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
 #endif
 
+static struct irqaction *irq0_action;
+
 unsigned long profile_pc(struct pt_regs *regs)
 {
 	unsigned long pc = instruction_pointer(regs);
@@ -59,7 +62,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	/* Keep nmi watchdog up to date */
 	inc_irq_stat(irq0_irqs);
 
+	trace_irq_handler_entry(0, irq0_action);
 	global_clock_event->event_handler(global_clock_event);
+	trace_irq_handler_exit(0, irq0_action, 1);
 
 	/* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
 	if (MCA_bus)
@@ -76,6 +81,7 @@ static struct irqaction irq0  = {
 
 void __init setup_default_timer_irq(void)
 {
+	irq0_action = &irq0;
 	setup_irq(0, &irq0);
 }
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..4857ff6 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -31,6 +31,7 @@
 #include <linux/mm.h>
 #include <linux/smp.h>
 #include <linux/io.h>
+#include <trace/events/irq.h>
 
 #ifdef CONFIG_EISA
 #include <linux/ioport.h>
@@ -434,12 +435,14 @@ dotraplinkage notrace __kprobes void
 do_nmi(struct pt_regs *regs, long error_code)
 {
 	nmi_enter();
+	trace_special_irq_entry(NMI_VECTOR);
 
 	inc_irq_stat(__nmi_count);
 
 	if (!ignore_nmis)
 		default_do_nmi(regs);
 
+	trace_special_irq_exit(NMI_VECTOR);
 	nmi_exit();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index d6c0418..f119306 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -5,6 +5,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <trace/events/irq.h>
 #include <linux/cpu.h>
 
 #include <asm/tlbflush.h>
@@ -141,6 +142,7 @@ void smp_invalidate_interrupt(struct pt_regs *regs)
 	sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START;
 	f = &flush_state[sender];
 
+	trace_special_irq_entry(INVALIDATE_TLB_VECTOR_START + sender);
 	if (!cpumask_test_cpu(cpu, to_cpumask(f->flush_cpumask)))
 		goto out;
 		/*
@@ -167,6 +169,7 @@ out:
 	cpumask_clear_cpu(cpu, to_cpumask(f->flush_cpumask));
 	smp_mb__after_clear_bit();
 	inc_irq_stat(irq_tlb_count);
+	trace_special_irq_exit(INVALIDATE_TLB_VECTOR_START + sender);
 }
 
 static void flush_tlb_others_ipi(const struct cpumask *cpumask,
diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
index 4b4c7d9..6c0af78 100644
--- a/include/trace/events/irq.h
+++ b/include/trace/events/irq.h
@@ -139,6 +139,74 @@ DEFINE_EVENT(softirq, softirq_raise,
 	TP_ARGS(vec_nr)
 );
 
+#ifdef CONFIG_X86
+#include <asm/irq_vectors.h>
+
+#define special_irq_name(sirq) { sirq, #sirq }
+#define show_special_irq_name(val)					\
+	__print_symbolic(val,						\
+			 special_irq_name(NMI_VECTOR),			\
+			 special_irq_name(LOCAL_TIMER_VECTOR),		\
+			 special_irq_name(ERROR_APIC_VECTOR),		\
+			 special_irq_name(RESCHEDULE_VECTOR),		\
+			 special_irq_name(CALL_FUNCTION_VECTOR),	\
+			 special_irq_name(CALL_FUNCTION_SINGLE_VECTOR),	\
+			 special_irq_name(THERMAL_APIC_VECTOR),		\
+			 special_irq_name(THRESHOLD_APIC_VECTOR),	\
+			 special_irq_name(REBOOT_VECTOR),		\
+			 special_irq_name(SPURIOUS_APIC_VECTOR),	\
+			 special_irq_name(IRQ_WORK_VECTOR),		\
+			 special_irq_name(X86_PLATFORM_IPI_VECTOR)	\
+			)
+
+#define IS_INVALIDATE_TLB_VECTOR(__irq) (\
+		__irq >= INVALIDATE_TLB_VECTOR_START && \
+		__irq <= INVALIDATE_TLB_VECTOR_END)
+
+DECLARE_EVENT_CLASS(special_irq,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq),
+
+	TP_STRUCT__entry(
+		__field(	int,	irq	)
+	),
+
+	TP_fast_assign(
+		__entry->irq = irq;
+	),
+
+	TP_printk("irq=%d name=%s tlb_vector=%d", __entry->irq,
+		IS_INVALIDATE_TLB_VECTOR(__entry->irq) ?
+		"INVALIDATE_TLB_VECTOR" : show_special_irq_name(__entry->irq),
+		IS_INVALIDATE_TLB_VECTOR(__entry->irq) ?
+		__entry->irq - INVALIDATE_TLB_VECTOR_START : -1)
+);
+
+/*
+ * special_irq_entry - called before enterring a special interrupt
+ * such as apic timer, spurious interrupt, and etc.
+ */
+DEFINE_EVENT(special_irq, special_irq_entry,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
+/*
+ * special_irq_exit - called immediately after the special interrupt
+ * handler returns
+ */
+DEFINE_EVENT(special_irq, special_irq_exit,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+#endif /* CONFIG_X86 */
+
 #endif /*  _TRACE_IRQ_H */
 
 /* This part must be outside protection */
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] trace: Add special x86 irq entry/exit tracepoints
  2011-04-25 23:41 ` Vaibhav Nagarnaik
@ 2011-04-28 23:16   ` Vaibhav Nagarnaik
  2011-04-28 23:41   ` Steven Rostedt
  2011-04-29 20:12   ` [PATCH] trace: Add x86 irq vector " Vaibhav Nagarnaik
  2 siblings, 0 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-04-28 23:16 UTC (permalink / raw)
  To: Steven Rostedt, Thomas Gleixner, Ingo Molnar
  Cc: Michael Rubin, David Sharp, linux-kernel, x86, Jiaying Zhang,
	Vaibhav Nagarnaik

All

Can you take a look at this patch?


Thanks
Vaibhav Nagarnaik



On Mon, Apr 25, 2011 at 4:41 PM, Vaibhav Nagarnaik
<vnagarnaik@google.com> wrote:
> From: Jiaying Zhang <jiayingz@google.com>
>
> Apart from the normal interrupts, we often also need to trace
> architecture special interrupts. The following patch adds the event
> definition and trace instrumentation for some x86 special interrupts. We
> hope the similar approach can also be used to trace special interrupts
> on other architectures. However, the patch only supports x86 at this
> point.
>
> With this patch, following 'special' interrupts are logged by ftrace
> (when enabled):
>
> Interrupt as in /proc/interrupts        Name as it appears in ftrace log
> NMI: Non-maskable interrupts           : NMI_VECTOR
> LOC: Local timer interrupts            : LOCAL_TIMER_VECTOR
> SPU: Spurious interrupts               : SPURIOUS_APIC_VECTOR
> PMI: Performance monitoring interrupts : <not added>
> PND: Performance pending work          : LOCAL_PENDING_VECTOR
> RES: Rescheduling interrupts           : RESCHEDULE_VECTOR
> CAL: Function call interrupts          : CALL_FUNCTION_VECTOR or
>                                         CALL_FUNCTION_SINGLE_VECTOR
> TLB: TLB shootdowns                    : INVALIDATE_TLB_VECTOR_START to
>                                         INVALIDATE_TLB_VECTOR_END
> TRM: Thermal event interrupts          : THERMAL_APIC_VECTOR
> THR: Threshold APIC interrupts         : THRESHOLD_APIC_VECTOR
> MCE: Machine check exceptions          : <not added>
> MCP: Machine check polls               : <not added>
> ERR:                                   : ERROR_APIC_VECTOR
> MIS:                                   : <not added>
> PLT: Platform interrupts               : X86_PLATFORM_IPI_VECTOR
>
> Enabled the added tracepoints and verified the output:
> $ echo 1 > debug/tracing/events/irq/special_irq_entry/enable
> $ echo 1 > debug/tracing/events/irq/special_irq_exit/enable
> $ cat debug/tracing/trace | more
> <idle>-0     [000]   263.699530: special_irq_entry: irq=239
> name=LOCAL_TIMER_VECTOR tlb_vector=-1
> <idle>-0     [003]   263.699700: special_irq_entry: irq=239
> name=LOCAL_TIMER_VECTOR tlb_vector=-1
> <idle>-0     [004]   263.699749: special_irq_entry: irq=237
> name=INVALIDATE_TLB_VECTOR tlb_vector=30
> <...>-9483  [000]   263.700498: special_irq_entry: irq=239
> name=LOCAL_TIMER_VECTOR tlb_vector=-1
> <idle>-0     [004]   263.700498: special_irq_entry: irq=239
> name=LOCAL_TIMER_VECTOR tlb_vector=-1
> <idle>-0     [003]   263.700519: special_irq_entry: irq=239
> name=LOCAL_TIMER_VECTOR tlb_vector=-1
>
> A custom script compares the two:
> $ sh ~/ftrace_test.sh
> tracing...
>  cpu0: overrun: 0
>  cpu1: overrun: 0
>  cpu2: overrun: 0
>  cpu3: overrun: 0
> IRQ ftrace record counts (ignoring numeric  irq=(0|1|4|5|7|9|10|24) )
>      2 irq=242 name=INVALIDATE_TLB_VECTOR_2
>      3 irq=252 name=CALL_FUNCTION_VECTOR
>      6 irq=241 name=INVALIDATE_TLB_VECTOR_1
>      6 irq=2 name=NMI_VECTOR
>     22 irq=240 name=INVALIDATE_TLB_VECTOR_0
>     40 irq=251 name=CALL_FUNCTION_SINGLE_VECTOR
>    587 irq=253 name=RESCHEDULE_VECTOR
>  15013 irq=239 name=LOCAL_TIMER_VECTOR
> /proc/interrupts stats difference
>      6 NMI (Non-maskable interrupts)
>  14945 LOC (Local timer interrupts)
>      6 PMI (Performance monitoring interrupts)
>    578 RES (Rescheduling interrupts)
>     42 CAL (Function call interrupts)
>     30 TLB (TLB shootdowns)
>      4 MCP (Machine check polls)
>
> In above output:
> NMI (Non-maskable interrupts) == NMI_VECTOR (irq 2)
> LOC (Local timer interrupts) == LOCAL_TIMER_VECTOR (irq 239)
> RES (Rescheduling interrupts) == RESCHEDULE_VECTOR (irq=253)
> CAL (Function call interrupts) == CALL_FUNCTION_VECTOR (irq 252) +
>                                  CALL_FUNCTION_SINGLE_VECTOR (irq 251)
> TLB (TLB shootdowns) == INVALIDATE_TLB_VECTOR_0 (irq=240) +
>                        INVALIDATE_TLB_VECTOR_1 (irq=241) +
>                        INVALIDATE_TLB_VECTOR_2 (irq=242)
>
> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
> ---
>  arch/x86/kernel/apic/apic.c              |    7 +++
>  arch/x86/kernel/cpu/mcheck/therm_throt.c |    3 +
>  arch/x86/kernel/cpu/mcheck/threshold.c   |    3 +
>  arch/x86/kernel/irq.c                    |    4 +-
>  arch/x86/kernel/irq_work.c               |    4 ++
>  arch/x86/kernel/smp.c                    |    7 +++
>  arch/x86/kernel/time.c                   |    6 +++
>  arch/x86/kernel/traps.c                  |    3 +
>  arch/x86/mm/tlb.c                        |    3 +
>  include/trace/events/irq.h               |   68 ++++++++++++++++++++++++++++++
>  10 files changed, 107 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index fabf01e..e601c27 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -33,6 +33,7 @@
>  #include <linux/dmi.h>
>  #include <linux/smp.h>
>  #include <linux/mm.h>
> +#include <trace/events/irq.h>
>
>  #include <asm/perf_event.h>
>  #include <asm/x86_init.h>
> @@ -857,7 +858,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
>         */
>        exit_idle();
>        irq_enter();
> +       trace_special_irq_entry(LOCAL_TIMER_VECTOR);
>        local_apic_timer_interrupt();
> +       trace_special_irq_exit(LOCAL_TIMER_VECTOR);
>        irq_exit();
>
>        set_irq_regs(old_regs);
> @@ -1790,6 +1793,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
>
>        exit_idle();
>        irq_enter();
> +       trace_special_irq_entry(SPURIOUS_APIC_VECTOR);
>        /*
>         * Check if this really is a spurious interrupt and ACK it
>         * if it is a vectored one.  Just in case...
> @@ -1804,6 +1808,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
>        /* see sw-dev-man vol 3, chapter 7.4.13.5 */
>        pr_info("spurious APIC interrupt on CPU#%d, "
>                "should never happen.\n", smp_processor_id());
> +       trace_special_irq_exit(SPURIOUS_APIC_VECTOR);
>        irq_exit();
>  }
>
> @@ -1816,6 +1821,7 @@ void smp_error_interrupt(struct pt_regs *regs)
>
>        exit_idle();
>        irq_enter();
> +       trace_special_irq_entry(ERROR_APIC_VECTOR);
>        /* First tickle the hardware, only then report what went on. -- REW */
>        v = apic_read(APIC_ESR);
>        apic_write(APIC_ESR, 0);
> @@ -1836,6 +1842,7 @@ void smp_error_interrupt(struct pt_regs *regs)
>         */
>        pr_debug("APIC error on CPU%d: %02x(%02x)\n",
>                smp_processor_id(), v , v1);
> +       trace_special_irq_exit(ERROR_APIC_VECTOR);
>        irq_exit();
>  }
>
> diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> index 6f8c5e9..abdec08 100644
> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> @@ -23,6 +23,7 @@
>  #include <linux/init.h>
>  #include <linux/smp.h>
>  #include <linux/cpu.h>
> +#include <trace/events/irq.h>
>
>  #include <asm/processor.h>
>  #include <asm/system.h>
> @@ -402,8 +403,10 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
>  {
>        exit_idle();
>        irq_enter();
> +       trace_special_irq_entry(THERMAL_APIC_VECTOR);
>        inc_irq_stat(irq_thermal_count);
>        smp_thermal_vector();
> +       trace_special_irq_exit(THERMAL_APIC_VECTOR);
>        irq_exit();
>        /* Ack only at the end to avoid potential reentry */
>        ack_APIC_irq();
> diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
> index d746df2..1943d9e 100644
> --- a/arch/x86/kernel/cpu/mcheck/threshold.c
> +++ b/arch/x86/kernel/cpu/mcheck/threshold.c
> @@ -3,6 +3,7 @@
>  */
>  #include <linux/interrupt.h>
>  #include <linux/kernel.h>
> +#include <trace/events/irq.h>
>
>  #include <asm/irq_vectors.h>
>  #include <asm/apic.h>
> @@ -21,8 +22,10 @@ asmlinkage void smp_threshold_interrupt(void)
>  {
>        exit_idle();
>        irq_enter();
> +       trace_special_irq_entry(THRESHOLD_APIC_VECTOR);
>        inc_irq_stat(irq_threshold_count);
>        mce_threshold_vector();
> +       trace_special_irq_exit(THRESHOLD_APIC_VECTOR);
>        irq_exit();
>        /* Ack only at the end to avoid potential reentry */
>        ack_APIC_irq();
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 1cb0b9f..6f857c5 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -8,6 +8,7 @@
>  #include <linux/seq_file.h>
>  #include <linux/smp.h>
>  #include <linux/ftrace.h>
> +#include <trace/events/irq.h>
>  #include <linux/delay.h>
>
>  #include <asm/apic.h>
> @@ -211,12 +212,13 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
>        exit_idle();
>
>        irq_enter();
> -
> +       trace_special_irq_entry(X86_PLATFORM_IPI_VECTOR);
>        inc_irq_stat(x86_platform_ipis);
>
>        if (x86_platform_ipi_callback)
>                x86_platform_ipi_callback();
>
> +       trace_special_irq_exit(X86_PLATFORM_IPI_VECTOR);
>        irq_exit();
>
>        set_irq_regs(old_regs);
> diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
> index ca8f703..9ee00ba 100644
> --- a/arch/x86/kernel/irq_work.c
> +++ b/arch/x86/kernel/irq_work.c
> @@ -8,13 +8,17 @@
>  #include <linux/irq_work.h>
>  #include <linux/hardirq.h>
>  #include <asm/apic.h>
> +#include <asm/irq_vectors.h>
> +#include <trace/events/irq.h>
>
>  void smp_irq_work_interrupt(struct pt_regs *regs)
>  {
>        irq_enter();
>        ack_APIC_irq();
> +       trace_special_irq_entry(IRQ_WORK_VECTOR);
>        inc_irq_stat(apic_irq_work_irqs);
>        irq_work_run();
> +       trace_special_irq_exit(IRQ_WORK_VECTOR);
>        irq_exit();
>  }
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 513deac..51ec1f8 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -22,6 +22,7 @@
>  #include <linux/interrupt.h>
>  #include <linux/cpu.h>
>  #include <linux/gfp.h>
> +#include <trace/events/irq.h>
>
>  #include <asm/mtrr.h>
>  #include <asm/tlbflush.h>
> @@ -201,7 +202,9 @@ static void native_stop_other_cpus(int wait)
>  void smp_reschedule_interrupt(struct pt_regs *regs)
>  {
>        ack_APIC_irq();
> +       trace_special_irq_entry(RESCHEDULE_VECTOR);
>        inc_irq_stat(irq_resched_count);
> +       trace_special_irq_exit(RESCHEDULE_VECTOR);
>        /*
>         * KVM uses this interrupt to force a cpu out of guest mode
>         */
> @@ -211,8 +214,10 @@ void smp_call_function_interrupt(struct pt_regs *regs)
>  {
>        ack_APIC_irq();
>        irq_enter();
> +       trace_special_irq_entry(CALL_FUNCTION_VECTOR);
>        generic_smp_call_function_interrupt();
>        inc_irq_stat(irq_call_count);
> +       trace_special_irq_exit(CALL_FUNCTION_VECTOR);
>        irq_exit();
>  }
>
> @@ -220,8 +225,10 @@ void smp_call_function_single_interrupt(struct pt_regs *regs)
>  {
>        ack_APIC_irq();
>        irq_enter();
> +       trace_special_irq_entry(CALL_FUNCTION_SINGLE_VECTOR);
>        generic_smp_call_function_single_interrupt();
>        inc_irq_stat(irq_call_count);
> +       trace_special_irq_exit(CALL_FUNCTION_SINGLE_VECTOR);
>        irq_exit();
>  }
>
> diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
> index 25a28a2..687b087 100644
> --- a/arch/x86/kernel/time.c
> +++ b/arch/x86/kernel/time.c
> @@ -13,6 +13,7 @@
>  #include <linux/interrupt.h>
>  #include <linux/time.h>
>  #include <linux/mca.h>
> +#include <trace/events/irq.h>
>
>  #include <asm/vsyscall.h>
>  #include <asm/x86_init.h>
> @@ -26,6 +27,8 @@
>  volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
>  #endif
>
> +static struct irqaction *irq0_action;
> +
>  unsigned long profile_pc(struct pt_regs *regs)
>  {
>        unsigned long pc = instruction_pointer(regs);
> @@ -59,7 +62,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
>        /* Keep nmi watchdog up to date */
>        inc_irq_stat(irq0_irqs);
>
> +       trace_irq_handler_entry(0, irq0_action);
>        global_clock_event->event_handler(global_clock_event);
> +       trace_irq_handler_exit(0, irq0_action, 1);
>
>        /* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
>        if (MCA_bus)
> @@ -76,6 +81,7 @@ static struct irqaction irq0  = {
>
>  void __init setup_default_timer_irq(void)
>  {
> +       irq0_action = &irq0;
>        setup_irq(0, &irq0);
>  }
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index b9b6716..4857ff6 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -31,6 +31,7 @@
>  #include <linux/mm.h>
>  #include <linux/smp.h>
>  #include <linux/io.h>
> +#include <trace/events/irq.h>
>
>  #ifdef CONFIG_EISA
>  #include <linux/ioport.h>
> @@ -434,12 +435,14 @@ dotraplinkage notrace __kprobes void
>  do_nmi(struct pt_regs *regs, long error_code)
>  {
>        nmi_enter();
> +       trace_special_irq_entry(NMI_VECTOR);
>
>        inc_irq_stat(__nmi_count);
>
>        if (!ignore_nmis)
>                default_do_nmi(regs);
>
> +       trace_special_irq_exit(NMI_VECTOR);
>        nmi_exit();
>  }
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index d6c0418..f119306 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -5,6 +5,7 @@
>  #include <linux/smp.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
> +#include <trace/events/irq.h>
>  #include <linux/cpu.h>
>
>  #include <asm/tlbflush.h>
> @@ -141,6 +142,7 @@ void smp_invalidate_interrupt(struct pt_regs *regs)
>        sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START;
>        f = &flush_state[sender];
>
> +       trace_special_irq_entry(INVALIDATE_TLB_VECTOR_START + sender);
>        if (!cpumask_test_cpu(cpu, to_cpumask(f->flush_cpumask)))
>                goto out;
>                /*
> @@ -167,6 +169,7 @@ out:
>        cpumask_clear_cpu(cpu, to_cpumask(f->flush_cpumask));
>        smp_mb__after_clear_bit();
>        inc_irq_stat(irq_tlb_count);
> +       trace_special_irq_exit(INVALIDATE_TLB_VECTOR_START + sender);
>  }
>
>  static void flush_tlb_others_ipi(const struct cpumask *cpumask,
> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> index 4b4c7d9..6c0af78 100644
> --- a/include/trace/events/irq.h
> +++ b/include/trace/events/irq.h
> @@ -139,6 +139,74 @@ DEFINE_EVENT(softirq, softirq_raise,
>        TP_ARGS(vec_nr)
>  );
>
> +#ifdef CONFIG_X86
> +#include <asm/irq_vectors.h>
> +
> +#define special_irq_name(sirq) { sirq, #sirq }
> +#define show_special_irq_name(val)                                     \
> +       __print_symbolic(val,                                           \
> +                        special_irq_name(NMI_VECTOR),                  \
> +                        special_irq_name(LOCAL_TIMER_VECTOR),          \
> +                        special_irq_name(ERROR_APIC_VECTOR),           \
> +                        special_irq_name(RESCHEDULE_VECTOR),           \
> +                        special_irq_name(CALL_FUNCTION_VECTOR),        \
> +                        special_irq_name(CALL_FUNCTION_SINGLE_VECTOR), \
> +                        special_irq_name(THERMAL_APIC_VECTOR),         \
> +                        special_irq_name(THRESHOLD_APIC_VECTOR),       \
> +                        special_irq_name(REBOOT_VECTOR),               \
> +                        special_irq_name(SPURIOUS_APIC_VECTOR),        \
> +                        special_irq_name(IRQ_WORK_VECTOR),             \
> +                        special_irq_name(X86_PLATFORM_IPI_VECTOR)      \
> +                       )
> +
> +#define IS_INVALIDATE_TLB_VECTOR(__irq) (\
> +               __irq >= INVALIDATE_TLB_VECTOR_START && \
> +               __irq <= INVALIDATE_TLB_VECTOR_END)
> +
> +DECLARE_EVENT_CLASS(special_irq,
> +
> +       TP_PROTO(int irq),
> +
> +       TP_ARGS(irq),
> +
> +       TP_STRUCT__entry(
> +               __field(        int,    irq     )
> +       ),
> +
> +       TP_fast_assign(
> +               __entry->irq = irq;
> +       ),
> +
> +       TP_printk("irq=%d name=%s tlb_vector=%d", __entry->irq,
> +               IS_INVALIDATE_TLB_VECTOR(__entry->irq) ?
> +               "INVALIDATE_TLB_VECTOR" : show_special_irq_name(__entry->irq),
> +               IS_INVALIDATE_TLB_VECTOR(__entry->irq) ?
> +               __entry->irq - INVALIDATE_TLB_VECTOR_START : -1)
> +);
> +
> +/*
> + * special_irq_entry - called before enterring a special interrupt
> + * such as apic timer, spurious interrupt, and etc.
> + */
> +DEFINE_EVENT(special_irq, special_irq_entry,
> +
> +       TP_PROTO(int irq),
> +
> +       TP_ARGS(irq)
> +);
> +
> +/*
> + * special_irq_exit - called immediately after the special interrupt
> + * handler returns
> + */
> +DEFINE_EVENT(special_irq, special_irq_exit,
> +
> +       TP_PROTO(int irq),
> +
> +       TP_ARGS(irq)
> +);
> +#endif /* CONFIG_X86 */
> +
>  #endif /*  _TRACE_IRQ_H */
>
>  /* This part must be outside protection */
> --
> 1.7.3.1
>
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] trace: Add special x86 irq entry/exit tracepoints
  2011-04-25 23:41 ` Vaibhav Nagarnaik
  2011-04-28 23:16   ` Vaibhav Nagarnaik
@ 2011-04-28 23:41   ` Steven Rostedt
  2011-04-29 20:12   ` [PATCH] trace: Add x86 irq vector " Vaibhav Nagarnaik
  2 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2011-04-28 23:41 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: Thomas Gleixner, Ingo Molnar, Michael Rubin, David Sharp,
	linux-kernel, x86, Jiaying Zhang

On Mon, 2011-04-25 at 16:41 -0700, Vaibhav Nagarnaik wrote:
> From: Jiaying Zhang <jiayingz@google.com>
> 
> Apart from the normal interrupts, we often also need to trace
> architecture special interrupts. The following patch adds the event
> definition and trace instrumentation for some x86 special interrupts. We
> hope the similar approach can also be used to trace special interrupts
> on other architectures. However, the patch only supports x86 at this
> point.

Again, this requires an acked-by from Ingo or Thomas.

-- Steve



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] trace: Add special x86 irq entry/exit tracepoints
  2011-04-22 22:56 [PATCH] trace: Add special x86 irq entry/exit tracepoints Vaibhav Nagarnaik
  2011-04-25 23:41 ` Vaibhav Nagarnaik
@ 2011-04-29  0:14 ` Thomas Gleixner
  2011-04-29 20:15   ` Vaibhav Nagarnaik
  1 sibling, 1 reply; 35+ messages in thread
From: Thomas Gleixner @ 2011-04-29  0:14 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: Steven Rostedt, Ingo Molnar, Michael Rubin, David Sharp,
	linux-kernel, x86, Jiaying Zhang

On Fri, 22 Apr 2011, Vaibhav Nagarnaik wrote:
>  #include <asm/perf_event.h>
>  #include <asm/x86_init.h>
> @@ -857,7 +858,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
>  	 */
>  	exit_idle();
>  	irq_enter();
> +	trace_special_irq_entry(LOCAL_TIMER_VECTOR);

Gah. trace_special_irq is the worst name you could come up with. It's
tracing a vector which is nothing special and nothing x86 specific.

>  #include <asm/vsyscall.h>
>  #include <asm/x86_init.h>
> @@ -26,6 +27,8 @@
>  volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
>  #endif
>  
> +static struct irqaction *irq0_action;
> +

Ouch! You need an extra pointer for that ? Moving the stupid irq0
struct up a few lines would do the trick as well, right ?

> --- a/include/trace/events/irq.h
> +++ b/include/trace/events/irq.h
> @@ -139,6 +139,74 @@ DEFINE_EVENT(softirq, softirq_raise,
>  	TP_ARGS(vec_nr)
>  );
>  
> +#ifdef CONFIG_X86
> +#include <asm/irq_vectors.h>
> +#define special_irq_name(sirq) { sirq, #sirq }
> +#define show_special_irq_name(val)					\
> +	__print_symbolic(val,						\
> +			 special_irq_name(NMI_VECTOR),			\
> +			 special_irq_name(LOCAL_TIMER_VECTOR),		\
> +			 special_irq_name(ERROR_APIC_VECTOR),		\
> +			 special_irq_name(RESCHEDULE_VECTOR),		\
> +			 special_irq_name(CALL_FUNCTION_VECTOR),	\
> +			 special_irq_name(CALL_FUNCTION_SINGLE_VECTOR),	\
> +			 special_irq_name(THERMAL_APIC_VECTOR),		\
> +			 special_irq_name(THRESHOLD_APIC_VECTOR),	\
> +			 special_irq_name(REBOOT_VECTOR),		\
> +			 special_irq_name(SPURIOUS_APIC_VECTOR),	\
> +			 special_irq_name(IRQ_WORK_VECTOR),		\
> +			 special_irq_name(X86_PLATFORM_IPI_VECTOR)	\
> +			)
> +
> +#define IS_INVALIDATE_TLB_VECTOR(__irq) (\
> +		__irq >= INVALIDATE_TLB_VECTOR_START && \
> +		__irq <= INVALIDATE_TLB_VECTOR_END)

No way, really.

Interrupt vectors are not x86 specific and they do no need any
architecture specific handling at all. 

The symbolic printout is a nice to have extra, but it can be solved by
providing an arch specific lookup table which does not require any of
that #ifdef X86 mess. If the table does not exist, then you simply can
print the vector number and be done with it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH] trace: Add x86 irq vector entry/exit tracepoints
  2011-04-25 23:41 ` Vaibhav Nagarnaik
  2011-04-28 23:16   ` Vaibhav Nagarnaik
  2011-04-28 23:41   ` Steven Rostedt
@ 2011-04-29 20:12   ` Vaibhav Nagarnaik
  2011-04-29 20:26     ` Thomas Gleixner
  2011-05-31 21:28     ` [PATCH v2] " Vaibhav Nagarnaik
  2 siblings, 2 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-04-29 20:12 UTC (permalink / raw)
  To: Steven Rostedt, Thomas Gleixner, Ingo Molnar
  Cc: Michael Rubin, David Sharp, linux-kernel, x86, Jiaying Zhang,
	Vaibhav Nagarnaik

From: Jiaying Zhang <jiayingz@google.com>

The current interrupt trace of irq_handler_entry and irq_handler_exit
give traces of when an interrupt is handled. They provide good data
about when the system is running in kernel space and how it affects the
currently running applications.

Apart from these, they are IRQ vectors which trigger the system into
kernel space. Tracing such events gives us the trace of where the system
is spending its time. We want to know which cores are handling
interrupts and how they are affecting other processes in the system.
Also, the trace provides information about when the cores are idle and
which interrupts are changing the state.

The following patch adds the event definition and trace instrumentation
for interrupt vectors. For x86, a lookup table is provided to print out
readable IRQ vector names. The template can be used to provide interrupt
vector lookup tables on other architectures.

With this patch, following interrupt vectors are logged by ftrace
(when enabled):

Interrupt as in /proc/interrupts        Name as it appears in ftrace log
NMI: Non-maskable interrupts           : NMI_VECTOR
LOC: Local timer interrupts            : LOCAL_TIMER_VECTOR
SPU: Spurious interrupts               : SPURIOUS_APIC_VECTOR
PMI: Performance monitoring interrupts : <not added>
IWI: IRQ work interrupts               : IRQ_WORK_VECTOR
RES: Rescheduling interrupts           : RESCHEDULE_VECTOR
CAL: Function call interrupts          : CALL_FUNCTION_VECTOR or
                                         CALL_FUNCTION_SINGLE_VECTOR
TLB: TLB shootdowns                    : INVALIDATE_TLB_VECTOR_START to
                                         INVALIDATE_TLB_VECTOR_END
TRM: Thermal event interrupts          : THERMAL_APIC_VECTOR
THR: Threshold APIC interrupts         : THRESHOLD_APIC_VECTOR
MCE: Machine check exceptions          : <not added>
MCP: Machine check polls               : <not added>
ERR:                                   : ERROR_APIC_VECTOR
MIS:                                   : <not added>
PLT: Platform interrupts               : X86_PLATFORM_IPI_VECTOR

$ sh ~/ftrace_test.sh
tracing...
  cpu0: overrun: 0
  cpu1: overrun: 0
  cpu2: overrun: 0
  cpu3: overrun: 0
IRQ ftrace record counts (ignoring numeric  irq=(0|1|4|5|7|9|10|24) )
      2 irq=242 name=INVALIDATE_TLB_VECTOR_2
      3 irq=252 name=CALL_FUNCTION_VECTOR
      6 irq=241 name=INVALIDATE_TLB_VECTOR_1
      6 irq=2 name=NMI_VECTOR
     22 irq=240 name=INVALIDATE_TLB_VECTOR_0
     40 irq=251 name=CALL_FUNCTION_SINGLE_VECTOR
    587 irq=253 name=RESCHEDULE_VECTOR
  15013 irq=239 name=LOCAL_TIMER_VECTOR
/proc/interrupts stats difference
      6 NMI (Non-maskable interrupts)
  14945 LOC (Local timer interrupts)
      6 PMI (Performance monitoring interrupts)
    578 RES (Rescheduling interrupts)
     42 CAL (Function call interrupts)
     30 TLB (TLB shootdowns)
      4 MCP (Machine check polls)

In above output:
NMI (Non-maskable interrupts) == NMI_VECTOR (irq 2)
LOC (Local timer interrupts) == LOCAL_TIMER_VECTOR (irq 239)
RES (Rescheduling interrupts) == RESCHEDULE_VECTOR (irq=253)
CAL (Function call interrupts) == CALL_FUNCTION_VECTOR (irq 252) +
                                  CALL_FUNCTION_SINGLE_VECTOR (irq 251)
TLB (TLB shootdowns) == INVALIDATE_TLB_VECTOR_0 (irq=240) +
                        INVALIDATE_TLB_VECTOR_1 (irq=241) +
                        INVALIDATE_TLB_VECTOR_2 (irq=242)

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
---
 arch/x86/include/asm/irq_vectors.h       |   50 ++++++++++++++++++++++++++++++
 arch/x86/kernel/apic/apic.c              |    7 ++++
 arch/x86/kernel/cpu/mcheck/therm_throt.c |    3 ++
 arch/x86/kernel/cpu/mcheck/threshold.c   |    3 ++
 arch/x86/kernel/irq.c                    |    4 ++-
 arch/x86/kernel/irq_work.c               |    4 ++
 arch/x86/kernel/smp.c                    |    7 ++++
 arch/x86/kernel/time.c                   |   16 ++++++---
 arch/x86/kernel/traps.c                  |    3 ++
 arch/x86/mm/tlb.c                        |    3 ++
 include/trace/events/irq.h               |   45 +++++++++++++++++++++++++++
 11 files changed, 138 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 6e976ee..81b83ae 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -182,4 +182,54 @@ static inline int invalid_vm86_irq(int irq)
 # define NR_IRQS			NR_IRQS_LEGACY
 #endif
 
+#define irq_vector_name(sirq) { sirq, #sirq }
+#define invalidate_tlb_vector_name(i) { INVALIDATE_TLB_VECTOR_END-31+i, \
+					"INVALIDATE_TLB_VECTOR" }
+
+#define irq_vector_name_table						\
+			 irq_vector_name(NMI_VECTOR),			\
+			 irq_vector_name(LOCAL_TIMER_VECTOR),		\
+			 irq_vector_name(ERROR_APIC_VECTOR),		\
+			 irq_vector_name(RESCHEDULE_VECTOR),		\
+			 irq_vector_name(CALL_FUNCTION_VECTOR),		\
+			 irq_vector_name(CALL_FUNCTION_SINGLE_VECTOR),	\
+			 irq_vector_name(THERMAL_APIC_VECTOR),		\
+			 irq_vector_name(THRESHOLD_APIC_VECTOR),	\
+			 irq_vector_name(REBOOT_VECTOR),		\
+			 irq_vector_name(SPURIOUS_APIC_VECTOR),		\
+			 irq_vector_name(IRQ_WORK_VECTOR),		\
+			 irq_vector_name(X86_PLATFORM_IPI_VECTOR),	\
+			 invalidate_tlb_vector_name(0),			\
+			 invalidate_tlb_vector_name(1),			\
+			 invalidate_tlb_vector_name(2),			\
+			 invalidate_tlb_vector_name(3),			\
+			 invalidate_tlb_vector_name(4),			\
+			 invalidate_tlb_vector_name(5),			\
+			 invalidate_tlb_vector_name(6),			\
+			 invalidate_tlb_vector_name(7),			\
+			 invalidate_tlb_vector_name(8),			\
+			 invalidate_tlb_vector_name(9),			\
+			 invalidate_tlb_vector_name(10),		\
+			 invalidate_tlb_vector_name(11),		\
+			 invalidate_tlb_vector_name(12),		\
+			 invalidate_tlb_vector_name(13),		\
+			 invalidate_tlb_vector_name(14),		\
+			 invalidate_tlb_vector_name(15),		\
+			 invalidate_tlb_vector_name(16),		\
+			 invalidate_tlb_vector_name(17),		\
+			 invalidate_tlb_vector_name(18),		\
+			 invalidate_tlb_vector_name(19),		\
+			 invalidate_tlb_vector_name(20),		\
+			 invalidate_tlb_vector_name(21),		\
+			 invalidate_tlb_vector_name(22),		\
+			 invalidate_tlb_vector_name(23),		\
+			 invalidate_tlb_vector_name(24),		\
+			 invalidate_tlb_vector_name(25),		\
+			 invalidate_tlb_vector_name(26),		\
+			 invalidate_tlb_vector_name(27),		\
+			 invalidate_tlb_vector_name(28),		\
+			 invalidate_tlb_vector_name(29),		\
+			 invalidate_tlb_vector_name(30),		\
+			 invalidate_tlb_vector_name(31)
+
 #endif /* _ASM_X86_IRQ_VECTORS_H */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index fabf01e..473bcab 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -33,6 +33,7 @@
 #include <linux/dmi.h>
 #include <linux/smp.h>
 #include <linux/mm.h>
+#include <trace/events/irq.h>
 
 #include <asm/perf_event.h>
 #include <asm/x86_init.h>
@@ -857,7 +858,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
 	 */
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(LOCAL_TIMER_VECTOR);
 	local_apic_timer_interrupt();
+	trace_irq_vector_exit(LOCAL_TIMER_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
@@ -1790,6 +1793,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(SPURIOUS_APIC_VECTOR);
 	/*
 	 * Check if this really is a spurious interrupt and ACK it
 	 * if it is a vectored one.  Just in case...
@@ -1804,6 +1808,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
 	pr_info("spurious APIC interrupt on CPU#%d, "
 		"should never happen.\n", smp_processor_id());
+	trace_irq_vector_exit(SPURIOUS_APIC_VECTOR);
 	irq_exit();
 }
 
@@ -1816,6 +1821,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(ERROR_APIC_VECTOR);
 	/* First tickle the hardware, only then report what went on. -- REW */
 	v = apic_read(APIC_ESR);
 	apic_write(APIC_ESR, 0);
@@ -1836,6 +1842,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 	 */
 	pr_debug("APIC error on CPU%d: %02x(%02x)\n",
 		smp_processor_id(), v , v1);
+	trace_irq_vector_exit(ERROR_APIC_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 6f8c5e9..e2d5afb 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/smp.h>
 #include <linux/cpu.h>
+#include <trace/events/irq.h>
 
 #include <asm/processor.h>
 #include <asm/system.h>
@@ -402,8 +403,10 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
 {
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(THERMAL_APIC_VECTOR);
 	inc_irq_stat(irq_thermal_count);
 	smp_thermal_vector();
+	trace_irq_vector_exit(THERMAL_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index d746df2..0a3655d 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -3,6 +3,7 @@
  */
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
+#include <trace/events/irq.h>
 
 #include <asm/irq_vectors.h>
 #include <asm/apic.h>
@@ -21,8 +22,10 @@ asmlinkage void smp_threshold_interrupt(void)
 {
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(THRESHOLD_APIC_VECTOR);
 	inc_irq_stat(irq_threshold_count);
 	mce_threshold_vector();
+	trace_irq_vector_exit(THRESHOLD_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 1cb0b9f..2b53049 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -8,6 +8,7 @@
 #include <linux/seq_file.h>
 #include <linux/smp.h>
 #include <linux/ftrace.h>
+#include <trace/events/irq.h>
 #include <linux/delay.h>
 
 #include <asm/apic.h>
@@ -211,12 +212,13 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
 	exit_idle();
 
 	irq_enter();
-
+	trace_irq_vector_entry(X86_PLATFORM_IPI_VECTOR);
 	inc_irq_stat(x86_platform_ipis);
 
 	if (x86_platform_ipi_callback)
 		x86_platform_ipi_callback();
 
+	trace_irq_vector_exit(X86_PLATFORM_IPI_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index ca8f703..a07e9f8 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -8,13 +8,17 @@
 #include <linux/irq_work.h>
 #include <linux/hardirq.h>
 #include <asm/apic.h>
+#include <asm/irq_vectors.h>
+#include <trace/events/irq.h>
 
 void smp_irq_work_interrupt(struct pt_regs *regs)
 {
 	irq_enter();
 	ack_APIC_irq();
+	trace_irq_vector_entry(IRQ_WORK_VECTOR);
 	inc_irq_stat(apic_irq_work_irqs);
 	irq_work_run();
+	trace_irq_vector_exit(IRQ_WORK_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 513deac..b5c193f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -22,6 +22,7 @@
 #include <linux/interrupt.h>
 #include <linux/cpu.h>
 #include <linux/gfp.h>
+#include <trace/events/irq.h>
 
 #include <asm/mtrr.h>
 #include <asm/tlbflush.h>
@@ -201,7 +202,9 @@ static void native_stop_other_cpus(int wait)
 void smp_reschedule_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
+	trace_irq_vector_entry(RESCHEDULE_VECTOR);
 	inc_irq_stat(irq_resched_count);
+	trace_irq_vector_exit(RESCHEDULE_VECTOR);
 	/*
 	 * KVM uses this interrupt to force a cpu out of guest mode
 	 */
@@ -211,8 +214,10 @@ void smp_call_function_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_irq_vector_entry(CALL_FUNCTION_VECTOR);
 	generic_smp_call_function_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_irq_vector_exit(CALL_FUNCTION_VECTOR);
 	irq_exit();
 }
 
@@ -220,8 +225,10 @@ void smp_call_function_single_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_irq_vector_entry(CALL_FUNCTION_SINGLE_VECTOR);
 	generic_smp_call_function_single_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_irq_vector_exit(CALL_FUNCTION_SINGLE_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 25a28a2..5195302 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -13,6 +13,7 @@
 #include <linux/interrupt.h>
 #include <linux/time.h>
 #include <linux/mca.h>
+#include <trace/events/irq.h>
 
 #include <asm/vsyscall.h>
 #include <asm/x86_init.h>
@@ -51,6 +52,13 @@ unsigned long profile_pc(struct pt_regs *regs)
 }
 EXPORT_SYMBOL(profile_pc);
 
+static irqreturn_t timer_interrupt(int irq, void *dev_id);
+static struct irqaction irq0  = {
+	.handler = timer_interrupt,
+	.flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
+	.name = "timer"
+};
+
 /*
  * Default timer interrupt handler for PIT/HPET
  */
@@ -59,7 +67,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	/* Keep nmi watchdog up to date */
 	inc_irq_stat(irq0_irqs);
 
+	trace_irq_handler_entry(0, &irq0);
 	global_clock_event->event_handler(global_clock_event);
+	trace_irq_handler_exit(0, &irq0, 1);
 
 	/* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
 	if (MCA_bus)
@@ -68,12 +78,6 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static struct irqaction irq0  = {
-	.handler = timer_interrupt,
-	.flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
-	.name = "timer"
-};
-
 void __init setup_default_timer_irq(void)
 {
 	setup_irq(0, &irq0);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..fbfc162 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -31,6 +31,7 @@
 #include <linux/mm.h>
 #include <linux/smp.h>
 #include <linux/io.h>
+#include <trace/events/irq.h>
 
 #ifdef CONFIG_EISA
 #include <linux/ioport.h>
@@ -434,12 +435,14 @@ dotraplinkage notrace __kprobes void
 do_nmi(struct pt_regs *regs, long error_code)
 {
 	nmi_enter();
+	trace_irq_vector_entry(NMI_VECTOR);
 
 	inc_irq_stat(__nmi_count);
 
 	if (!ignore_nmis)
 		default_do_nmi(regs);
 
+	trace_irq_vector_exit(NMI_VECTOR);
 	nmi_exit();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index d6c0418..b45ad76 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -5,6 +5,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <trace/events/irq.h>
 #include <linux/cpu.h>
 
 #include <asm/tlbflush.h>
@@ -141,6 +142,7 @@ void smp_invalidate_interrupt(struct pt_regs *regs)
 	sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START;
 	f = &flush_state[sender];
 
+	trace_irq_vector_entry(INVALIDATE_TLB_VECTOR_START + sender);
 	if (!cpumask_test_cpu(cpu, to_cpumask(f->flush_cpumask)))
 		goto out;
 		/*
@@ -167,6 +169,7 @@ out:
 	cpumask_clear_cpu(cpu, to_cpumask(f->flush_cpumask));
 	smp_mb__after_clear_bit();
 	inc_irq_stat(irq_tlb_count);
+	trace_irq_vector_exit(INVALIDATE_TLB_VECTOR_START + sender);
 }
 
 static void flush_tlb_others_ipi(const struct cpumask *cpumask,
diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
index 4b4c7d9..b70383e 100644
--- a/include/trace/events/irq.h
+++ b/include/trace/events/irq.h
@@ -139,6 +139,51 @@ DEFINE_EVENT(softirq, softirq_raise,
 	TP_ARGS(vec_nr)
 );
 
+#include <asm/irq_vectors.h>
+
+#ifndef irq_vector_name_table
+#define irq_vector_name_table { -1, NULL }
+#endif
+
+DECLARE_EVENT_CLASS(irq_vector,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq),
+
+	TP_STRUCT__entry(
+		__field(	int,	irq	)
+	),
+
+	TP_fast_assign(
+		__entry->irq = irq;
+	),
+
+	TP_printk("irq=%d name=%s", __entry->irq,
+		__print_symbolic(__entry->irq, irq_vector_name_table))
+);
+
+/*
+ * irq_vector_entry - called before enterring a interrupt vector handler
+ */
+DEFINE_EVENT(irq_vector, irq_vector_entry,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
+/*
+ * irq_vector_exit - called immediately after the interrupt vector
+ * handler returns
+ */
+DEFINE_EVENT(irq_vector, irq_vector_exit,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
 #endif /*  _TRACE_IRQ_H */
 
 /* This part must be outside protection */
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] trace: Add special x86 irq entry/exit tracepoints
  2011-04-29  0:14 ` [PATCH] trace: Add special x86 irq " Thomas Gleixner
@ 2011-04-29 20:15   ` Vaibhav Nagarnaik
  0 siblings, 0 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-04-29 20:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, Ingo Molnar, Michael Rubin, David Sharp,
	linux-kernel, x86, Jiaying Zhang

On Thu, Apr 28, 2011 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 22 Apr 2011, Vaibhav Nagarnaik wrote:
>>  #include <asm/perf_event.h>
>>  #include <asm/x86_init.h>
>> @@ -857,7 +858,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
>>        */
>>       exit_idle();
>>       irq_enter();
>> +     trace_special_irq_entry(LOCAL_TIMER_VECTOR);
>
> Gah. trace_special_irq is the worst name you could come up with. It's
> tracing a vector which is nothing special and nothing x86 specific.
>
I will change it to trace_irq_vector_{entry|exit}.

>>  #include <asm/vsyscall.h>
>>  #include <asm/x86_init.h>
>> @@ -26,6 +27,8 @@
>>  volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
>>  #endif
>>
>> +static struct irqaction *irq0_action;
>> +
>
> Ouch! You need an extra pointer for that ? Moving the stupid irq0
> struct up a few lines would do the trick as well, right ?
>
Sure. I will update it.

>> --- a/include/trace/events/irq.h
>> +++ b/include/trace/events/irq.h
>> @@ -139,6 +139,74 @@ DEFINE_EVENT(softirq, softirq_raise,
>>       TP_ARGS(vec_nr)
>>  );
>>
>> +#ifdef CONFIG_X86
>> +#include <asm/irq_vectors.h>
>> +#define special_irq_name(sirq) { sirq, #sirq }
>> +#define show_special_irq_name(val)                                   \
>> +     __print_symbolic(val,                                           \
>> +                      special_irq_name(NMI_VECTOR),                  \
>> +                      special_irq_name(LOCAL_TIMER_VECTOR),          \
>> +                      special_irq_name(ERROR_APIC_VECTOR),           \
>> +                      special_irq_name(RESCHEDULE_VECTOR),           \
>> +                      special_irq_name(CALL_FUNCTION_VECTOR),        \
>> +                      special_irq_name(CALL_FUNCTION_SINGLE_VECTOR), \
>> +                      special_irq_name(THERMAL_APIC_VECTOR),         \
>> +                      special_irq_name(THRESHOLD_APIC_VECTOR),       \
>> +                      special_irq_name(REBOOT_VECTOR),               \
>> +                      special_irq_name(SPURIOUS_APIC_VECTOR),        \
>> +                      special_irq_name(IRQ_WORK_VECTOR),             \
>> +                      special_irq_name(X86_PLATFORM_IPI_VECTOR)      \
>> +                     )
>> +
>> +#define IS_INVALIDATE_TLB_VECTOR(__irq) (\
>> +             __irq >= INVALIDATE_TLB_VECTOR_START && \
>> +             __irq <= INVALIDATE_TLB_VECTOR_END)
>
> No way, really.
>
> Interrupt vectors are not x86 specific and they do no need any
> architecture specific handling at all.
>
> The symbolic printout is a nice to have extra, but it can be solved by
> providing an arch specific lookup table which does not require any of
> that #ifdef X86 mess. If the table does not exist, then you simply can
> print the vector number and be done with it.
>
I have changed that and now just include the asm/irq_vectors.h so other
architectures can provide the look up table definitions in their respective
files.

I have also updated the description to address the use case. We can know when
the system is moving between kernel and user spaces. How much time is being
spent for operations and how it affects the running processes.

> Thanks,
>
>        tglx
>


Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] trace: Add x86 irq vector entry/exit tracepoints
  2011-04-29 20:12   ` [PATCH] trace: Add x86 irq vector " Vaibhav Nagarnaik
@ 2011-04-29 20:26     ` Thomas Gleixner
  2011-04-29 22:04       ` Vaibhav Nagarnaik
  2011-05-31 21:28     ` [PATCH v2] " Vaibhav Nagarnaik
  1 sibling, 1 reply; 35+ messages in thread
From: Thomas Gleixner @ 2011-04-29 20:26 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: Steven Rostedt, Ingo Molnar, Michael Rubin, David Sharp,
	linux-kernel, x86, Jiaying Zhang

On Fri, 29 Apr 2011, Vaibhav Nagarnaik wrote:

> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> index 4b4c7d9..b70383e 100644
> --- a/include/trace/events/irq.h
> +++ b/include/trace/events/irq.h
> @@ -139,6 +139,51 @@ DEFINE_EVENT(softirq, softirq_raise,
>  	TP_ARGS(vec_nr)
>  );
>  
> +#include <asm/irq_vectors.h>

And how is that supposed to compile on anything else than arch/x86 and
arch/um ? And no, we don't want an #ifdef x86 for that.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] trace: Add x86 irq vector entry/exit tracepoints
  2011-04-29 20:26     ` Thomas Gleixner
@ 2011-04-29 22:04       ` Vaibhav Nagarnaik
  0 siblings, 0 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-04-29 22:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, Ingo Molnar, Michael Rubin, David Sharp,
	linux-kernel, x86, Jiaying Zhang

On Fri, Apr 29, 2011 at 1:26 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 29 Apr 2011, Vaibhav Nagarnaik wrote:
>
>> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
>> index 4b4c7d9..b70383e 100644
>> --- a/include/trace/events/irq.h
>> +++ b/include/trace/events/irq.h
>> @@ -139,6 +139,51 @@ DEFINE_EVENT(softirq, softirq_raise,
>>       TP_ARGS(vec_nr)
>>  );
>>
>> +#include <asm/irq_vectors.h>
>
> And how is that supposed to compile on anything else than arch/x86 and
> arch/um ? And no, we don't want an #ifdef x86 for that.
>
You are right. I didn't think of other platforms.

If I include <asm/irq.h>, it should compile on all the platforms. What do you
think?

The other way is to define the irq_vectors as a separate trace system which
only gets initialized if the platform files create the tracepoints for it. The
only difference I see is that the tracepoints would move from under irq/ to
irq_vectors/.

> Thanks,
>
>        tglx
>


Thanks for your quick responses.

Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-04-29 20:12   ` [PATCH] trace: Add x86 irq vector " Vaibhav Nagarnaik
  2011-04-29 20:26     ` Thomas Gleixner
@ 2011-05-31 21:28     ` Vaibhav Nagarnaik
  2011-06-01  0:00       ` Frederic Weisbecker
  2011-06-21 18:45       ` [PATCH v3] " Vaibhav Nagarnaik
  1 sibling, 2 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-05-31 21:28 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Steven Rostedt
  Cc: Michael Rubin, David Sharp, x86, linux-kernel, Jiaying Zhang,
	Vaibhav Nagarnaik

From: Jiaying Zhang <jiayingz@google.com>

The current interrupt trace of irq_handler_entry and irq_handler_exit
give traces of when an interrupt is handled. They provide good data
about when the system is running in kernel space and how it affects the
currently running applications.

Apart from this, they are IRQ vectors which trigger the system into
kernel space. Tracing such events gives us the trace of where the system
is spending its time. We want to know which cores are handling
interrupts and how they are affecting other processes in the system.
Also, the trace provides information about when the cores are idle and
which interrupts are changing that state.

The following patch adds the event definition and trace instrumentation
for interrupt vectors. For x86, a lookup table is provided to print out
readable IRQ vector names. The template can be used to provide interrupt
vector lookup tables on other architectures.

With this patch, following interrupt vectors are logged by ftrace
(when enabled):

Interrupt as in /proc/interrupts        Name as it appears in ftrace log
NMI: Non-maskable interrupts           : NMI_VECTOR
LOC: Local timer interrupts            : LOCAL_TIMER_VECTOR
SPU: Spurious interrupts               : SPURIOUS_APIC_VECTOR
PMI: Performance monitoring interrupts : <not added>
IWI: IRQ work interrupts               : IRQ_WORK_VECTOR
RES: Rescheduling interrupts           : RESCHEDULE_VECTOR
CAL: Function call interrupts          : CALL_FUNCTION_VECTOR or
                                         CALL_FUNCTION_SINGLE_VECTOR
TLB: TLB shootdowns                    : INVALIDATE_TLB_VECTOR_START to
                                         INVALIDATE_TLB_VECTOR_END
TRM: Thermal event interrupts          : THERMAL_APIC_VECTOR
THR: Threshold APIC interrupts         : THRESHOLD_APIC_VECTOR
MCE: Machine check exceptions          : <not added>
MCP: Machine check polls               : <not added>
ERR:                                   : ERROR_APIC_VECTOR
MIS:                                   : <not added>
PLT: Platform interrupts               : X86_PLATFORM_IPI_VECTOR

$ sh ~/ftrace_test.sh
tracing...
  cpu0: overrun: 0
  cpu1: overrun: 0
  cpu2: overrun: 0
  cpu3: overrun: 0
IRQ ftrace record counts (ignoring numeric  irq=(0|1|4|5|7|9|10|24) )
      2 irq=242 name=INVALIDATE_TLB_VECTOR_2
      3 irq=252 name=CALL_FUNCTION_VECTOR
      6 irq=241 name=INVALIDATE_TLB_VECTOR_1
      6 irq=2 name=NMI_VECTOR
     22 irq=240 name=INVALIDATE_TLB_VECTOR_0
     40 irq=251 name=CALL_FUNCTION_SINGLE_VECTOR
    587 irq=253 name=RESCHEDULE_VECTOR
  15013 irq=239 name=LOCAL_TIMER_VECTOR
/proc/interrupts stats difference
      6 NMI (Non-maskable interrupts)
  14945 LOC (Local timer interrupts)
      6 PMI (Performance monitoring interrupts)
    578 RES (Rescheduling interrupts)
     42 CAL (Function call interrupts)
     30 TLB (TLB shootdowns)
      4 MCP (Machine check polls)

In above output:
NMI (Non-maskable interrupts) == NMI_VECTOR (irq 2)
LOC (Local timer interrupts) == LOCAL_TIMER_VECTOR (irq 239)
RES (Rescheduling interrupts) == RESCHEDULE_VECTOR (irq=253)
CAL (Function call interrupts) == CALL_FUNCTION_VECTOR (irq 252) +
                                  CALL_FUNCTION_SINGLE_VECTOR (irq 251)
TLB (TLB shootdowns) == INVALIDATE_TLB_VECTOR_0 (irq=240) +
                        INVALIDATE_TLB_VECTOR_1 (irq=241) +
                        INVALIDATE_TLB_VECTOR_2 (irq=242)

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
---
Changelog:
v1-v2
* Move the irq_vector_{entry|exit} tracepoints under irq_vectors/ sub
  folder
* Rebased to 3.0.0

 arch/x86/include/asm/irq_vectors.h       |   50 ++++++++++++++++++++++++++
 arch/x86/kernel/apic/apic.c              |    7 ++++
 arch/x86/kernel/cpu/mcheck/therm_throt.c |    3 ++
 arch/x86/kernel/cpu/mcheck/threshold.c   |    3 ++
 arch/x86/kernel/irq.c                    |    7 +++-
 arch/x86/kernel/irq_work.c               |    3 ++
 arch/x86/kernel/smp.c                    |    7 ++++
 arch/x86/kernel/time.c                   |   16 +++++---
 arch/x86/kernel/traps.c                  |    3 ++
 arch/x86/mm/tlb.c                        |    3 ++
 include/trace/events/irq_vectors.h       |   56 ++++++++++++++++++++++++++++++
 11 files changed, 151 insertions(+), 7 deletions(-)
 create mode 100644 include/trace/events/irq_vectors.h

diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 6e976ee..af06349 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -182,4 +182,54 @@ static inline int invalid_vm86_irq(int irq)
 # define NR_IRQS			NR_IRQS_LEGACY
 #endif
 
+#define irq_vector_name(sirq) { sirq, #sirq }
+#define invalidate_tlb_vector_name(i) { INVALIDATE_TLB_VECTOR_END-31+i, \
+					"INVALIDATE_TLB_VECTOR" }
+
+#define irq_vector_name_table						\
+			irq_vector_name(NMI_VECTOR),			\
+			irq_vector_name(LOCAL_TIMER_VECTOR),		\
+			irq_vector_name(ERROR_APIC_VECTOR),		\
+			irq_vector_name(RESCHEDULE_VECTOR),		\
+			irq_vector_name(CALL_FUNCTION_VECTOR),		\
+			irq_vector_name(CALL_FUNCTION_SINGLE_VECTOR),	\
+			irq_vector_name(THERMAL_APIC_VECTOR),		\
+			irq_vector_name(THRESHOLD_APIC_VECTOR),		\
+			irq_vector_name(REBOOT_VECTOR),			\
+			irq_vector_name(SPURIOUS_APIC_VECTOR),		\
+			irq_vector_name(IRQ_WORK_VECTOR),		\
+			irq_vector_name(X86_PLATFORM_IPI_VECTOR),	\
+			invalidate_tlb_vector_name(0),			\
+			invalidate_tlb_vector_name(1),			\
+			invalidate_tlb_vector_name(2),			\
+			invalidate_tlb_vector_name(3),			\
+			invalidate_tlb_vector_name(4),			\
+			invalidate_tlb_vector_name(5),			\
+			invalidate_tlb_vector_name(6),			\
+			invalidate_tlb_vector_name(7),			\
+			invalidate_tlb_vector_name(8),			\
+			invalidate_tlb_vector_name(9),			\
+			invalidate_tlb_vector_name(10),			\
+			invalidate_tlb_vector_name(11),			\
+			invalidate_tlb_vector_name(12),			\
+			invalidate_tlb_vector_name(13),			\
+			invalidate_tlb_vector_name(14),			\
+			invalidate_tlb_vector_name(15),			\
+			invalidate_tlb_vector_name(16),			\
+			invalidate_tlb_vector_name(17),			\
+			invalidate_tlb_vector_name(18),			\
+			invalidate_tlb_vector_name(19),			\
+			invalidate_tlb_vector_name(20),			\
+			invalidate_tlb_vector_name(21),			\
+			invalidate_tlb_vector_name(22),			\
+			invalidate_tlb_vector_name(23),			\
+			invalidate_tlb_vector_name(24),			\
+			invalidate_tlb_vector_name(25),			\
+			invalidate_tlb_vector_name(26),			\
+			invalidate_tlb_vector_name(27),			\
+			invalidate_tlb_vector_name(28),			\
+			invalidate_tlb_vector_name(29),			\
+			invalidate_tlb_vector_name(30),			\
+			invalidate_tlb_vector_name(31)
+
 #endif /* _ASM_X86_IRQ_VECTORS_H */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index b961af8..f09f54c 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -33,6 +33,7 @@
 #include <linux/dmi.h>
 #include <linux/smp.h>
 #include <linux/mm.h>
+#include <trace/events/irq_vectors.h>
 
 #include <asm/perf_event.h>
 #include <asm/x86_init.h>
@@ -857,7 +858,9 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
 	 */
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(LOCAL_TIMER_VECTOR);
 	local_apic_timer_interrupt();
+	trace_irq_vector_exit(LOCAL_TIMER_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
@@ -1791,6 +1794,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(SPURIOUS_APIC_VECTOR);
 	/*
 	 * Check if this really is a spurious interrupt and ACK it
 	 * if it is a vectored one.  Just in case...
@@ -1805,6 +1809,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
 	pr_info("spurious APIC interrupt on CPU#%d, "
 		"should never happen.\n", smp_processor_id());
+	trace_irq_vector_exit(SPURIOUS_APIC_VECTOR);
 	irq_exit();
 }
 
@@ -1828,6 +1833,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(ERROR_APIC_VECTOR);
 	/* First tickle the hardware, only then report what went on. -- REW */
 	v0 = apic_read(APIC_ESR);
 	apic_write(APIC_ESR, 0);
@@ -1848,6 +1854,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 
 	apic_printk(APIC_DEBUG, KERN_CONT "\n");
 
+	trace_irq_vector_exit(ERROR_APIC_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 27c6251..bd8fa96 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/smp.h>
 #include <linux/cpu.h>
+#include <trace/events/irq_vectors.h>
 
 #include <asm/processor.h>
 #include <asm/system.h>
@@ -398,8 +399,10 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
 {
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(THERMAL_APIC_VECTOR);
 	inc_irq_stat(irq_thermal_count);
 	smp_thermal_vector();
+	trace_irq_vector_exit(THERMAL_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index d746df2..ffde17b 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -3,6 +3,7 @@
  */
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
+#include <trace/events/irq_vectors.h>
 
 #include <asm/irq_vectors.h>
 #include <asm/apic.h>
@@ -21,8 +22,10 @@ asmlinkage void smp_threshold_interrupt(void)
 {
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(THRESHOLD_APIC_VECTOR);
 	inc_irq_stat(irq_threshold_count);
 	mce_threshold_vector();
+	trace_irq_vector_exit(THRESHOLD_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 6c0802e..1fb134e 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -17,6 +17,10 @@
 #include <asm/mce.h>
 #include <asm/hw_irq.h>
 
+#include <asm/irq_vectors.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/irq_vectors.h>
+
 atomic_t irq_err_count;
 
 /* Function pointer for generic interrupt vector handling */
@@ -211,12 +215,13 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
 	exit_idle();
 
 	irq_enter();
-
+	trace_irq_vector_entry(X86_PLATFORM_IPI_VECTOR);
 	inc_irq_stat(x86_platform_ipis);
 
 	if (x86_platform_ipi_callback)
 		x86_platform_ipi_callback();
 
+	trace_irq_vector_exit(X86_PLATFORM_IPI_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index ca8f703..107754c 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -8,13 +8,16 @@
 #include <linux/irq_work.h>
 #include <linux/hardirq.h>
 #include <asm/apic.h>
+#include <trace/events/irq_vectors.h>
 
 void smp_irq_work_interrupt(struct pt_regs *regs)
 {
 	irq_enter();
 	ack_APIC_irq();
+	trace_irq_vector_entry(IRQ_WORK_VECTOR);
 	inc_irq_stat(apic_irq_work_irqs);
 	irq_work_run();
+	trace_irq_vector_exit(IRQ_WORK_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 013e7eb..1b0ac99 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -22,6 +22,7 @@
 #include <linux/interrupt.h>
 #include <linux/cpu.h>
 #include <linux/gfp.h>
+#include <trace/events/irq_vectors.h>
 
 #include <asm/mtrr.h>
 #include <asm/tlbflush.h>
@@ -199,8 +200,10 @@ static void native_stop_other_cpus(int wait)
 void smp_reschedule_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
+	trace_irq_vector_entry(RESCHEDULE_VECTOR);
 	inc_irq_stat(irq_resched_count);
 	scheduler_ipi();
+	trace_irq_vector_exit(RESCHEDULE_VECTOR);
 	/*
 	 * KVM uses this interrupt to force a cpu out of guest mode
 	 */
@@ -210,8 +213,10 @@ void smp_call_function_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_irq_vector_entry(CALL_FUNCTION_VECTOR);
 	generic_smp_call_function_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_irq_vector_exit(CALL_FUNCTION_VECTOR);
 	irq_exit();
 }
 
@@ -219,8 +224,10 @@ void smp_call_function_single_interrupt(struct pt_regs *regs)
 {
 	ack_APIC_irq();
 	irq_enter();
+	trace_irq_vector_entry(CALL_FUNCTION_SINGLE_VECTOR);
 	generic_smp_call_function_single_interrupt();
 	inc_irq_stat(irq_call_count);
+	trace_irq_vector_exit(CALL_FUNCTION_SINGLE_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 00cbb27..799c781 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -13,6 +13,7 @@
 #include <linux/interrupt.h>
 #include <linux/time.h>
 #include <linux/mca.h>
+#include <trace/events/irq.h>
 
 #include <asm/vsyscall.h>
 #include <asm/x86_init.h>
@@ -51,6 +52,13 @@ unsigned long profile_pc(struct pt_regs *regs)
 }
 EXPORT_SYMBOL(profile_pc);
 
+static irqreturn_t timer_interrupt(int irq, void *dev_id);
+static struct irqaction irq0  = {
+	.handler = timer_interrupt,
+	.flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
+	.name = "timer"
+};
+
 /*
  * Default timer interrupt handler for PIT/HPET
  */
@@ -59,7 +67,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	/* Keep nmi watchdog up to date */
 	inc_irq_stat(irq0_irqs);
 
+	trace_irq_handler_entry(0, &irq0);
 	global_clock_event->event_handler(global_clock_event);
+	trace_irq_handler_exit(0, &irq0, 1);
 
 	/* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
 	if (MCA_bus)
@@ -68,12 +78,6 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static struct irqaction irq0  = {
-	.handler = timer_interrupt,
-	.flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
-	.name = "timer"
-};
-
 void __init setup_default_timer_irq(void)
 {
 	setup_irq(0, &irq0);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..1725502 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -31,6 +31,7 @@
 #include <linux/mm.h>
 #include <linux/smp.h>
 #include <linux/io.h>
+#include <trace/events/irq_vectors.h>
 
 #ifdef CONFIG_EISA
 #include <linux/ioport.h>
@@ -434,12 +435,14 @@ dotraplinkage notrace __kprobes void
 do_nmi(struct pt_regs *regs, long error_code)
 {
 	nmi_enter();
+	trace_irq_vector_entry(NMI_VECTOR);
 
 	inc_irq_stat(__nmi_count);
 
 	if (!ignore_nmis)
 		default_do_nmi(regs);
 
+	trace_irq_vector_exit(NMI_VECTOR);
 	nmi_exit();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index d6c0418..bf9475d 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -5,6 +5,7 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
+#include <trace/events/irq_vectors.h>
 #include <linux/cpu.h>
 
 #include <asm/tlbflush.h>
@@ -141,6 +142,7 @@ void smp_invalidate_interrupt(struct pt_regs *regs)
 	sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START;
 	f = &flush_state[sender];
 
+	trace_irq_vector_entry(INVALIDATE_TLB_VECTOR_START + sender);
 	if (!cpumask_test_cpu(cpu, to_cpumask(f->flush_cpumask)))
 		goto out;
 		/*
@@ -167,6 +169,7 @@ out:
 	cpumask_clear_cpu(cpu, to_cpumask(f->flush_cpumask));
 	smp_mb__after_clear_bit();
 	inc_irq_stat(irq_tlb_count);
+	trace_irq_vector_exit(INVALIDATE_TLB_VECTOR_START + sender);
 }
 
 static void flush_tlb_others_ipi(const struct cpumask *cpumask,
diff --git a/include/trace/events/irq_vectors.h b/include/trace/events/irq_vectors.h
new file mode 100644
index 0000000..a1b94ff
--- /dev/null
+++ b/include/trace/events/irq_vectors.h
@@ -0,0 +1,56 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM irq_vectors
+
+#if !defined(_TRACE_IRQ_VECTORS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_IRQ_VECTORS_H
+
+#include <linux/tracepoint.h>
+
+#ifndef irq_vector_name_table
+#warn "irq_vector_name_table not defined!"
+#define irq_vector_name_table { -1, NULL }
+#endif
+
+DECLARE_EVENT_CLASS(irq_vector,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq),
+
+	TP_STRUCT__entry(
+		__field(	int,	irq	)
+	),
+
+	TP_fast_assign(
+		__entry->irq = irq;
+	),
+
+	TP_printk("irq=%d name=%s", __entry->irq,
+		__print_symbolic(__entry->irq, irq_vector_name_table))
+);
+
+/*
+ * irq_vector_entry - called before enterring a interrupt vector handler
+ */
+DEFINE_EVENT(irq_vector, irq_vector_entry,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
+/*
+ * irq_vector_exit - called immediately after the interrupt vector
+ * handler returns
+ */
+DEFINE_EVENT(irq_vector, irq_vector_exit,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
+#endif /*  _TRACE_IRQ_VECTORS_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-05-31 21:28     ` [PATCH v2] " Vaibhav Nagarnaik
@ 2011-06-01  0:00       ` Frederic Weisbecker
  2011-06-01 22:38         ` Vaibhav Nagarnaik
  2011-06-21 18:45       ` [PATCH v3] " Vaibhav Nagarnaik
  1 sibling, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2011-06-01  0:00 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: Thomas Gleixner, Ingo Molnar, Steven Rostedt, Michael Rubin,
	David Sharp, x86, linux-kernel, Jiaying Zhang

On Tue, May 31, 2011 at 02:28:18PM -0700, Vaibhav Nagarnaik wrote:
> From: Jiaying Zhang <jiayingz@google.com>
> 
> The current interrupt trace of irq_handler_entry and irq_handler_exit
> give traces of when an interrupt is handled. They provide good data
> about when the system is running in kernel space and how it affects the
> currently running applications.
> 
> Apart from this, they are IRQ vectors which trigger the system into
> kernel space. Tracing such events gives us the trace of where the system
> is spending its time. We want to know which cores are handling
> interrupts and how they are affecting other processes in the system.
> Also, the trace provides information about when the cores are idle and
> which interrupts are changing that state.
> 
> The following patch adds the event definition and trace instrumentation
> for interrupt vectors. For x86, a lookup table is provided to print out
> readable IRQ vector names. The template can be used to provide interrupt
> vector lookup tables on other architectures.
> 
> With this patch, following interrupt vectors are logged by ftrace
> (when enabled):
> 
> Interrupt as in /proc/interrupts        Name as it appears in ftrace log
> NMI: Non-maskable interrupts           : NMI_VECTOR
> LOC: Local timer interrupts            : LOCAL_TIMER_VECTOR
> SPU: Spurious interrupts               : SPURIOUS_APIC_VECTOR
> PMI: Performance monitoring interrupts : <not added>
> IWI: IRQ work interrupts               : IRQ_WORK_VECTOR
> RES: Rescheduling interrupts           : RESCHEDULE_VECTOR
> CAL: Function call interrupts          : CALL_FUNCTION_VECTOR or
>                                          CALL_FUNCTION_SINGLE_VECTOR
> TLB: TLB shootdowns                    : INVALIDATE_TLB_VECTOR_START to
>                                          INVALIDATE_TLB_VECTOR_END
> TRM: Thermal event interrupts          : THERMAL_APIC_VECTOR
> THR: Threshold APIC interrupts         : THRESHOLD_APIC_VECTOR
> MCE: Machine check exceptions          : <not added>
> MCP: Machine check polls               : <not added>
> ERR:                                   : ERROR_APIC_VECTOR
> MIS:                                   : <not added>
> PLT: Platform interrupts               : X86_PLATFORM_IPI_VECTOR
> 
> $ sh ~/ftrace_test.sh
> tracing...
>   cpu0: overrun: 0
>   cpu1: overrun: 0
>   cpu2: overrun: 0
>   cpu3: overrun: 0
> IRQ ftrace record counts (ignoring numeric  irq=(0|1|4|5|7|9|10|24) )
>       2 irq=242 name=INVALIDATE_TLB_VECTOR_2
>       3 irq=252 name=CALL_FUNCTION_VECTOR
>       6 irq=241 name=INVALIDATE_TLB_VECTOR_1
>       6 irq=2 name=NMI_VECTOR
>      22 irq=240 name=INVALIDATE_TLB_VECTOR_0
>      40 irq=251 name=CALL_FUNCTION_SINGLE_VECTOR
>     587 irq=253 name=RESCHEDULE_VECTOR
>   15013 irq=239 name=LOCAL_TIMER_VECTOR
> /proc/interrupts stats difference
>       6 NMI (Non-maskable interrupts)
>   14945 LOC (Local timer interrupts)
>       6 PMI (Performance monitoring interrupts)
>     578 RES (Rescheduling interrupts)
>      42 CAL (Function call interrupts)
>      30 TLB (TLB shootdowns)
>       4 MCP (Machine check polls)
> 
> In above output:
> NMI (Non-maskable interrupts) == NMI_VECTOR (irq 2)
> LOC (Local timer interrupts) == LOCAL_TIMER_VECTOR (irq 239)
> RES (Rescheduling interrupts) == RESCHEDULE_VECTOR (irq=253)
> CAL (Function call interrupts) == CALL_FUNCTION_VECTOR (irq 252) +
>                                   CALL_FUNCTION_SINGLE_VECTOR (irq 251)
> TLB (TLB shootdowns) == INVALIDATE_TLB_VECTOR_0 (irq=240) +
>                         INVALIDATE_TLB_VECTOR_1 (irq=241) +
>                         INVALIDATE_TLB_VECTOR_2 (irq=242)
> 
> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
> ---
> Changelog:
> v1-v2
> * Move the irq_vector_{entry|exit} tracepoints under irq_vectors/ sub
>   folder
> * Rebased to 3.0.0
> 
>  arch/x86/include/asm/irq_vectors.h       |   50 ++++++++++++++++++++++++++
>  arch/x86/kernel/apic/apic.c              |    7 ++++
>  arch/x86/kernel/cpu/mcheck/therm_throt.c |    3 ++
>  arch/x86/kernel/cpu/mcheck/threshold.c   |    3 ++
>  arch/x86/kernel/irq.c                    |    7 +++-
>  arch/x86/kernel/irq_work.c               |    3 ++
>  arch/x86/kernel/smp.c                    |    7 ++++
>  arch/x86/kernel/time.c                   |   16 +++++---
>  arch/x86/kernel/traps.c                  |    3 ++
>  arch/x86/mm/tlb.c                        |    3 ++
>  include/trace/events/irq_vectors.h       |   56 ++++++++++++++++++++++++++++++
>  11 files changed, 151 insertions(+), 7 deletions(-)
>  create mode 100644 include/trace/events/irq_vectors.h
> 
> diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
> index 6e976ee..af06349 100644
> --- a/arch/x86/include/asm/irq_vectors.h
> +++ b/arch/x86/include/asm/irq_vectors.h
> @@ -182,4 +182,54 @@ static inline int invalid_vm86_irq(int irq)
>  # define NR_IRQS			NR_IRQS_LEGACY
>  #endif
>  
> +#define irq_vector_name(sirq) { sirq, #sirq }
> +#define invalidate_tlb_vector_name(i) { INVALIDATE_TLB_VECTOR_END-31+i, \
> +					"INVALIDATE_TLB_VECTOR" }

IIRC, Thomas suggested in the v1 to turn arch tracepoints into core ones when they
stand for generic (ie: shared by all archs) event meaning.

I don't know where the discussion went after that, but that still seems to
me an important direction. Tracing irqs by their vector number should be reserved
for pure arch meanings. Sharing common tracepoints for generic events amongst archs makes life
easier for tools. Bonus when the tracepoint can be placed in the generic code.


> +#define irq_vector_name_table						\
> +			irq_vector_name(NMI_VECTOR),			\
> +			irq_vector_name(LOCAL_TIMER_VECTOR),		\

This should be a tracepoint in the generic timer interrupt handler(s).

> +			irq_vector_name(ERROR_APIC_VECTOR),		\
> +			irq_vector_name(RESCHEDULE_VECTOR),		\

Although each archs have their own implementations of rescheduling interrupt,
it's a common event amongst arch. Better have a common tracepoint for that
and not a different vector number between them.

> +			irq_vector_name(CALL_FUNCTION_VECTOR),		\
> +			irq_vector_name(CALL_FUNCTION_SINGLE_VECTOR),	\

Both should be tracepoints in kernel/smp.c: generic_smp_call_function_*_interrupt()

> +			irq_vector_name(THERMAL_APIC_VECTOR),		\
> +			irq_vector_name(THRESHOLD_APIC_VECTOR),		\
> +			irq_vector_name(REBOOT_VECTOR),			\

Reboot seems to be generic as well.

> +			irq_vector_name(SPURIOUS_APIC_VECTOR),		\
> +			irq_vector_name(IRQ_WORK_VECTOR),		\

Should be in the irq work generic code.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-06-01  0:00       ` Frederic Weisbecker
@ 2011-06-01 22:38         ` Vaibhav Nagarnaik
  2011-06-01 23:30           ` David Sharp
  0 siblings, 1 reply; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-06-01 22:38 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, Ingo Molnar, Steven Rostedt, Michael Rubin,
	David Sharp, x86, linux-kernel, Jiaying Zhang

On Tue, May 31, 2011 at 5:00 PM, Frederic Weisbecker
<fweisbec@gmail.com> wrote:
>
> IIRC, Thomas suggested in the v1 to turn arch tracepoints into core ones when they
> stand for generic (ie: shared by all archs) event meaning.
>
> I don't know where the discussion went after that, but that still seems to
> me an important direction. Tracing irqs by their vector number should be reserved
> for pure arch meanings. Sharing common tracepoints for generic events amongst archs makes life
> easier for tools. Bonus when the tracepoint can be placed in the generic code.
>

I agree with you that the tracepoints should be as generic as possible
to have them traced across platforms, but I see these tracepoints as
specific to x86 platform. Since most of these vectors are x86 specific.
The only ones that you pointed out as generic were:

LOCAL_TIMER_VECTOR - The generic callbacks that I found are in:
kernel/time/tick-broadcast.c (tick_handle_periodic(),
tick_handle_periodic_broadcast(), tick_handle_oneshot_broadcast()).

RESCHEDULE_VECTOR - The callback is in kernel/sched.c (scheduler_ipi())

CALL_FUNCTION_VECTOR - The callback is in kernel/smp.c
(generic_smp_call_function_interrupt())
CALL_FUNCTION_SINGLE_VECTOR - The callback is in kernel/smp.c
(generic_smp_call_function_single_interrupt())

IRQ_WORK_VECTOR - kernel/irq_work.c (irq_work_run())

REBOOT_VECTOR - For x86, this does not enter any generic code. It is
also not traced. The definition was in there only as part of the
interrupt table.

So, as I see it, there are only 5 vectors that enter the generic
handling code of the kernel. The other vectors have platform specific
handlers or are just x86 specific IRQs.

This is the reason that I added the irq_vectors tracing class under the
arch-specific code. If other platforms want to use the same event class,
it can be initialized in their platform specific code for their use with
the appropriate tracepoints added.


Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-06-01 22:38         ` Vaibhav Nagarnaik
@ 2011-06-01 23:30           ` David Sharp
  2011-06-16  3:02             ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: David Sharp @ 2011-06-01 23:30 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: Frederic Weisbecker, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Michael Rubin, x86, linux-kernel, Jiaying Zhang

On Wed, Jun 1, 2011 at 3:38 PM, Vaibhav Nagarnaik <vnagarnaik@google.com> wrote:
> On Tue, May 31, 2011 at 5:00 PM, Frederic Weisbecker
> <fweisbec@gmail.com> wrote:
>>
>> IIRC, Thomas suggested in the v1 to turn arch tracepoints into core ones when they
>> stand for generic (ie: shared by all archs) event meaning.
>>
>> I don't know where the discussion went after that, but that still seems to
>> me an important direction. Tracing irqs by their vector number should be reserved
>> for pure arch meanings. Sharing common tracepoints for generic events amongst archs makes life
>> easier for tools. Bonus when the tracepoint can be placed in the generic code.
>>
>
> I agree with you that the tracepoints should be as generic as possible
> to have them traced across platforms, but I see these tracepoints as
> specific to x86 platform. Since most of these vectors are x86 specific.
> The only ones that you pointed out as generic were:

I think what Vaibhav means is that this is "the one trace event for
all hard IRQs on the platform". A tracepoint for the cross-platform
irq handlers is also useful, but that is not the purpose of this
tracepoint. There's not actually much overlap between platforms for
IRQs, so each platform needs to do the work of covering all their
IRQs, and won't see much benefit from the generic tracepoints.

On Tue, May 31, 2011 at 5:00 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Tracing irqs by their vector number should be reserved for pure arch meanings.

exactly, and this tracepoint is for that "arch meaning".

I'd also add that having these tracepoints here makes it more obvious
that we will get all IRQ events, which is what is important to us.

>
> LOCAL_TIMER_VECTOR - The generic callbacks that I found are in:
> kernel/time/tick-broadcast.c (tick_handle_periodic(),
> tick_handle_periodic_broadcast(), tick_handle_oneshot_broadcast()).
>
> RESCHEDULE_VECTOR - The callback is in kernel/sched.c (scheduler_ipi())
>
> CALL_FUNCTION_VECTOR - The callback is in kernel/smp.c
> (generic_smp_call_function_interrupt())
> CALL_FUNCTION_SINGLE_VECTOR - The callback is in kernel/smp.c
> (generic_smp_call_function_single_interrupt())
>
> IRQ_WORK_VECTOR - kernel/irq_work.c (irq_work_run())
>
> REBOOT_VECTOR - For x86, this does not enter any generic code. It is
> also not traced. The definition was in there only as part of the
> interrupt table.
>
> So, as I see it, there are only 5 vectors that enter the generic
> handling code of the kernel. The other vectors have platform specific
> handlers or are just x86 specific IRQs.
>
> This is the reason that I added the irq_vectors tracing class under the
> arch-specific code. If other platforms want to use the same event class,
> it can be initialized in their platform specific code for their use with
> the appropriate tracepoints added.
>
>
> Vaibhav Nagarnaik
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-06-01 23:30           ` David Sharp
@ 2011-06-16  3:02             ` Frederic Weisbecker
  2011-06-21 18:43               ` Vaibhav Nagarnaik
  2011-07-06 23:43               ` H. Peter Anvin
  0 siblings, 2 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2011-06-16  3:02 UTC (permalink / raw)
  To: David Sharp
  Cc: Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, x86, linux-kernel, Jiaying Zhang

Sorry guys I only found your answers today by browsing my email client.
I guess I posponed my reading but completely forgot :-(

On Wed, Jun 01, 2011 at 04:30:18PM -0700, David Sharp wrote:
> On Wed, Jun 1, 2011 at 3:38 PM, Vaibhav Nagarnaik <vnagarnaik@google.com> wrote:
> > On Tue, May 31, 2011 at 5:00 PM, Frederic Weisbecker
> > <fweisbec@gmail.com> wrote:
> >>
> >> IIRC, Thomas suggested in the v1 to turn arch tracepoints into core ones when they
> >> stand for generic (ie: shared by all archs) event meaning.
> >>
> >> I don't know where the discussion went after that, but that still seems to
> >> me an important direction. Tracing irqs by their vector number should be reserved
> >> for pure arch meanings. Sharing common tracepoints for generic events amongst archs makes life
> >> easier for tools. Bonus when the tracepoint can be placed in the generic code.
> >>
> >
> > I agree with you that the tracepoints should be as generic as possible
> > to have them traced across platforms, but I see these tracepoints as
> > specific to x86 platform. Since most of these vectors are x86 specific.
> > The only ones that you pointed out as generic were:
> 
> I think what Vaibhav means is that this is "the one trace event for
> all hard IRQs on the platform". A tracepoint for the cross-platform
> irq handlers is also useful, but that is not the purpose of this
> tracepoint.

If a generic tracepoint can replace one per arch, we all want that instead.
This is 24 times (24 archs) less maintainance burden.

So if that doesn't meet your needs, you really need to explain us in detail why.

I believe it's fine to deduce from a generic timer tracepoint that the arch fired
a specific interrupt to achieve that.

You might need some low level details, but then tell us why.

> There's not actually much overlap between platforms for
> IRQs, so each platform needs to do the work of covering all their
> IRQs, and won't see much benefit from the generic tracepoints.

There is at least 5 overlaps. And everybody will benefit from the generic tracepoints.

> On Tue, May 31, 2011 at 5:00 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > Tracing irqs by their vector number should be reserved for pure arch meanings.
> 
> exactly, and this tracepoint is for that "arch meaning".

Ok but there is no point in having an arch tracepoint where the arch code is only
doing:

arch_foo_irq()
{
	generic_foo_irq();
}

It may do some specific low level things, ack the apic, etc... But that's not what
is interesting, right? What is interesting is the fact you have an interrupt.

> I'd also add that having these tracepoints here makes it more obvious
> that we will get all IRQ events, which is what is important to us.

Look at those arch level handlers, it's doesn't seem you can miss anything there.

It can happen in the case of the local apic timer, but then you get a warning
of spurious LAPIC timer interrupt. They are corner cases, better check these
rather than tracing every layers to ensure you always get everything.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-06-16  3:02             ` Frederic Weisbecker
@ 2011-06-21 18:43               ` Vaibhav Nagarnaik
  2011-07-06 23:43               ` H. Peter Anvin
  1 sibling, 0 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-06-21 18:43 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Sharp, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, x86, linux-kernel, Jiaying Zhang

On Wed, Jun 15, 2011 at 8:02 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> If a generic tracepoint can replace one per arch, we all want that instead.
> This is 24 times (24 archs) less maintainance burden.
>
> So if that doesn't meet your needs, you really need to explain us in detail why.
>
> I believe it's fine to deduce from a generic timer tracepoint that the arch fired
> a specific interrupt to achieve that.

OK. We were approaching this problem by treating the IRQ vector
tracepoint as being a arch specific one since it is easier to trace it
there and every arch can add their specific instrumentation.

Anyhow, I looked through the patch and changed it to have the IRQs
traced in their generic handlers rather than the arch-specific handlers.
I may have missed a place or two where a timer IRQ vector might be
handled.

I am sending this patch out in a few minutes.


Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-05-31 21:28     ` [PATCH v2] " Vaibhav Nagarnaik
  2011-06-01  0:00       ` Frederic Weisbecker
@ 2011-06-21 18:45       ` Vaibhav Nagarnaik
  2011-07-06 21:50         ` Vaibhav Nagarnaik
  2011-07-07 23:34         ` Frederic Weisbecker
  1 sibling, 2 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-06-21 18:45 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Steven Rostedt, Frederic Weisbecker
  Cc: Michael Rubin, David Sharp, linux-kernel, x86, Jiaying Zhang,
	Vaibhav Nagarnaik

From: Jiaying Zhang <jiayingz@google.com>

The current interrupt trace of irq_handler_entry and irq_handler_exit
give traces of when an interrupt is handled. They provide good data
about when the system is running in kernel space and how it affects the
currently running applications.

Apart from this, they are IRQ vectors which trigger the system into
kernel space. Tracing such events gives us the trace of where the system
is spending its time. We want to know which cores are handling
interrupts and how they are affecting other processes in the system.
Also, the trace provides information about when the cores are idle and
which interrupts are changing that state.

The following patch adds the event definition and trace instrumentation
for interrupt vectors. A lookup table is provided to print out readable
IRQ vector names. Apart from the IRQ vectors handled in the generic
kernel code, some x86 specific IRQ vectors are also traced. The lookup
table can be extended by adding other arch-specific IRQ vectors.

Changelog:
v2-v3
* Move the irq_vector_{entry|exit} tracepoints back under irq/ sub
  folder
* Trace the common IRQs in generic kernel code so that all archs can
  benefit.

v1-v2
* Move the irq_vector_{entry|exit} tracepoints under irq_vectors/ sub
  folder

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
---
 arch/x86/kernel/apic/apic.c              |    4 ++
 arch/x86/kernel/cpu/mcheck/therm_throt.c |    2 +
 arch/x86/kernel/cpu/mcheck/threshold.c   |    2 +
 arch/x86/kernel/irq.c                    |    4 ++
 arch/x86/kernel/time.c                   |   16 ++++--
 arch/x86/kernel/traps.c                  |    3 +
 arch/x86/mm/tlb.c                        |    2 +
 include/trace/events/irq.h               |   84 ++++++++++++++++++++++++++++++
 kernel/hrtimer.c                         |    6 ++
 kernel/irq_work.c                        |    4 ++
 kernel/sched.c                           |    4 ++
 kernel/smp.c                             |    5 ++
 kernel/time/tick-broadcast.c             |   16 +++++-
 kernel/time/tick-common.c                |    8 +++-
 kernel/time/tick-sched.c                 |    4 ++
 15 files changed, 155 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index b9338b8..159851c 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1792,6 +1792,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(TRACE_SPURIOUS_APIC_VECTOR);
 	/*
 	 * Check if this really is a spurious interrupt and ACK it
 	 * if it is a vectored one.  Just in case...
@@ -1806,6 +1807,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
 	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
 	pr_info("spurious APIC interrupt on CPU#%d, "
 		"should never happen.\n", smp_processor_id());
+	trace_irq_vector_exit(TRACE_SPURIOUS_APIC_VECTOR);
 	irq_exit();
 }
 
@@ -1829,6 +1831,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(TRACE_ERROR_APIC_VECTOR);
 	/* First tickle the hardware, only then report what went on. -- REW */
 	v0 = apic_read(APIC_ESR);
 	apic_write(APIC_ESR, 0);
@@ -1849,6 +1852,7 @@ void smp_error_interrupt(struct pt_regs *regs)
 
 	apic_printk(APIC_DEBUG, KERN_CONT "\n");
 
+	trace_irq_vector_exit(TRACE_ERROR_APIC_VECTOR);
 	irq_exit();
 }
 
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 27c6251..424b818 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -398,8 +398,10 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
 {
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(TRACE_THERMAL_APIC_VECTOR);
 	inc_irq_stat(irq_thermal_count);
 	smp_thermal_vector();
+	trace_irq_vector_exit(TRACE_THERMAL_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index d746df2..f836a0e 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -21,8 +21,10 @@ asmlinkage void smp_threshold_interrupt(void)
 {
 	exit_idle();
 	irq_enter();
+	trace_irq_vector_entry(TRACE_THRESHOLD_APIC_VECTOR);
 	inc_irq_stat(irq_threshold_count);
 	mce_threshold_vector();
+	trace_irq_vector_exit(TRACE_THRESHOLD_APIC_VECTOR);
 	irq_exit();
 	/* Ack only at the end to avoid potential reentry */
 	ack_APIC_irq();
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 6c0802e..3b432c0 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -17,6 +17,8 @@
 #include <asm/mce.h>
 #include <asm/hw_irq.h>
 
+#include <trace/events/irq.h>
+
 atomic_t irq_err_count;
 
 /* Function pointer for generic interrupt vector handling */
@@ -212,11 +214,13 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
 
 	irq_enter();
 
+	trace_irq_vector_entry(TRACE_X86_PLATFORM_IPI_VECTOR);
 	inc_irq_stat(x86_platform_ipis);
 
 	if (x86_platform_ipi_callback)
 		x86_platform_ipi_callback();
 
+	trace_irq_vector_exit(TRACE_X86_PLATFORM_IPI_VECTOR);
 	irq_exit();
 
 	set_irq_regs(old_regs);
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 00cbb27..9179dca 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -13,6 +13,7 @@
 #include <linux/interrupt.h>
 #include <linux/time.h>
 #include <linux/mca.h>
+#include <trace/events/irq.h>
 
 #include <asm/vsyscall.h>
 #include <asm/x86_init.h>
@@ -51,6 +52,13 @@ unsigned long profile_pc(struct pt_regs *regs)
 }
 EXPORT_SYMBOL(profile_pc);
 
+static irqreturn_t timer_interrupt(int irq, void *dev_id);
+static struct irqaction irq0  = {
+	.handler = timer_interrupt,
+	.flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
+	.name = "timer"
+};
+
 /*
  * Default timer interrupt handler for PIT/HPET
  */
@@ -59,7 +67,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	/* Keep nmi watchdog up to date */
 	inc_irq_stat(irq0_irqs);
 
+	trace_irq_handler_entry(irq, &irq0);
 	global_clock_event->event_handler(global_clock_event);
+	trace_irq_handler_exit(irq, &irq0, 1);
 
 	/* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
 	if (MCA_bus)
@@ -68,12 +78,6 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static struct irqaction irq0  = {
-	.handler = timer_interrupt,
-	.flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
-	.name = "timer"
-};
-
 void __init setup_default_timer_irq(void)
 {
 	setup_irq(0, &irq0);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..481ffce 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -31,6 +31,7 @@
 #include <linux/mm.h>
 #include <linux/smp.h>
 #include <linux/io.h>
+#include <trace/events/irq.h>
 
 #ifdef CONFIG_EISA
 #include <linux/ioport.h>
@@ -434,12 +435,14 @@ dotraplinkage notrace __kprobes void
 do_nmi(struct pt_regs *regs, long error_code)
 {
 	nmi_enter();
+	trace_irq_vector_entry(TRACE_NMI_VECTOR);
 
 	inc_irq_stat(__nmi_count);
 
 	if (!ignore_nmis)
 		default_do_nmi(regs);
 
+	trace_irq_vector_exit(TRACE_NMI_VECTOR);
 	nmi_exit();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index d6c0418..f1a12bc 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -141,6 +141,7 @@ void smp_invalidate_interrupt(struct pt_regs *regs)
 	sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START;
 	f = &flush_state[sender];
 
+	trace_irq_vector_entry(TRACE_INVALIDATE_TLB_VECTOR);
 	if (!cpumask_test_cpu(cpu, to_cpumask(f->flush_cpumask)))
 		goto out;
 		/*
@@ -167,6 +168,7 @@ out:
 	cpumask_clear_cpu(cpu, to_cpumask(f->flush_cpumask));
 	smp_mb__after_clear_bit();
 	inc_irq_stat(irq_tlb_count);
+	trace_irq_vector_exit(TRACE_INVALIDATE_TLB_VECTOR);
 }
 
 static void flush_tlb_others_ipi(const struct cpumask *cpumask,
diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
index 4fc1205..d2f3f1f 100644
--- a/include/trace/events/irq.h
+++ b/include/trace/events/irq.h
@@ -138,6 +138,90 @@ DEFINE_EVENT(softirq, softirq_raise,
 	TP_ARGS(vec_nr)
 );
 
+#if !defined(_TRACE_IRQ_VECTOR_ENUM)
+#define _TRACE_IRQ_VECTOR_ENUM
+enum {
+	TRACE_NMI_VECTOR,
+	TRACE_NOHZ_TIMER_VECTOR,
+	TRACE_HRTIMER_VECTOR,
+	TRACE_ONESHOT_TIMER_VECTOR,
+	TRACE_PERIODIC_TIMER_BROADCAST_VECTOR,
+	TRACE_PERIODIC_TIMER_VECTOR,
+	TRACE_ERROR_APIC_VECTOR,
+	TRACE_RESCHEDULE_VECTOR,
+	TRACE_CALL_FUNCTION_VECTOR,
+	TRACE_CALL_FUNCTION_SINGLE_VECTOR,
+	TRACE_THERMAL_APIC_VECTOR,
+	TRACE_THRESHOLD_APIC_VECTOR,
+	TRACE_REBOOT_VECTOR,
+	TRACE_SPURIOUS_APIC_VECTOR,
+	TRACE_IRQ_WORK_VECTOR,
+	TRACE_X86_PLATFORM_IPI_VECTOR,
+	TRACE_INVALIDATE_TLB_VECTOR,
+};
+#endif
+
+#define irq_vector_name(sirq) { sirq, #sirq }
+
+#define show_irq_vector_name(val)					\
+	__print_symbolic(val,						\
+			irq_vector_name(TRACE_NMI_VECTOR),		\
+			irq_vector_name(TRACE_NOHZ_TIMER_VECTOR),	\
+			irq_vector_name(TRACE_HRTIMER_VECTOR),		\
+			irq_vector_name(TRACE_ONESHOT_TIMER_VECTOR),	\
+			irq_vector_name(TRACE_PERIODIC_TIMER_BROADCAST_VECTOR),\
+			irq_vector_name(TRACE_PERIODIC_TIMER_VECTOR),	\
+			irq_vector_name(TRACE_ERROR_APIC_VECTOR),	\
+			irq_vector_name(TRACE_RESCHEDULE_VECTOR),	\
+			irq_vector_name(TRACE_CALL_FUNCTION_VECTOR),	\
+			irq_vector_name(TRACE_CALL_FUNCTION_SINGLE_VECTOR),\
+			irq_vector_name(TRACE_THERMAL_APIC_VECTOR),	\
+			irq_vector_name(TRACE_THRESHOLD_APIC_VECTOR),	\
+			irq_vector_name(TRACE_REBOOT_VECTOR),		\
+			irq_vector_name(TRACE_SPURIOUS_APIC_VECTOR),	\
+			irq_vector_name(TRACE_IRQ_WORK_VECTOR),		\
+			irq_vector_name(TRACE_X86_PLATFORM_IPI_VECTOR),	\
+			irq_vector_name(TRACE_INVALIDATE_TLB_VECTOR))
+
+
+DECLARE_EVENT_CLASS(irq_vector,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq),
+
+	TP_STRUCT__entry(
+		__field(	int,	irq	)
+	),
+
+	TP_fast_assign(
+		__entry->irq = irq;
+	),
+
+	TP_printk("name=%s", show_irq_vector_name(__entry->irq))
+);
+
+/*
+ * irq_vector_entry - called before enterring a interrupt vector handler
+ */
+DEFINE_EVENT(irq_vector, irq_vector_entry,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
+/*
+ * irq_vector_exit - called immediately after the interrupt vector
+ * handler returns
+ */
+DEFINE_EVENT(irq_vector, irq_vector_exit,
+
+	TP_PROTO(int irq),
+
+	TP_ARGS(irq)
+);
+
 #endif /*  _TRACE_IRQ_H */
 
 /* This part must be outside protection */
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index a9205e3..84a0c70 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -48,6 +48,7 @@
 
 #include <asm/uaccess.h>
 
+#include <trace/events/irq.h>
 #include <trace/events/timer.h>
 
 /*
@@ -1244,6 +1245,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	ktime_t expires_next, now, entry_time, delta;
 	int i, retries = 0;
 
+	trace_irq_vector_entry(TRACE_HRTIMER_VECTOR);
+
 	BUG_ON(!cpu_base->hres_active);
 	cpu_base->nr_events++;
 	dev->next_event.tv64 = KTIME_MAX;
@@ -1316,6 +1319,7 @@ retry:
 	if (expires_next.tv64 == KTIME_MAX ||
 	    !tick_program_event(expires_next, 0)) {
 		cpu_base->hang_detected = 0;
+		trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);
 		return;
 	}
 
@@ -1355,6 +1359,8 @@ retry:
 	tick_program_event(expires_next, 1);
 	printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
 		    ktime_to_ns(delta));
+
+	trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);
 }
 
 /*
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index c58fa7d..8729262 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -9,6 +9,7 @@
 #include <linux/module.h>
 #include <linux/irq_work.h>
 #include <linux/hardirq.h>
+#include <trace/events/irq.h>
 
 /*
  * An entry can be in one of four states:
@@ -125,6 +126,8 @@ void irq_work_run(void)
 	if (this_cpu_read(irq_work_list) == NULL)
 		return;
 
+	trace_irq_vector_entry(TRACE_IRQ_WORK_VECTOR);
+
 	BUG_ON(!in_irq());
 	BUG_ON(!irqs_disabled());
 
@@ -149,6 +152,7 @@ void irq_work_run(void)
 			      next_flags(NULL, IRQ_WORK_BUSY),
 			      NULL);
 	}
+	trace_irq_vector_exit(TRACE_IRQ_WORK_VECTOR);
 }
 EXPORT_SYMBOL_GPL(irq_work_run);
 
diff --git a/kernel/sched.c b/kernel/sched.c
index 3f2e502..3264754 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -80,6 +80,8 @@
 #include "workqueue_sched.h"
 #include "sched_autogroup.h"
 
+#include <trace/events/irq.h>
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/sched.h>
 
@@ -2565,7 +2567,9 @@ static void sched_ttwu_pending(void)
 
 void scheduler_ipi(void)
 {
+	trace_irq_vector_entry(TRACE_RESCHEDULE_VECTOR);
 	sched_ttwu_pending();
+	trace_irq_vector_exit(TRACE_RESCHEDULE_VECTOR);
 }
 
 static void ttwu_queue_remote(struct task_struct *p, int cpu)
diff --git a/kernel/smp.c b/kernel/smp.c
index 73a1951..1a1ce22 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -12,6 +12,7 @@
 #include <linux/gfp.h>
 #include <linux/smp.h>
 #include <linux/cpu.h>
+#include <trace/events/irq.h>
 
 #ifdef CONFIG_USE_GENERIC_SMP_HELPERS
 static struct {
@@ -175,6 +176,7 @@ void generic_smp_call_function_interrupt(void)
 	struct call_function_data *data;
 	int cpu = smp_processor_id();
 
+	trace_irq_vector_entry(TRACE_CALL_FUNCTION_VECTOR);
 	/*
 	 * Shouldn't receive this interrupt on a cpu that is not yet online.
 	 */
@@ -243,6 +245,7 @@ void generic_smp_call_function_interrupt(void)
 		csd_unlock(&data->csd);
 	}
 
+	trace_irq_vector_exit(TRACE_CALL_FUNCTION_VECTOR);
 }
 
 /*
@@ -255,6 +258,7 @@ void generic_smp_call_function_single_interrupt(void)
 	unsigned int data_flags;
 	LIST_HEAD(list);
 
+	trace_irq_vector_entry(TRACE_CALL_FUNCTION_SINGLE_VECTOR);
 	/*
 	 * Shouldn't receive this interrupt on a cpu that is not yet online.
 	 */
@@ -285,6 +289,7 @@ void generic_smp_call_function_single_interrupt(void)
 		if (data_flags & CSD_FLAG_LOCK)
 			csd_unlock(data);
 	}
+	trace_irq_vector_exit(TRACE_CALL_FUNCTION_SINGLE_VECTOR);
 }
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_single_data, csd_data);
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index c7218d1..ee98acb 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 
 #include "tick-internal.h"
+#include <trace/events/irq.h>
 
 /*
  * Broadcast support for broken x86 hardware, where the local apic
@@ -176,13 +177,17 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
 {
 	ktime_t next;
 
+	trace_irq_vector_entry(TRACE_PERIODIC_TIMER_BROADCAST_VECTOR);
+
 	tick_do_periodic_broadcast();
 
 	/*
 	 * The device is in periodic mode. No reprogramming necessary:
 	 */
-	if (dev->mode == CLOCK_EVT_MODE_PERIODIC)
+	if (dev->mode == CLOCK_EVT_MODE_PERIODIC) {
+		trace_irq_vector_exit(TRACE_PERIODIC_TIMER_BROADCAST_VECTOR);
 		return;
+	}
 
 	/*
 	 * Setup the next period for devices, which do not have
@@ -194,10 +199,13 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
 	for (next = dev->next_event; ;) {
 		next = ktime_add(next, tick_period);
 
-		if (!clockevents_program_event(dev, next, ktime_get()))
+		if (!clockevents_program_event(dev, next, ktime_get())) {
+			trace_irq_vector_exit(TRACE_PERIODIC_TIMER_BROADCAST_VECTOR);
 			return;
+		}
 		tick_do_periodic_broadcast();
 	}
+	trace_irq_vector_exit(TRACE_PERIODIC_TIMER_BROADCAST_VECTOR);
 }
 
 /*
@@ -404,6 +412,8 @@ static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
 	ktime_t now, next_event;
 	int cpu;
 
+	trace_irq_vector_entry(TRACE_ONESHOT_TIMER_VECTOR);
+
 	raw_spin_lock(&tick_broadcast_lock);
 again:
 	dev->next_event.tv64 = KTIME_MAX;
@@ -443,6 +453,8 @@ again:
 			goto again;
 	}
 	raw_spin_unlock(&tick_broadcast_lock);
+
+	trace_irq_vector_exit(TRACE_ONESHOT_TIMER_VECTOR);
 }
 
 /*
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 119528d..c8cf32d 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -18,6 +18,7 @@
 #include <linux/percpu.h>
 #include <linux/profile.h>
 #include <linux/sched.h>
+#include <trace/events/irq.h>
 
 #include <asm/irq_regs.h>
 
@@ -84,6 +85,8 @@ void tick_handle_periodic(struct clock_event_device *dev)
 	int cpu = smp_processor_id();
 	ktime_t next;
 
+	trace_irq_vector_entry(TRACE_PERIODIC_TIMER_VECTOR);
+
 	tick_periodic(cpu);
 
 	if (dev->mode != CLOCK_EVT_MODE_ONESHOT)
@@ -94,8 +97,10 @@ void tick_handle_periodic(struct clock_event_device *dev)
 	 */
 	next = ktime_add(dev->next_event, tick_period);
 	for (;;) {
-		if (!clockevents_program_event(dev, next, ktime_get()))
+		if (!clockevents_program_event(dev, next, ktime_get())) {
+			trace_irq_vector_exit(TRACE_PERIODIC_TIMER_VECTOR);
 			return;
+		}
 		/*
 		 * Have to be careful here. If we're in oneshot mode,
 		 * before we call tick_periodic() in a loop, we need
@@ -109,6 +114,7 @@ void tick_handle_periodic(struct clock_event_device *dev)
 			tick_periodic(cpu);
 		next = ktime_add(next, tick_period);
 	}
+	trace_irq_vector_exit(TRACE_PERIODIC_TIMER_VECTOR);
 }
 
 /*
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index d5097c4..b3f0070 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -20,6 +20,7 @@
 #include <linux/profile.h>
 #include <linux/sched.h>
 #include <linux/module.h>
+#include <trace/events/irq.h>
 
 #include <asm/irq_regs.h>
 
@@ -568,6 +569,8 @@ static void tick_nohz_handler(struct clock_event_device *dev)
 	int cpu = smp_processor_id();
 	ktime_t now = ktime_get();
 
+	trace_irq_vector_entry(TRACE_NOHZ_TIMER_VECTOR);
+
 	dev->next_event.tv64 = KTIME_MAX;
 
 	/*
@@ -604,6 +607,7 @@ static void tick_nohz_handler(struct clock_event_device *dev)
 		now = ktime_get();
 		tick_do_update_jiffies64(now);
 	}
+	trace_irq_vector_exit(TRACE_NOHZ_TIMER_VECTOR);
 }
 
 /**
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-06-21 18:45       ` [PATCH v3] " Vaibhav Nagarnaik
@ 2011-07-06 21:50         ` Vaibhav Nagarnaik
  2011-07-06 23:38           ` Andi Kleen
  2011-07-07 23:34         ` Frederic Weisbecker
  1 sibling, 1 reply; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-07-06 21:50 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Steven Rostedt, Frederic Weisbecker
  Cc: Michael Rubin, David Sharp, linux-kernel, x86, Jiaying Zhang,
	Vaibhav Nagarnaik

On Tue, Jun 21, 2011 at 11:45 AM, Vaibhav Nagarnaik
<vnagarnaik@google.com> wrote:
> From: Jiaying Zhang <jiayingz@google.com>
>
> The current interrupt trace of irq_handler_entry and irq_handler_exit
> give traces of when an interrupt is handled. They provide good data
> about when the system is running in kernel space and how it affects the
> currently running applications.
>
> Apart from this, they are IRQ vectors which trigger the system into
> kernel space. Tracing such events gives us the trace of where the system
> is spending its time. We want to know which cores are handling
> interrupts and how they are affecting other processes in the system.
> Also, the trace provides information about when the cores are idle and
> which interrupts are changing that state.
>
> The following patch adds the event definition and trace instrumentation
> for interrupt vectors. A lookup table is provided to print out readable
> IRQ vector names. Apart from the IRQ vectors handled in the generic
> kernel code, some x86 specific IRQ vectors are also traced. The lookup
> table can be extended by adding other arch-specific IRQ vectors.
>
> Changelog:
> v2-v3
> * Move the irq_vector_{entry|exit} tracepoints back under irq/ sub
>  folder
> * Trace the common IRQs in generic kernel code so that all archs can
>  benefit.
>
> v1-v2
> * Move the irq_vector_{entry|exit} tracepoints under irq_vectors/ sub
>  folder
>
> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>

Hi All

Can you please take a look at this patch? Does it cover all the generic
IRQ handlers so they are traced?



Thanks

Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-06 21:50         ` Vaibhav Nagarnaik
@ 2011-07-06 23:38           ` Andi Kleen
  0 siblings, 0 replies; 35+ messages in thread
From: Andi Kleen @ 2011-07-06 23:38 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Frederic Weisbecker, Michael Rubin, David Sharp, linux-kernel,
	x86, Jiaying Zhang

Vaibhav Nagarnaik <vnagarnaik@google.com> writes:
>
> Can you please take a look at this patch? Does it cover all the generic
> IRQ handlers so they are traced?

It seems quite complete from a quick look. 

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-06-16  3:02             ` Frederic Weisbecker
  2011-06-21 18:43               ` Vaibhav Nagarnaik
@ 2011-07-06 23:43               ` H. Peter Anvin
  2011-07-06 23:56                 ` Frederic Weisbecker
  1 sibling, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2011-07-06 23:43 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Sharp, Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Michael Rubin, x86, linux-kernel, Jiaying Zhang

On 06/15/2011 08:02 PM, Frederic Weisbecker wrote:
> 
> If a generic tracepoint can replace one per arch, we all want that instead.
> This is 24 times (24 archs) less maintainance burden.
> 

Are you in effect saying that tracepoints are now a stable ABI?  They
sure as hell haven't been architected, reviewed, or accepted for any
such purpose.

	-hpa

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-06 23:43               ` H. Peter Anvin
@ 2011-07-06 23:56                 ` Frederic Weisbecker
  2011-07-07  0:02                   ` H. Peter Anvin
  0 siblings, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-06 23:56 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: David Sharp, Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Michael Rubin, x86, linux-kernel, Jiaying Zhang

On Wed, Jul 06, 2011 at 04:43:23PM -0700, H. Peter Anvin wrote:
> On 06/15/2011 08:02 PM, Frederic Weisbecker wrote:
> > 
> > If a generic tracepoint can replace one per arch, we all want that instead.
> > This is 24 times (24 archs) less maintainance burden.
> > 
> 
> Are you in effect saying that tracepoints are now a stable ABI?  They
> sure as hell haven't been architected, reviewed, or accepted for any
> such purpose.

Nope, I'm rather suggesting that it's better to have 1 line of code instead of 24.

And even if tracepoints are not supposed to be stable ABI (I actually believe there
is no absolute answer to that, it depends on the tracepoint and on the context),
tools are much happier with a common tracepoint interface amongst archs.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-06 23:56                 ` Frederic Weisbecker
@ 2011-07-07  0:02                   ` H. Peter Anvin
  2011-07-07  0:25                     ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2011-07-07  0:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Sharp, Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Michael Rubin, x86, linux-kernel, Jiaying Zhang

On 07/06/2011 04:56 PM, Frederic Weisbecker wrote:
> And even if tracepoints are not supposed to be stable ABI (I actually believe there
> is no absolute answer to that, it depends on the tracepoint and on the context),
> tools are much happier with a common tracepoint interface amongst archs.

What you're suggesting here is an ad hoc ABI.  This is insanely
irresponsible -- ABIs need to be designed, not randomly appeared.

	-hpa

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-07  0:02                   ` H. Peter Anvin
@ 2011-07-07  0:25                     ` Frederic Weisbecker
  2011-07-07  0:30                       ` H. Peter Anvin
  0 siblings, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-07  0:25 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: David Sharp, Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Michael Rubin, x86, linux-kernel, Jiaying Zhang

On Wed, Jul 06, 2011 at 05:02:31PM -0700, H. Peter Anvin wrote:
> On 07/06/2011 04:56 PM, Frederic Weisbecker wrote:
> > And even if tracepoints are not supposed to be stable ABI (I actually believe there
> > is no absolute answer to that, it depends on the tracepoint and on the context),
> > tools are much happier with a common tracepoint interface amongst archs.
> 
> What you're suggesting here is an ad hoc ABI.  This is insanely
> irresponsible -- ABIs need to be designed, not randomly appeared.

I'm suggesting two things:

1) If every arch implement a tracepoint for a generic event, then move the tracepoint
to the generic code. I believe that part is not very controversial.

2) If every arch implement a common event that is not implemented in core code (I believe
it was the case for reschedule_interrupt few times ago but changed lately) then try
to have a common tracepoint for every archs if possible. So that we don't have
thousand names for the same event, or different parameters name when those parameters and the
event semantic are exactly the same amongst every architecture.

Trace events are irresponsible anyway because they involve that "ABI not really stable but tools
rely on it so...well...)". I'm not sure the above points make the situation worse though. Probably
the cases that fit in 2) need to be carefully checked to ensure they really fit in 2).

Having each archs proposing their own wild tracepoints is not going to cure the fundamental
nature of trace events. An arch that implements its own tracepoint already applies a new ad-hoc
ABI. Its scope is just smaller.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-07  0:25                     ` Frederic Weisbecker
@ 2011-07-07  0:30                       ` H. Peter Anvin
  2011-07-07  0:51                         ` Frederic Weisbecker
  2011-07-07  9:57                         ` Ingo Molnar
  0 siblings, 2 replies; 35+ messages in thread
From: H. Peter Anvin @ 2011-07-07  0:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Sharp, Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Michael Rubin, x86, linux-kernel, Jiaying Zhang

On 07/06/2011 05:25 PM, Frederic Weisbecker wrote:
> 
> I'm suggesting two things:
> 
> 1) If every arch implement a tracepoint for a generic event, then move the tracepoint
> to the generic code. I believe that part is not very controversial.
> 

Agreed -- as long as it doesn't mean breaking the flow for specific arches.

> 2) If every arch implement a common event that is not implemented in core code (I believe
> it was the case for reschedule_interrupt few times ago but changed lately) then try
> to have a common tracepoint for every archs if possible. So that we don't have
> thousand names for the same event, or different parameters name when those parameters and the
> event semantic are exactly the same amongst every architecture.

Agreed as well.

> Trace events are irresponsible anyway because they involve that "ABI not really stable but tools
> rely on it so...well...)". I'm not sure the above points make the situation worse though. Probably
> the cases that fit in 2) need to be carefully checked to ensure they really fit in 2).

Yes, it was much more of a generic concern.  However, it is very
important that people have a correct idea about what the stability of
something like tracepoint is -- or we'll end up in a situation where we
can never change the kernel because anything is suddenly "user space
visible."

	-hpa

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-07  0:30                       ` H. Peter Anvin
@ 2011-07-07  0:51                         ` Frederic Weisbecker
  2011-07-07  9:57                         ` Ingo Molnar
  1 sibling, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-07  0:51 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: David Sharp, Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Michael Rubin, x86, linux-kernel, Jiaying Zhang

On Wed, Jul 06, 2011 at 05:30:18PM -0700, H. Peter Anvin wrote:
> > Trace events are irresponsible anyway because they involve that "ABI not really stable but tools
> > rely on it so...well...)". I'm not sure the above points make the situation worse though. Probably
> > the cases that fit in 2) need to be carefully checked to ensure they really fit in 2).
> 
> Yes, it was much more of a generic concern.  However, it is very
> important that people have a correct idea about what the stability of
> something like tracepoint is -- or we'll end up in a situation where we
> can never change the kernel because anything is suddenly "user space
> visible."

Agreed. Well the current situation is a best effort to keep trace events stable, but when
subsystems evolve, some trace events don't make any sense anymore, or some parameters need
to be changed or removed.

This break tools but OTOH those tools were measuring something that don't make sense anymore
with newer kernels. I guess we don't have the choice there.

But the situation is more scary when some tracepoints seem to have been badly designed and need
refactoring. I believe it was the case of power events. And now we are forced to maintain old
tracepoints that halfway make sense, just to avoid breaking old tools.

So yeah, we definetly need to care about new tracepoints coming. Especially as they touch generic
parts but archs are also to be considered.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-07  0:30                       ` H. Peter Anvin
  2011-07-07  0:51                         ` Frederic Weisbecker
@ 2011-07-07  9:57                         ` Ingo Molnar
  2011-07-07 22:50                           ` David Sharp
  1 sibling, 1 reply; 35+ messages in thread
From: Ingo Molnar @ 2011-07-07  9:57 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Frederic Weisbecker, David Sharp, Vaibhav Nagarnaik,
	Thomas Gleixner, Ingo Molnar, Steven Rostedt, Michael Rubin, x86,
	linux-kernel, Jiaying Zhang


* H. Peter Anvin <hpa@zytor.com> wrote:

> Yes, it was much more of a generic concern.  However, it is very 
> important that people have a correct idea about what the stability 
> of something like tracepoint is -- or we'll end up in a situation 
> where we can never change the kernel because anything is suddenly 
> "user space visible."

We've transitioned even ABI-assuming tracepoints in the past, so it's 
not a big issue in practice. The reason is that this is an atypical 
type of ABI: information is read-only exported, for observation 
purposes.

If the kernel changes in a fundamental way that removes a tracepoint 
altogether, then there's nothing left to observe - so apps don't 
break per se.

So i've yet to see a single example of the kernel 'never being able 
to change' due to a tracepoint. The worst we've seen in practice is 
the inability to change a specific tracepoint (not the surrounding 
kernel code - while preserving the information that is exposed) - so 
the worst effect was limited to tracing itself - never to the 
subsystem that it traces.

Note that even in that (single known) example we were able to resolve 
the problem (which was limited to the tracing subsystem) by adding 
new tracepoints and thus phasing out the old ones.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-07  9:57                         ` Ingo Molnar
@ 2011-07-07 22:50                           ` David Sharp
  2011-07-07 23:00                             ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: David Sharp @ 2011-07-07 22:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, Frederic Weisbecker, Vaibhav Nagarnaik,
	Thomas Gleixner, Ingo Molnar, Steven Rostedt, Michael Rubin, x86,
	linux-kernel, Jiaying Zhang

On Thu, Jul 7, 2011 at 2:57 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * H. Peter Anvin <hpa@zytor.com> wrote:
>
>> Yes, it was much more of a generic concern.  However, it is very
>> important that people have a correct idea about what the stability
>> of something like tracepoint is -- or we'll end up in a situation
>> where we can never change the kernel because anything is suddenly
>> "user space visible."
>
> We've transitioned even ABI-assuming tracepoints in the past, so it's
> not a big issue in practice. The reason is that this is an atypical
> type of ABI: information is read-only exported, for observation
> purposes.
>
> If the kernel changes in a fundamental way that removes a tracepoint
> altogether, then there's nothing left to observe - so apps don't
> break per se.
>
> So i've yet to see a single example of the kernel 'never being able
> to change' due to a tracepoint. The worst we've seen in practice is
> the inability to change a specific tracepoint (not the surrounding
> kernel code - while preserving the information that is exposed) - so
> the worst effect was limited to tracing itself - never to the
> subsystem that it traces.
>
> Note that even in that (single known) example we were able to resolve
> the problem (which was limited to the tracing subsystem) by adding
> new tracepoints and thus phasing out the old ones.
>
> Thanks,
>
>        Ingo
>

Thanks all for your thorough review. :) It sounds like there is some
agreement now. I think this Steve is waiting for an Acked-by from an
x86 maintainer to apply this patch. Are there any further objections
or comments on the patch?

d#

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-07 22:50                           ` David Sharp
@ 2011-07-07 23:00                             ` Frederic Weisbecker
  0 siblings, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-07 23:00 UTC (permalink / raw)
  To: David Sharp
  Cc: Ingo Molnar, H. Peter Anvin, Vaibhav Nagarnaik, Thomas Gleixner,
	Ingo Molnar, Steven Rostedt, Michael Rubin, x86, linux-kernel,
	Jiaying Zhang

On Thu, Jul 07, 2011 at 03:50:04PM -0700, David Sharp wrote:
> On Thu, Jul 7, 2011 at 2:57 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > * H. Peter Anvin <hpa@zytor.com> wrote:
> >
> >> Yes, it was much more of a generic concern.  However, it is very
> >> important that people have a correct idea about what the stability
> >> of something like tracepoint is -- or we'll end up in a situation
> >> where we can never change the kernel because anything is suddenly
> >> "user space visible."
> >
> > We've transitioned even ABI-assuming tracepoints in the past, so it's
> > not a big issue in practice. The reason is that this is an atypical
> > type of ABI: information is read-only exported, for observation
> > purposes.
> >
> > If the kernel changes in a fundamental way that removes a tracepoint
> > altogether, then there's nothing left to observe - so apps don't
> > break per se.
> >
> > So i've yet to see a single example of the kernel 'never being able
> > to change' due to a tracepoint. The worst we've seen in practice is
> > the inability to change a specific tracepoint (not the surrounding
> > kernel code - while preserving the information that is exposed) - so
> > the worst effect was limited to tracing itself - never to the
> > subsystem that it traces.
> >
> > Note that even in that (single known) example we were able to resolve
> > the problem (which was limited to the tracing subsystem) by adding
> > new tracepoints and thus phasing out the old ones.
> >
> > Thanks,
> >
> >        Ingo
> >
> 
> Thanks all for your thorough review. :) It sounds like there is some
> agreement now. I think this Steve is waiting for an Acked-by from an
> x86 maintainer to apply this patch. Are there any further objections
> or comments on the patch?

Yeah I have some comments on it. I just postponed my reply because I did
not have clear suggestions to propose back until now.

I'm posting that now.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-06-21 18:45       ` [PATCH v3] " Vaibhav Nagarnaik
  2011-07-06 21:50         ` Vaibhav Nagarnaik
@ 2011-07-07 23:34         ` Frederic Weisbecker
  2011-07-08  0:54           ` David Sharp
  1 sibling, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-07 23:34 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: Thomas Gleixner, Ingo Molnar, Steven Rostedt, Michael Rubin,
	David Sharp, linux-kernel, x86, Jiaying Zhang

On Tue, Jun 21, 2011 at 11:45:03AM -0700, Vaibhav Nagarnaik wrote:
> From: Jiaying Zhang <jiayingz@google.com>
> 
> The current interrupt trace of irq_handler_entry and irq_handler_exit
> give traces of when an interrupt is handled. They provide good data
> about when the system is running in kernel space and how it affects the
> currently running applications.
> 
> Apart from this, they are IRQ vectors which trigger the system into
> kernel space. Tracing such events gives us the trace of where the system
> is spending its time. We want to know which cores are handling
> interrupts and how they are affecting other processes in the system.
> Also, the trace provides information about when the cores are idle and
> which interrupts are changing that state.
> 
> The following patch adds the event definition and trace instrumentation
> for interrupt vectors. A lookup table is provided to print out readable
> IRQ vector names. Apart from the IRQ vectors handled in the generic
> kernel code, some x86 specific IRQ vectors are also traced. The lookup
> table can be extended by adding other arch-specific IRQ vectors.
> 
> Changelog:
> v2-v3
> * Move the irq_vector_{entry|exit} tracepoints back under irq/ sub
>   folder
> * Trace the common IRQs in generic kernel code so that all archs can
>   benefit.
> 
> v1-v2
> * Move the irq_vector_{entry|exit} tracepoints under irq_vectors/ sub
>   folder
> 
> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
> ---
>  arch/x86/kernel/apic/apic.c              |    4 ++
>  arch/x86/kernel/cpu/mcheck/therm_throt.c |    2 +
>  arch/x86/kernel/cpu/mcheck/threshold.c   |    2 +
>  arch/x86/kernel/irq.c                    |    4 ++
>  arch/x86/kernel/time.c                   |   16 ++++--
>  arch/x86/kernel/traps.c                  |    3 +
>  arch/x86/mm/tlb.c                        |    2 +
>  include/trace/events/irq.h               |   84 ++++++++++++++++++++++++++++++
>  kernel/hrtimer.c                         |    6 ++
>  kernel/irq_work.c                        |    4 ++
>  kernel/sched.c                           |    4 ++
>  kernel/smp.c                             |    5 ++
>  kernel/time/tick-broadcast.c             |   16 +++++-
>  kernel/time/tick-common.c                |    8 +++-
>  kernel/time/tick-sched.c                 |    4 ++
>  15 files changed, 155 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index b9338b8..159851c 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1792,6 +1792,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
>  
>  	exit_idle();
>  	irq_enter();
> +	trace_irq_vector_entry(TRACE_SPURIOUS_APIC_VECTOR);
>  	/*
>  	 * Check if this really is a spurious interrupt and ACK it
>  	 * if it is a vectored one.  Just in case...
> @@ -1806,6 +1807,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
>  	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
>  	pr_info("spurious APIC interrupt on CPU#%d, "
>  		"should never happen.\n", smp_processor_id());
> +	trace_irq_vector_exit(TRACE_SPURIOUS_APIC_VECTOR);
>  	irq_exit();
>  }
>

Seems a lot of these tracepoints fit under that pattern:

                irq_enter();
        +       trace_irq_entry_entry(foo)

                do things

        +       trace_irq_exit(foo)
                irq_exit();


What if we instead had trace_irq_enter() inside irq_enter() and
trace_irq_exit() inside irq_exit() ?

Then your specific irq tracepoints would just be one thing that inform us
about the nature of the interrupt:

                irq_enter();
        +       trace_irq_vector_entry(foo);

                do things

                irq_exit();

Those who are only interested in knowing when we have irqs can just turn
on trace_irq_enter() and trace_irq_exit().

Those who want more details about the nature of these interrupts can turn
on trace_irq_vector() and trace_irq_exit().

So you don't need the trace_irq_vector_exit anymore.

How does that sound?

  
> diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
> index 00cbb27..9179dca 100644
> --- a/arch/x86/kernel/time.c
> +++ b/arch/x86/kernel/time.c
> @@ -13,6 +13,7 @@
>  #include <linux/interrupt.h>
>  #include <linux/time.h>
>  #include <linux/mca.h>
> +#include <trace/events/irq.h>
>  
>  #include <asm/vsyscall.h>
>  #include <asm/x86_init.h>
> @@ -51,6 +52,13 @@ unsigned long profile_pc(struct pt_regs *regs)
>  }
>  EXPORT_SYMBOL(profile_pc);
>  
> +static irqreturn_t timer_interrupt(int irq, void *dev_id);
> +static struct irqaction irq0  = {
> +	.handler = timer_interrupt,
> +	.flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
> +	.name = "timer"
> +};
> +
>  /*
>   * Default timer interrupt handler for PIT/HPET
>   */
> @@ -59,7 +67,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
>  	/* Keep nmi watchdog up to date */
>  	inc_irq_stat(irq0_irqs);
>  
> +	trace_irq_handler_entry(irq, &irq0);
>  	global_clock_event->event_handler(global_clock_event);
> +	trace_irq_handler_exit(irq, &irq0, 1);

Seems it should go to generic event handlers in kernel/time ?

> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> index 4fc1205..d2f3f1f 100644
> --- a/include/trace/events/irq.h
> +++ b/include/trace/events/irq.h
> @@ -138,6 +138,90 @@ DEFINE_EVENT(softirq, softirq_raise,
>  	TP_ARGS(vec_nr)
>  );
>  
> +#if !defined(_TRACE_IRQ_VECTOR_ENUM)
> +#define _TRACE_IRQ_VECTOR_ENUM
> +enum {
> +	TRACE_NMI_VECTOR,
> +	TRACE_NOHZ_TIMER_VECTOR,
> +	TRACE_HRTIMER_VECTOR,
> +	TRACE_ONESHOT_TIMER_VECTOR,
> +	TRACE_PERIODIC_TIMER_BROADCAST_VECTOR,
> +	TRACE_PERIODIC_TIMER_VECTOR,
> +	TRACE_ERROR_APIC_VECTOR,
> +	TRACE_RESCHEDULE_VECTOR,
> +	TRACE_CALL_FUNCTION_VECTOR,
> +	TRACE_CALL_FUNCTION_SINGLE_VECTOR,
> +	TRACE_THERMAL_APIC_VECTOR,
> +	TRACE_THRESHOLD_APIC_VECTOR,
> +	TRACE_REBOOT_VECTOR,
> +	TRACE_SPURIOUS_APIC_VECTOR,
> +	TRACE_IRQ_WORK_VECTOR,
> +	TRACE_X86_PLATFORM_IPI_VECTOR,
> +	TRACE_INVALIDATE_TLB_VECTOR,
> +};
> +#endif

Pure x86 things shouldn't be in a generic header.

include/trace/event/irq.h should contain generic things. And I'm not sure
we want some generalized vector namespace for that but rather single contained
tracepoint names that make sense:

trace_ipi_function(), trace_irq_reschedule()...

But vector namespace makes sense for archs yeah.

>  /* This part must be outside protection */
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index a9205e3..84a0c70 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -48,6 +48,7 @@
>  
>  #include <asm/uaccess.h>
>  
> +#include <trace/events/irq.h>
>  #include <trace/events/timer.h>
>  
>  /*
> @@ -1244,6 +1245,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>  	ktime_t expires_next, now, entry_time, delta;
>  	int i, retries = 0;
>  
> +	trace_irq_vector_entry(TRACE_HRTIMER_VECTOR);
> +
>  	BUG_ON(!cpu_base->hres_active);
>  	cpu_base->nr_events++;
>  	dev->next_event.tv64 = KTIME_MAX;
> @@ -1316,6 +1319,7 @@ retry:
>  	if (expires_next.tv64 == KTIME_MAX ||
>  	    !tick_program_event(expires_next, 0)) {
>  		cpu_base->hang_detected = 0;
> +		trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);
>  		return;
>  	}
>  
> @@ -1355,6 +1359,8 @@ retry:
>  	tick_program_event(expires_next, 1);
>  	printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
>  		    ktime_to_ns(delta));
> +
> +	trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);

We already have hrtimer tracepoints.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-07 23:34         ` Frederic Weisbecker
@ 2011-07-08  0:54           ` David Sharp
  2011-07-11 15:54             ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: David Sharp @ 2011-07-08  0:54 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, linux-kernel, x86, Jiaying Zhang

On Thu, Jul 7, 2011 at 4:34 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jun 21, 2011 at 11:45:03AM -0700, Vaibhav Nagarnaik wrote:
>> From: Jiaying Zhang <jiayingz@google.com>
>>
>> The current interrupt trace of irq_handler_entry and irq_handler_exit
>> give traces of when an interrupt is handled. They provide good data
>> about when the system is running in kernel space and how it affects the
>> currently running applications.
>>
>> Apart from this, they are IRQ vectors which trigger the system into
>> kernel space. Tracing such events gives us the trace of where the system
>> is spending its time. We want to know which cores are handling
>> interrupts and how they are affecting other processes in the system.
>> Also, the trace provides information about when the cores are idle and
>> which interrupts are changing that state.
>>
>> The following patch adds the event definition and trace instrumentation
>> for interrupt vectors. A lookup table is provided to print out readable
>> IRQ vector names. Apart from the IRQ vectors handled in the generic
>> kernel code, some x86 specific IRQ vectors are also traced. The lookup
>> table can be extended by adding other arch-specific IRQ vectors.
>>
>> Changelog:
>> v2-v3
>> * Move the irq_vector_{entry|exit} tracepoints back under irq/ sub
>>   folder
>> * Trace the common IRQs in generic kernel code so that all archs can
>>   benefit.
>>
>> v1-v2
>> * Move the irq_vector_{entry|exit} tracepoints under irq_vectors/ sub
>>   folder
>>
>> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
>> ---
>>  arch/x86/kernel/apic/apic.c              |    4 ++
>>  arch/x86/kernel/cpu/mcheck/therm_throt.c |    2 +
>>  arch/x86/kernel/cpu/mcheck/threshold.c   |    2 +
>>  arch/x86/kernel/irq.c                    |    4 ++
>>  arch/x86/kernel/time.c                   |   16 ++++--
>>  arch/x86/kernel/traps.c                  |    3 +
>>  arch/x86/mm/tlb.c                        |    2 +
>>  include/trace/events/irq.h               |   84 ++++++++++++++++++++++++++++++
>>  kernel/hrtimer.c                         |    6 ++
>>  kernel/irq_work.c                        |    4 ++
>>  kernel/sched.c                           |    4 ++
>>  kernel/smp.c                             |    5 ++
>>  kernel/time/tick-broadcast.c             |   16 +++++-
>>  kernel/time/tick-common.c                |    8 +++-
>>  kernel/time/tick-sched.c                 |    4 ++
>>  15 files changed, 155 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
>> index b9338b8..159851c 100644
>> --- a/arch/x86/kernel/apic/apic.c
>> +++ b/arch/x86/kernel/apic/apic.c
>> @@ -1792,6 +1792,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
>>
>>       exit_idle();
>>       irq_enter();
>> +     trace_irq_vector_entry(TRACE_SPURIOUS_APIC_VECTOR);
>>       /*
>>        * Check if this really is a spurious interrupt and ACK it
>>        * if it is a vectored one.  Just in case...
>> @@ -1806,6 +1807,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
>>       /* see sw-dev-man vol 3, chapter 7.4.13.5 */
>>       pr_info("spurious APIC interrupt on CPU#%d, "
>>               "should never happen.\n", smp_processor_id());
>> +     trace_irq_vector_exit(TRACE_SPURIOUS_APIC_VECTOR);
>>       irq_exit();
>>  }
>>
>
> Seems a lot of these tracepoints fit under that pattern:
>
>                irq_enter();
>        +       trace_irq_entry_entry(foo)
>
>                do things
>
>        +       trace_irq_exit(foo)
>                irq_exit();
>
>
> What if we instead had trace_irq_enter() inside irq_enter() and
> trace_irq_exit() inside irq_exit() ?
>
> Then your specific irq tracepoints would just be one thing that inform us
> about the nature of the interrupt:
>
>                irq_enter();
>        +       trace_irq_vector_entry(foo);
>
>                do things
>
>                irq_exit();
>
> Those who are only interested in knowing when we have irqs can just turn
> on trace_irq_enter() and trace_irq_exit().
>
> Those who want more details about the nature of these interrupts can turn
> on trace_irq_vector() and trace_irq_exit().
>
> So you don't need the trace_irq_vector_exit anymore.
>
> How does that sound?

That roughly doubles the overhead of hitting these irqs, and since
irqs can be nested, it's not guaranteed that the "irq_vector" event
immediately follows "irq_enter", so I think the trace could be
confusing.

I also think it's far more useful to know which interrupt occurred
than "some" interrupt occurred, and I don't see a case where one would
not want that info.

Moving the tracepoints into irq_enter() and irq_exit() might be
sensible, but I'd want to pass down that info.

>
>
>> diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
>> index 00cbb27..9179dca 100644
>> --- a/arch/x86/kernel/time.c
>> +++ b/arch/x86/kernel/time.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/interrupt.h>
>>  #include <linux/time.h>
>>  #include <linux/mca.h>
>> +#include <trace/events/irq.h>
>>
>>  #include <asm/vsyscall.h>
>>  #include <asm/x86_init.h>
>> @@ -51,6 +52,13 @@ unsigned long profile_pc(struct pt_regs *regs)
>>  }
>>  EXPORT_SYMBOL(profile_pc);
>>
>> +static irqreturn_t timer_interrupt(int irq, void *dev_id);
>> +static struct irqaction irq0  = {
>> +     .handler = timer_interrupt,
>> +     .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
>> +     .name = "timer"
>> +};
>> +
>>  /*
>>   * Default timer interrupt handler for PIT/HPET
>>   */
>> @@ -59,7 +67,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
>>       /* Keep nmi watchdog up to date */
>>       inc_irq_stat(irq0_irqs);
>>
>> +     trace_irq_handler_entry(irq, &irq0);
>>       global_clock_event->event_handler(global_clock_event);
>> +     trace_irq_handler_exit(irq, &irq0, 1);
>
> Seems it should go to generic event handlers in kernel/time ?

(same response as hrtimers, below...)

>
>> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
>> index 4fc1205..d2f3f1f 100644
>> --- a/include/trace/events/irq.h
>> +++ b/include/trace/events/irq.h
>> @@ -138,6 +138,90 @@ DEFINE_EVENT(softirq, softirq_raise,
>>       TP_ARGS(vec_nr)
>>  );
>>
>> +#if !defined(_TRACE_IRQ_VECTOR_ENUM)
>> +#define _TRACE_IRQ_VECTOR_ENUM
>> +enum {
>> +     TRACE_NMI_VECTOR,
>> +     TRACE_NOHZ_TIMER_VECTOR,
>> +     TRACE_HRTIMER_VECTOR,
>> +     TRACE_ONESHOT_TIMER_VECTOR,
>> +     TRACE_PERIODIC_TIMER_BROADCAST_VECTOR,
>> +     TRACE_PERIODIC_TIMER_VECTOR,
>> +     TRACE_ERROR_APIC_VECTOR,
>> +     TRACE_RESCHEDULE_VECTOR,
>> +     TRACE_CALL_FUNCTION_VECTOR,
>> +     TRACE_CALL_FUNCTION_SINGLE_VECTOR,
>> +     TRACE_THERMAL_APIC_VECTOR,
>> +     TRACE_THRESHOLD_APIC_VECTOR,
>> +     TRACE_REBOOT_VECTOR,
>> +     TRACE_SPURIOUS_APIC_VECTOR,
>> +     TRACE_IRQ_WORK_VECTOR,
>> +     TRACE_X86_PLATFORM_IPI_VECTOR,
>> +     TRACE_INVALIDATE_TLB_VECTOR,
>> +};
>> +#endif
>
> Pure x86 things shouldn't be in a generic header.

Other platforms have IPI vector interrupts, so we can just remove
"X86_" from the name. Would that work?

>
> include/trace/event/irq.h should contain generic things. And I'm not sure
> we want some generalized vector namespace for that but rather single contained
> tracepoint names that make sense:
>
> trace_ipi_function(), trace_irq_reschedule()...

I don't see how that can fit with putting the tracepoints in
irq_enter() and irq_exit()...

Also I'm not sure it makes sense to bloat the trace events with
individual tracepoints for interrupts. Some of these events are
relatively rare in archs: RESCHEDULE_VECTOR is only in x86 and ia64,
and IPI will not be in any unicore machine, for example. They may not
warrant having their own events instead of an enum entry.

>
> But vector namespace makes sense for archs yeah.
>
>>  /* This part must be outside protection */
>> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
>> index a9205e3..84a0c70 100644
>> --- a/kernel/hrtimer.c
>> +++ b/kernel/hrtimer.c
>> @@ -48,6 +48,7 @@
>>
>>  #include <asm/uaccess.h>
>>
>> +#include <trace/events/irq.h>
>>  #include <trace/events/timer.h>
>>
>>  /*
>> @@ -1244,6 +1245,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>>       ktime_t expires_next, now, entry_time, delta;
>>       int i, retries = 0;
>>
>> +     trace_irq_vector_entry(TRACE_HRTIMER_VECTOR);
>> +
>>       BUG_ON(!cpu_base->hres_active);
>>       cpu_base->nr_events++;
>>       dev->next_event.tv64 = KTIME_MAX;
>> @@ -1316,6 +1319,7 @@ retry:
>>       if (expires_next.tv64 == KTIME_MAX ||
>>           !tick_program_event(expires_next, 0)) {
>>               cpu_base->hang_detected = 0;
>> +             trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);
>>               return;
>>       }
>>
>> @@ -1355,6 +1359,8 @@ retry:
>>       tick_program_event(expires_next, 1);
>>       printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
>>                   ktime_to_ns(delta));
>> +
>> +     trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);
>
> We already have hrtimer tracepoints.
>

Yes, but those events are about hrtimer events and buried deep in
hrtimer code. If I enable all events in the "irq" trace event
subsystem, I want to see every interrupt. I don't want to remember "oh
yeah, I need to enable x,y,z events too." Putting the event in the
"irq" subsystem and in the interrupt handler makes this crystal clear.
And if the hrtimer code is (or is ever) such that an interrupt path
doesn't reach the hrtimer tracepoints, I still want to know that the
interrupt occurred.

Here's our motivation: We use interrupt events, along with traps,
syscall, and sched_switch events to know when kernel code is running
instead of user code. So, being clear that we got all interrupts (and
which interrupt) is important.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-08  0:54           ` David Sharp
@ 2011-07-11 15:54             ` Frederic Weisbecker
  2011-07-11 18:21               ` Vaibhav Nagarnaik
  0 siblings, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-11 15:54 UTC (permalink / raw)
  To: David Sharp
  Cc: Vaibhav Nagarnaik, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, linux-kernel, x86, Jiaying Zhang

On Thu, Jul 07, 2011 at 05:54:02PM -0700, David Sharp wrote:
> On Thu, Jul 7, 2011 at 4:34 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> >> index b9338b8..159851c 100644
> >> --- a/arch/x86/kernel/apic/apic.c
> >> +++ b/arch/x86/kernel/apic/apic.c
> >> @@ -1792,6 +1792,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
> >>
> >>       exit_idle();
> >>       irq_enter();
> >> +     trace_irq_vector_entry(TRACE_SPURIOUS_APIC_VECTOR);
> >>       /*
> >>        * Check if this really is a spurious interrupt and ACK it
> >>        * if it is a vectored one.  Just in case...
> >> @@ -1806,6 +1807,7 @@ void smp_spurious_interrupt(struct pt_regs *regs)
> >>       /* see sw-dev-man vol 3, chapter 7.4.13.5 */
> >>       pr_info("spurious APIC interrupt on CPU#%d, "
> >>               "should never happen.\n", smp_processor_id());
> >> +     trace_irq_vector_exit(TRACE_SPURIOUS_APIC_VECTOR);
> >>       irq_exit();
> >>  }
> >>
> >
> > Seems a lot of these tracepoints fit under that pattern:
> >
> >                irq_enter();
> >        +       trace_irq_entry_entry(foo)
> >
> >                do things
> >
> >        +       trace_irq_exit(foo)
> >                irq_exit();
> >
> >
> > What if we instead had trace_irq_enter() inside irq_enter() and
> > trace_irq_exit() inside irq_exit() ?
> >
> > Then your specific irq tracepoints would just be one thing that inform us
> > about the nature of the interrupt:
> >
> >                irq_enter();
> >        +       trace_irq_vector_entry(foo);
> >
> >                do things
> >
> >                irq_exit();
> >
> > Those who are only interested in knowing when we have irqs can just turn
> > on trace_irq_enter() and trace_irq_exit().
> >
> > Those who want more details about the nature of these interrupts can turn
> > on trace_irq_vector() and trace_irq_exit().
> >
> > So you don't need the trace_irq_vector_exit anymore.
> >
> > How does that sound?
> 
> That roughly doubles the overhead of hitting these irqs, and since
> irqs can be nested, it's not guaranteed that the "irq_vector" event
> immediately follows "irq_enter", so I think the trace could be
> confusing.

irqs can nest if irqs are unmasked which I believe doesn't happen until
we reach the handler or we switch to softirq.

> 
> I also think it's far more useful to know which interrupt occurred
> than "some" interrupt occurred, and I don't see a case where one would
> not want that info.
> 
> Moving the tracepoints into irq_enter() and irq_exit() might be
> sensible, but I'd want to pass down that info.

Let me clarify what I proposed, we have that:

	irq_enter() // trace_irq_enter();

	trace_irq_vector_entry(foo);

	// handle irq...

	irq_exit() // trace_irq_exit()

So either you're only interested in tracing raw irqs and you don't care
about the nature of these and you only enable trace_irq_enter() and
trace_irq_exit(). Or you're interested in the nature of these and you
only enable trace_irq_vector_entry() and trace_irq_exit().

In fact we could even drop the trace_irq_enter() thing because I'm
not sure it's interesting.

All in one it doesn't bring more overhead.

What I'm more worried about actually is that in the case of trace_irq_handler_entry()
and trace_irq_handler_exit(), it becomes a useless tracepoint. And if you enable it
for other irq tracepoint symetric purposes, you'll have it there too and that's a bit
odd.

So perhaps we want to keep your way of doing this.

> >
> >
> >> diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
> >> index 00cbb27..9179dca 100644
> >> --- a/arch/x86/kernel/time.c
> >> +++ b/arch/x86/kernel/time.c
> >> @@ -13,6 +13,7 @@
> >>  #include <linux/interrupt.h>
> >>  #include <linux/time.h>
> >>  #include <linux/mca.h>
> >> +#include <trace/events/irq.h>
> >>
> >>  #include <asm/vsyscall.h>
> >>  #include <asm/x86_init.h>
> >> @@ -51,6 +52,13 @@ unsigned long profile_pc(struct pt_regs *regs)
> >>  }
> >>  EXPORT_SYMBOL(profile_pc);
> >>
> >> +static irqreturn_t timer_interrupt(int irq, void *dev_id);
> >> +static struct irqaction irq0  = {
> >> +     .handler = timer_interrupt,
> >> +     .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL | IRQF_TIMER,
> >> +     .name = "timer"
> >> +};
> >> +
> >>  /*
> >>   * Default timer interrupt handler for PIT/HPET
> >>   */
> >> @@ -59,7 +67,9 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
> >>       /* Keep nmi watchdog up to date */
> >>       inc_irq_stat(irq0_irqs);
> >>
> >> +     trace_irq_handler_entry(irq, &irq0);
> >>       global_clock_event->event_handler(global_clock_event);
> >> +     trace_irq_handler_exit(irq, &irq0, 1);
> >
> > Seems it should go to generic event handlers in kernel/time ?
> 
> (same response as hrtimers, below...)
> 
> >
> >> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> >> index 4fc1205..d2f3f1f 100644
> >> --- a/include/trace/events/irq.h
> >> +++ b/include/trace/events/irq.h
> >> @@ -138,6 +138,90 @@ DEFINE_EVENT(softirq, softirq_raise,
> >>       TP_ARGS(vec_nr)
> >>  );
> >>
> >> +#if !defined(_TRACE_IRQ_VECTOR_ENUM)
> >> +#define _TRACE_IRQ_VECTOR_ENUM
> >> +enum {
> >> +     TRACE_NMI_VECTOR,
> >> +     TRACE_NOHZ_TIMER_VECTOR,
> >> +     TRACE_HRTIMER_VECTOR,
> >> +     TRACE_ONESHOT_TIMER_VECTOR,
> >> +     TRACE_PERIODIC_TIMER_BROADCAST_VECTOR,
> >> +     TRACE_PERIODIC_TIMER_VECTOR,
> >> +     TRACE_ERROR_APIC_VECTOR,
> >> +     TRACE_RESCHEDULE_VECTOR,
> >> +     TRACE_CALL_FUNCTION_VECTOR,
> >> +     TRACE_CALL_FUNCTION_SINGLE_VECTOR,
> >> +     TRACE_THERMAL_APIC_VECTOR,
> >> +     TRACE_THRESHOLD_APIC_VECTOR,
> >> +     TRACE_REBOOT_VECTOR,
> >> +     TRACE_SPURIOUS_APIC_VECTOR,
> >> +     TRACE_IRQ_WORK_VECTOR,
> >> +     TRACE_X86_PLATFORM_IPI_VECTOR,
> >> +     TRACE_INVALIDATE_TLB_VECTOR,
> >> +};
> >> +#endif
> >
> > Pure x86 things shouldn't be in a generic header.
> 
> Other platforms have IPI vector interrupts, so we can just remove
> "X86_" from the name. Would that work?

If that's not a generic thing, we probably don't want that here.

I'd rather propose you to include an arch file from include/trace/events/irq.h
and define in that arch file specific things.

> > include/trace/event/irq.h should contain generic things. And I'm not sure
> > we want some generalized vector namespace for that but rather single contained
> > tracepoint names that make sense:
> >
> > trace_ipi_function(), trace_irq_reschedule()...
> 
> I don't see how that can fit with putting the tracepoints in
> irq_enter() and irq_exit()...

With my above explanations I think it's clearer.

> Also I'm not sure it makes sense to bloat the trace events with
> individual tracepoints for interrupts. Some of these events are
> relatively rare in archs: RESCHEDULE_VECTOR is only in x86 and ia64,
> and IPI will not be in any unicore machine, for example. They may not
> warrant having their own events instead of an enum entry.

10 tracepoints is no bloating really. And the bloating can only be seen in
slow path (tracing enablement) and in image size. And yeah on boot too
probably but perhaps we can defer that one day.
The point is we should split things when that makes sense. Having an
all-in-one irq tracepoint is very uncomfortable.

What if someone is interested in tracing only resched ipis (and I do
sometimes, using a trace_printk() currently)? He would be
forced to use those vector_entry/vector_exit with a filter. Filters
should not be used to cope with a lack of tracepoint granularity. They
are not really efficient for that.

Also about reschedule interrupt, it's not so rare:

$ git grep scheduler_ipi
arch/alpha/kernel/smp.c:                        scheduler_ipi();
arch/arm/kernel/smp.c:          scheduler_ipi();
arch/blackfin/mach-common/smp.c:                        scheduler_ipi();
arch/cris/arch-v32/kernel/smp.c:                scheduler_ipi();
arch/ia64/kernel/irq_ia64.c:                    scheduler_ipi();
arch/ia64/xen/irq_xen.c:        scheduler_ipi();
arch/m32r/kernel/smp.c: scheduler_ipi();
arch/mips/cavium-octeon/smp.c:          scheduler_ipi();
arch/mips/kernel/smtc.c:        scheduler_ipi();
arch/mips/mti-malta/malta-int.c:        scheduler_ipi();
arch/mips/pmc-sierra/yosemite/smp.c:                    scheduler_ipi();
arch/mips/pmc-sierra/yosemite/smp.c:                    scheduler_ipi();
arch/mips/sgi-ip27/ip27-irq.c:          scheduler_ipi();
arch/mips/sgi-ip27/ip27-irq.c:          scheduler_ipi();
arch/mips/sibyte/bcm1480/smp.c:         scheduler_ipi();
arch/mips/sibyte/sb1250/smp.c:          scheduler_ipi();
arch/mn10300/kernel/smp.c:      scheduler_ipi();
arch/parisc/kernel/smp.c:                               scheduler_ipi();
arch/powerpc/kernel/smp.c:      scheduler_ipi();
arch/powerpc/kernel/smp.c:                      scheduler_ipi();
arch/s390/kernel/smp.c:         scheduler_ipi();
arch/sh/kernel/smp.c:           scheduler_ipi();
arch/sparc/kernel/smp_32.c:     scheduler_ipi();
arch/sparc/kernel/smp_64.c:     scheduler_ipi();
arch/tile/kernel/smp.c: scheduler_ipi();
arch/um/kernel/smp.c:                   scheduler_ipi();
arch/x86/kernel/smp.c:  scheduler_ipi();
arch/x86/xen/smp.c:     scheduler_ipi();

Just put a tracepoint in scheduler_ipi() and you are done.

> >
> > But vector namespace makes sense for archs yeah.
> >
> >>  /* This part must be outside protection */
> >> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> >> index a9205e3..84a0c70 100644
> >> --- a/kernel/hrtimer.c
> >> +++ b/kernel/hrtimer.c
> >> @@ -48,6 +48,7 @@
> >>
> >>  #include <asm/uaccess.h>
> >>
> >> +#include <trace/events/irq.h>
> >>  #include <trace/events/timer.h>
> >>
> >>  /*
> >> @@ -1244,6 +1245,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> >>       ktime_t expires_next, now, entry_time, delta;
> >>       int i, retries = 0;
> >>
> >> +     trace_irq_vector_entry(TRACE_HRTIMER_VECTOR);
> >> +
> >>       BUG_ON(!cpu_base->hres_active);
> >>       cpu_base->nr_events++;
> >>       dev->next_event.tv64 = KTIME_MAX;
> >> @@ -1316,6 +1319,7 @@ retry:
> >>       if (expires_next.tv64 == KTIME_MAX ||
> >>           !tick_program_event(expires_next, 0)) {
> >>               cpu_base->hang_detected = 0;
> >> +             trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);
> >>               return;
> >>       }
> >>
> >> @@ -1355,6 +1359,8 @@ retry:
> >>       tick_program_event(expires_next, 1);
> >>       printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
> >>                   ktime_to_ns(delta));
> >> +
> >> +     trace_irq_vector_exit(TRACE_HRTIMER_VECTOR);
> >
> > We already have hrtimer tracepoints.
> >
> 
> Yes, but those events are about hrtimer events and buried deep in
> hrtimer code. If I enable all events in the "irq" trace event
> subsystem, I want to see every interrupt. I don't want to remember "oh
> yeah, I need to enable x,y,z events too." Putting the event in the
> "irq" subsystem and in the interrupt handler makes this crystal clear.
> And if the hrtimer code is (or is ever) such that an interrupt path
> doesn't reach the hrtimer tracepoints, I still want to know that the
> interrupt occurred.
> 
> Here's our motivation: We use interrupt events, along with traps,
> syscall, and sched_switch events to know when kernel code is running
> instead of user code. So, being clear that we got all interrupts (and
> which interrupt) is important.

Ok I see your point then.

But then have a generic single trace_irq_timer() (may be cut down in entry/exit if
you want).

But I don't think we really want to  into HRTIMER, PERIODIC, ONESHOT, BROADCAST
vectors. That's sounds like the wrong place to provide these details.

If we want higher level details, then we can enable hrtimer tracepoints, and/or we add some
tracepoints in the timer subsystem to know with what we are dealing with.

Does that make sense?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-11 15:54             ` Frederic Weisbecker
@ 2011-07-11 18:21               ` Vaibhav Nagarnaik
  2011-07-12 18:09                 ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-07-11 18:21 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Sharp, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, linux-kernel, x86, Jiaying Zhang

On Mon, Jul 11, 2011 at 8:54 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Let me clarify what I proposed, we have that:
>
>        irq_enter() // trace_irq_enter();
>
>        trace_irq_vector_entry(foo);
>
>        // handle irq...
>
>        irq_exit() // trace_irq_exit()
>
> So either you're only interested in tracing raw irqs and you don't care
> about the nature of these and you only enable trace_irq_enter() and
> trace_irq_exit(). Or you're interested in the nature of these and you
> only enable trace_irq_vector_entry() and trace_irq_exit().
>
> In fact we could even drop the trace_irq_enter() thing because I'm
> not sure it's interesting.
>
> All in one it doesn't bring more overhead.
>
> What I'm more worried about actually is that in the case of trace_irq_handler_entry()
> and trace_irq_handler_exit(), it becomes a useless tracepoint. And if you enable it
> for other irq tracepoint symetric purposes, you'll have it there too and that's a bit
> odd.
>
> So perhaps we want to keep your way of doing this.
>

We proposed this patch as a way of tracing raw interrupts from the
platform, which is why it traces all the raw interrupts and does not use
any specific interrupt data in the events. So when someone wants to know
just the interrupts that are being triggered in the system, this
tracepoint can be used.

There can be more involved and detailed tracepoints for individual IRQs,
as you proposed to have trace_ipi_function(), etc.

> On Thu, Jul 07, 2011 at 05:54:02PM -0700, David Sharp wrote:
>> Here's our motivation: We use interrupt events, along with traps,
>> syscall, and sched_switch events to know when kernel code is running
>> instead of user code. So, being clear that we got all interrupts (and
>> which interrupt) is important.
>
> Ok I see your point then.
>
> But then have a generic single trace_irq_timer() (may be cut down in entry/exit if
> you want).
>
> But I don't think we really want to  into HRTIMER, PERIODIC, ONESHOT, BROADCAST
> vectors. That's sounds like the wrong place to provide these details.
>
> If we want higher level details, then we can enable hrtimer tracepoints, and/or we add some
> tracepoints in the timer subsystem to know with what we are dealing with.
>
> Does that make sense?
>

I added the details about the timer handler because it would be
difficult to understand exactly which generic event handlers the
platform LOCAL_TIMER_IRQ_VECTOR led to.

Once again, this is meant as a raw interrupt tracepoint which provides 2
things: an interrupt occurred and the nature of interrupt. As you
mentioned there are high level hrtimer tracepoints and there can be
similar highly detailed tracepoints for each IRQ as required with more
data about the interrupt, but I believe this is the wrong patch to put
them in.


Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-11 18:21               ` Vaibhav Nagarnaik
@ 2011-07-12 18:09                 ` Frederic Weisbecker
  2011-07-12 22:08                   ` Vaibhav Nagarnaik
  0 siblings, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-12 18:09 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: David Sharp, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, linux-kernel, x86, Jiaying Zhang

On Mon, Jul 11, 2011 at 11:21:04AM -0700, Vaibhav Nagarnaik wrote:
> On Mon, Jul 11, 2011 at 8:54 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > Let me clarify what I proposed, we have that:
> >
> >        irq_enter() // trace_irq_enter();
> >
> >        trace_irq_vector_entry(foo);
> >
> >        // handle irq...
> >
> >        irq_exit() // trace_irq_exit()
> >
> > So either you're only interested in tracing raw irqs and you don't care
> > about the nature of these and you only enable trace_irq_enter() and
> > trace_irq_exit(). Or you're interested in the nature of these and you
> > only enable trace_irq_vector_entry() and trace_irq_exit().
> >
> > In fact we could even drop the trace_irq_enter() thing because I'm
> > not sure it's interesting.
> >
> > All in one it doesn't bring more overhead.
> >
> > What I'm more worried about actually is that in the case of trace_irq_handler_entry()
> > and trace_irq_handler_exit(), it becomes a useless tracepoint. And if you enable it
> > for other irq tracepoint symetric purposes, you'll have it there too and that's a bit
> > odd.
> >
> > So perhaps we want to keep your way of doing this.
> >
> 
> We proposed this patch as a way of tracing raw interrupts from the
> platform, which is why it traces all the raw interrupts and does not use
> any specific interrupt data in the events. So when someone wants to know
> just the interrupts that are being triggered in the system, this
> tracepoint can be used.

Which one? Not sure I understand what you mean.

> There can be more involved and detailed tracepoints for individual IRQs,
> as you proposed to have trace_ipi_function(), etc.

I was arguing about having a tracepoint in irq_exit(), so I guess we are not talking about the same
things.
 
> > On Thu, Jul 07, 2011 at 05:54:02PM -0700, David Sharp wrote:
> >> Here's our motivation: We use interrupt events, along with traps,
> >> syscall, and sched_switch events to know when kernel code is running
> >> instead of user code. So, being clear that we got all interrupts (and
> >> which interrupt) is important.
> >
> > Ok I see your point then.
> >
> > But then have a generic single trace_irq_timer() (may be cut down in entry/exit if
> > you want).
> >
> > But I don't think we really want to  into HRTIMER, PERIODIC, ONESHOT, BROADCAST
> > vectors. That's sounds like the wrong place to provide these details.
> >
> > If we want higher level details, then we can enable hrtimer tracepoints, and/or we add some
> > tracepoints in the timer subsystem to know with what we are dealing with.
> >
> > Does that make sense?
> >
> 
> I added the details about the timer handler because it would be
> difficult to understand exactly which generic event handlers the
> platform LOCAL_TIMER_IRQ_VECTOR led to.

Generic event handler is already a high level detail. At the point we call
that generic event handler, we have climbed several steps, we have left the
world of pure irq vectors.

What is normally suitable here to get more semantic details is to use higher level
tracepoints like high resolution timer ones.

If these are not enough for you, you can add a trace_hrtimer_interrupt()
in the irq tracepoint subsystem. If it's in the irq subsystem you include
it with the rest of irq traces.

> Once again, this is meant as a raw interrupt tracepoint which provides 2
> things: an interrupt occurred and the nature of interrupt. As you
> mentioned there are high level hrtimer tracepoints and there can be
> similar highly detailed tracepoints for each IRQ as required with more
> data about the interrupt, but I believe this is the wrong patch to put
> them in.

But the problem is you are actually messing up irq raw events and higher
level details.

That vector space should lay in the arch low ground and never go further.
Finding one of these trace vector things elsewhere than arch/ shouldn't
ever happen.

As far as possible, tracepoints must be self-contained events and be
self-explanatory. We want granularity there.

There should never be a reschedule vector to trace. That event is self
contained and wants its own tracepoints. irq/reschedule_interrupt is
something any kernel developer can understand and give some sense out
by browsing the events. But irq/interrupt_vector is opaque as hell.

What happens if I want to trace only reschedule interrupts? I'd be
forced to enable that vector, browse its format, find the value
for the reschedule vector, apply a filter.
That requires a lot of time and skills compared to just enable a tracepoint.

Your patch mixes up too much things:

- pure arch tracepoints. Only those want that vector naming I think.

- IPIs that should go to kernel/smp.c

- reschedule interrupt

- irq work

- ...

If you cut the things into more self-contained topics, this is going to be
much easier to review and thus much faster to apply and push.

Also your goal is to not miss any interrupts. So if everything goes to the irq
tracepoint subsystem, you just need to enable everything there and you are
done. You don't need a big vector tracepoint to ensure you won't miss something.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-12 18:09                 ` Frederic Weisbecker
@ 2011-07-12 22:08                   ` Vaibhav Nagarnaik
  2011-07-13 14:11                     ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-07-12 22:08 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Sharp, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, linux-kernel, x86, Jiaying Zhang

On Tue, Jul 12, 2011 at 11:09 AM, Frederic Weisbecker
<fweisbec@gmail.com> wrote:
> On Mon, Jul 11, 2011 at 11:21:04AM -0700, Vaibhav Nagarnaik wrote:
>> On Mon, Jul 11, 2011 at 8:54 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>> > Let me clarify what I proposed, we have that:
>> >
>> >        irq_enter() // trace_irq_enter();
>> >
>> >        trace_irq_vector_entry(foo);
>> >
>> >        // handle irq...
>> >
>> >        irq_exit() // trace_irq_exit()
>> >
>> > So either you're only interested in tracing raw irqs and you don't care
>> > about the nature of these and you only enable trace_irq_enter() and
>> > trace_irq_exit(). Or you're interested in the nature of these and you
>> > only enable trace_irq_vector_entry() and trace_irq_exit().
>> >
>> > In fact we could even drop the trace_irq_enter() thing because I'm
>> > not sure it's interesting.
>> >
>> > All in one it doesn't bring more overhead.
>> >
>> > What I'm more worried about actually is that in the case of trace_irq_handler_entry()
>> > and trace_irq_handler_exit(), it becomes a useless tracepoint. And if you enable it
>> > for other irq tracepoint symetric purposes, you'll have it there too and that's a bit
>> > odd.
>> >
>> > So perhaps we want to keep your way of doing this.
>> >
>>
>> We proposed this patch as a way of tracing raw interrupts from the
>> platform, which is why it traces all the raw interrupts and does not use
>> any specific interrupt data in the events. So when someone wants to know
>> just the interrupts that are being triggered in the system, this
>> tracepoint can be used.
>
> Which one? Not sure I understand what you mean.
>
>> There can be more involved and detailed tracepoints for individual IRQs,
>> as you proposed to have trace_ipi_function(), etc.
>
> I was arguing about having a tracepoint in irq_exit(), so I guess we are not talking about the same
> things.
>
>> > On Thu, Jul 07, 2011 at 05:54:02PM -0700, David Sharp wrote:
>> >> Here's our motivation: We use interrupt events, along with traps,
>> >> syscall, and sched_switch events to know when kernel code is running
>> >> instead of user code. So, being clear that we got all interrupts (and
>> >> which interrupt) is important.
>> >
>> > Ok I see your point then.
>> >
>> > But then have a generic single trace_irq_timer() (may be cut down in entry/exit if
>> > you want).
>> >
>> > But I don't think we really want to  into HRTIMER, PERIODIC, ONESHOT, BROADCAST
>> > vectors. That's sounds like the wrong place to provide these details.
>> >
>> > If we want higher level details, then we can enable hrtimer tracepoints, and/or we add some
>> > tracepoints in the timer subsystem to know with what we are dealing with.
>> >
>> > Does that make sense?
>> >
>>
>> I added the details about the timer handler because it would be
>> difficult to understand exactly which generic event handlers the
>> platform LOCAL_TIMER_IRQ_VECTOR led to.
>
> Generic event handler is already a high level detail. At the point we call
> that generic event handler, we have climbed several steps, we have left the
> world of pure irq vectors.
>
> What is normally suitable here to get more semantic details is to use higher level
> tracepoints like high resolution timer ones.
>
> If these are not enough for you, you can add a trace_hrtimer_interrupt()
> in the irq tracepoint subsystem. If it's in the irq subsystem you include
> it with the rest of irq traces.
>
>> Once again, this is meant as a raw interrupt tracepoint which provides 2
>> things: an interrupt occurred and the nature of interrupt. As you
>> mentioned there are high level hrtimer tracepoints and there can be
>> similar highly detailed tracepoints for each IRQ as required with more
>> data about the interrupt, but I believe this is the wrong patch to put
>> them in.
>
> But the problem is you are actually messing up irq raw events and higher
> level details.
>
> That vector space should lay in the arch low ground and never go further.
> Finding one of these trace vector things elsewhere than arch/ shouldn't
> ever happen.
>
> As far as possible, tracepoints must be self-contained events and be
> self-explanatory. We want granularity there.
>
> There should never be a reschedule vector to trace. That event is self
> contained and wants its own tracepoints. irq/reschedule_interrupt is
> something any kernel developer can understand and give some sense out
> by browsing the events. But irq/interrupt_vector is opaque as hell.
>
> What happens if I want to trace only reschedule interrupts? I'd be
> forced to enable that vector, browse its format, find the value
> for the reschedule vector, apply a filter.
> That requires a lot of time and skills compared to just enable a tracepoint.
>
> Your patch mixes up too much things:
>
> - pure arch tracepoints. Only those want that vector naming I think.
>
> - IPIs that should go to kernel/smp.c
>
> - reschedule interrupt
>
> - irq work
>
> - ...
>
> If you cut the things into more self-contained topics, this is going to be
> much easier to review and thus much faster to apply and push.
>
> Also your goal is to not miss any interrupts. So if everything goes to the irq
> tracepoint subsystem, you just need to enable everything there and you are
> done. You don't need a big vector tracepoint to ensure you won't miss something.
>

Breaking this patch up in different small ones makes sense. Can you
comment on this proposal for the following trace events?

For tracepoints in generic IRQ handlers:
1. trace_timer_vector - takes an enum for BROADCAST, HRTIMER, ONESHOT,
   PERIODIC and NOHZ.
2. trace_irq_work_vector - for IRQ_WORK_VECTOR
3. trace_reschedule_vector - for RESCHEDULE_IPI vector
4. trace_call_function_vector - takes an enum for CALL_FUNCTION and
   CALL_FUNCTION_SINGLE

Another trace event for arch-specific IRQ vectors which don't have
generic event handlers:
5. trace_platform_irq_vector - takes an enum, which is defined in
   asm/irq.h for each platform. This is traced in arch-specific files
   only.

How does this sound?



Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-12 22:08                   ` Vaibhav Nagarnaik
@ 2011-07-13 14:11                     ` Frederic Weisbecker
  2011-07-13 18:18                       ` Vaibhav Nagarnaik
  0 siblings, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2011-07-13 14:11 UTC (permalink / raw)
  To: Vaibhav Nagarnaik
  Cc: David Sharp, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, linux-kernel, x86, Jiaying Zhang

On Tue, Jul 12, 2011 at 03:08:48PM -0700, Vaibhav Nagarnaik wrote:
> Breaking this patch up in different small ones makes sense. Can you
> comment on this proposal for the following trace events?
> 
> For tracepoints in generic IRQ handlers:
> 1. trace_timer_vector - takes an enum for BROADCAST, HRTIMER, ONESHOT,
>    PERIODIC and NOHZ.

Have you read my previous email entirely? 

Where I explained we shouldn't use that vector naming outside of low
level archs, and that those enums break tracing granularity?

> 2. trace_irq_work_vector - for IRQ_WORK_VECTOR

trace_irq_work_run() should be enough. And people can add
trace_irq_work_queue() if they need to one day.


> 3. trace_reschedule_vector - for RESCHEDULE_IPI vector

trace_reschedule_interrupt()


> 4. trace_call_function_vector - takes an enum for CALL_FUNCTION and
>    CALL_FUNCTION_SINGLE

Two seperate tracepoints should be better.


> Another trace event for arch-specific IRQ vectors which don't have
> generic event handlers:
> 5. trace_platform_irq_vector - takes an enum, which is defined in
>    asm/irq.h for each platform. This is traced in arch-specific files
>    only.

Yep. For this perhaps an enum can make sense.

Thanks.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3] trace: Add x86 irq vector entry/exit tracepoints
  2011-07-13 14:11                     ` Frederic Weisbecker
@ 2011-07-13 18:18                       ` Vaibhav Nagarnaik
  0 siblings, 0 replies; 35+ messages in thread
From: Vaibhav Nagarnaik @ 2011-07-13 18:18 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: David Sharp, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Michael Rubin, linux-kernel, x86, Jiaying Zhang

On Wed, Jul 13, 2011 at 7:11 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Tue, Jul 12, 2011 at 03:08:48PM -0700, Vaibhav Nagarnaik wrote:
>> Breaking this patch up in different small ones makes sense. Can you
>> comment on this proposal for the following trace events?
>>
>> For tracepoints in generic IRQ handlers:
>> 1. trace_timer_vector - takes an enum for BROADCAST, HRTIMER, ONESHOT,
>>    PERIODIC and NOHZ.
>
> Have you read my previous email entirely?
>
> Where I explained we shouldn't use that vector naming outside of low
> level archs, and that those enums break tracing granularity?
>

I was arguing for an enum for this since it is a single contained trace
event for all timers. But you are right, since I'm breaking up the
interrupts in different trace events already, timer interrupts should
also have separate trace events for granularity.

>> 2. trace_irq_work_vector - for IRQ_WORK_VECTOR
>
> trace_irq_work_run() should be enough. And people can add
> trace_irq_work_queue() if they need to one day.
>
>
>> 3. trace_reschedule_vector - for RESCHEDULE_IPI vector
>
> trace_reschedule_interrupt()
>
>
>> 4. trace_call_function_vector - takes an enum for CALL_FUNCTION and
>>    CALL_FUNCTION_SINGLE
>
> Two seperate tracepoints should be better.
>
>
>> Another trace event for arch-specific IRQ vectors which don't have
>> generic event handlers:
>> 5. trace_platform_irq_vector - takes an enum, which is defined in
>>    asm/irq.h for each platform. This is traced in arch-specific files
>>    only.
>
> Yep. For this perhaps an enum can make sense.
>
> Thanks.
>

I will create and post the patch series soon.


Thanks

Vaibhav Nagarnaik

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2011-07-13 18:18 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-22 22:56 [PATCH] trace: Add special x86 irq entry/exit tracepoints Vaibhav Nagarnaik
2011-04-25 23:41 ` Vaibhav Nagarnaik
2011-04-28 23:16   ` Vaibhav Nagarnaik
2011-04-28 23:41   ` Steven Rostedt
2011-04-29 20:12   ` [PATCH] trace: Add x86 irq vector " Vaibhav Nagarnaik
2011-04-29 20:26     ` Thomas Gleixner
2011-04-29 22:04       ` Vaibhav Nagarnaik
2011-05-31 21:28     ` [PATCH v2] " Vaibhav Nagarnaik
2011-06-01  0:00       ` Frederic Weisbecker
2011-06-01 22:38         ` Vaibhav Nagarnaik
2011-06-01 23:30           ` David Sharp
2011-06-16  3:02             ` Frederic Weisbecker
2011-06-21 18:43               ` Vaibhav Nagarnaik
2011-07-06 23:43               ` H. Peter Anvin
2011-07-06 23:56                 ` Frederic Weisbecker
2011-07-07  0:02                   ` H. Peter Anvin
2011-07-07  0:25                     ` Frederic Weisbecker
2011-07-07  0:30                       ` H. Peter Anvin
2011-07-07  0:51                         ` Frederic Weisbecker
2011-07-07  9:57                         ` Ingo Molnar
2011-07-07 22:50                           ` David Sharp
2011-07-07 23:00                             ` Frederic Weisbecker
2011-06-21 18:45       ` [PATCH v3] " Vaibhav Nagarnaik
2011-07-06 21:50         ` Vaibhav Nagarnaik
2011-07-06 23:38           ` Andi Kleen
2011-07-07 23:34         ` Frederic Weisbecker
2011-07-08  0:54           ` David Sharp
2011-07-11 15:54             ` Frederic Weisbecker
2011-07-11 18:21               ` Vaibhav Nagarnaik
2011-07-12 18:09                 ` Frederic Weisbecker
2011-07-12 22:08                   ` Vaibhav Nagarnaik
2011-07-13 14:11                     ` Frederic Weisbecker
2011-07-13 18:18                       ` Vaibhav Nagarnaik
2011-04-29  0:14 ` [PATCH] trace: Add special x86 irq " Thomas Gleixner
2011-04-29 20:15   ` Vaibhav Nagarnaik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).