All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/5] netdev: show a process of packets
@ 2010-06-24  8:09 Koki Sanagi
  2010-06-24  8:16 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-24  8:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku

CHANGE-LOG since v1:
    1) fix to enable "perf trace record netdev-times" run with a script.
    2) add some options to "perf trace report netdev-times". A detail is in
       PATCH 5/5.

These patch-set adds tracepoints to show us a process of packets.
Using these tracepoints and existing points, we can get the time when
packet passes through some points in transmit or receive sequence.
For example, this is an output of perf script which is attached by patch 5/5.

79074.756672832sec cpu=1
irq_entry(+0.000000msec,irq=77:eth3)
         |------------softirq_raise(+0.001277msec)
irq_exit (+0.002278msec)     |
                             |
                      softirq_entry(+0.003562msec)
                             |
                             |---netif_receive_skb(+0.006279msec,len=100)
                             |            |
                             |   skb_copy_datagram_iovec(+0.038778msec, 2285:sshd)
                             |
                      napi_poll_exit(+0.017160msec, eth3)
                             |
                      softirq_exit(+0.018248msec)

The above is a receive side. Like this, it can show receive sequence from
interrupt(irq_entry) to application(skb_copy_datagram_iovec). There are 8
tracepoints in this side. All events except for skb_copy_datagram_iovec can be
associated with each other by CPU number. skb_copy_datagram_iovec can be
associated with netif_receive_skb by skbaddr.
This script shows one NET_RX softirq and events related to it. All relative
time bases on first irq_entry which raise NET_RX softirq.

   dev    len      dev_queue_xmit|----------|dev_hard_start_xmit|-----|free_skb
                         |             |                           |
   eth3   114  79044.417123332sec     0.005242msec          0.103843msec
   eth3   114  79044.580090422sec     0.002306msec          0.103632msec
   eth3   114  79044.719078251sec     0.002288msec          0.104093msec

The above is a transmit side. There are three tracepoints in this side.
Point1 is before putting a packet to Qdisc. point2 is after ndo_start_xmit in
dev_hard_start_xmit. It indicates finishing putting a packet to driver.
point3 is in consume_skb and dev_kfree_skb_irq. It indicates freeing a
transmitted packet.
Values of this script are, from left, device name, length of a packet, a time of
point1, an interval time between point1 and point2 and an interval time between
point2 and point3.

These times are useful to analyze a performance or to detect a point where
packet delays. For example,
- NET_RX softirq calling is late.
- Application is late to take a packet.
- It takes much time to put a transmitting packet to driver
  (It may be caused by packed queue)

And also, these tracepoint help us to investigate a network driver's trouble
from memory dump because ftrace records it to memory. And ftrace is so light
even if always trace on. So, in a case investigating a problem which doesn't
reproduce, it is useful.

Thanks,
Koki Sanagi.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
@ 2010-06-24  8:16 ` Koki Sanagi
  2010-06-24 19:15   ` Steven Rostedt
  2010-06-24  8:17 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Koki Sanagi @ 2010-06-24  8:16 UTC (permalink / raw)
  To: netdev; +Cc: davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku, linux-kernel

This patch adds a tracepoint to raising of softirq.
This is useful if you want to detect which hard interrupt raise softirq
and lets you know a time between raising softirq and performing softirq.
Combinating with other tracepoint, it lets us know a process of packets
(See patch 0/5).

          <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
          <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
          <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]

This is a same patch Lai Jiangshan submitted.
http://marc.info/?l=linux-kernel&m=126026122728732&w=2

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 include/linux/interrupt.h  |    8 +++++++-
 include/trace/events/irq.h |   34 +++++++++++++++++++++++++++++++---
 2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index c233113..1cb5726 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -18,6 +18,7 @@
 #include <asm/atomic.h>
 #include <asm/ptrace.h>
 #include <asm/system.h>
+#include <trace/events/irq.h>
 
 /*
  * These correspond to the IORESOURCE_IRQ_* defines in
@@ -402,7 +403,12 @@ asmlinkage void do_softirq(void);
 asmlinkage void __do_softirq(void);
 extern void open_softirq(int nr, void (*action)(struct softirq_action *));
 extern void softirq_init(void);
-#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
+static inline void __raise_softirq_irqoff(unsigned int nr)
+{
+	trace_softirq_raise(nr);
+	or_softirq_pending(1UL << nr);
+}
+
 extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
 extern void wakeup_softirqd(void);
diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
index 0e4cfb6..7cb7435 100644
--- a/include/trace/events/irq.h
+++ b/include/trace/events/irq.h
@@ -5,7 +5,9 @@
 #define _TRACE_IRQ_H
 
 #include <linux/tracepoint.h>
-#include <linux/interrupt.h>
+
+struct irqaction;
+struct softirq_action;
 
 #define softirq_name(sirq) { sirq##_SOFTIRQ, #sirq }
 #define show_softirq_name(val)				\
@@ -82,6 +84,32 @@ TRACE_EVENT(irq_handler_exit,
 		  __entry->irq, __entry->ret ? "handled" : "unhandled")
 );
 
+/**
+ * softirq_raise - called immediately when a softirq is raised
+ * @nr: softirq vector number
+ *
+ * Tracepoint for tracing when softirq action is raised.
+ * Also, when used in combination with the softirq_entry tracepoint
+ * we can determine the softirq raise latency.
+ */
+TRACE_EVENT(softirq_raise,
+
+	TP_PROTO(unsigned int nr),
+
+	TP_ARGS(nr),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	vec	)
+	),
+
+	TP_fast_assign(
+		__entry->vec	= nr;
+	),
+
+	TP_printk("vec=%d [action=%s]", __entry->vec,
+		show_softirq_name(__entry->vec))
+);
+
 DECLARE_EVENT_CLASS(softirq,
 
 	TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
@@ -89,11 +117,11 @@ DECLARE_EVENT_CLASS(softirq,
 	TP_ARGS(h, vec),
 
 	TP_STRUCT__entry(
-		__field(	int,	vec			)
+		__field(	unsigned int,	vec	)
 	),
 
 	TP_fast_assign(
-		__entry->vec = (int)(h - vec);
+		__entry->vec = (unsigned int)(h - vec);
 	),
 
 	TP_printk("vec=%d [action=%s]", __entry->vec,



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
  2010-06-24  8:16 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
@ 2010-06-24  8:17 ` Koki Sanagi
  2010-06-24  8:18 ` [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer Koki Sanagi
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-24  8:17 UTC (permalink / raw)
  To: netdev; +Cc: davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku

This patch converts trace_napi_poll from DECLARE_EVENT to TRACE_EVENT.
This lets you know how long napi_poll takes.

          <idle>-0     [001] 241302.750777: napi_poll: napi poll on napi struct f6acc480 for device eth3
          <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d70c for device eth1
          <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d20c for device eth1

This is a same patch Neil Horman submitted.
http://marc.info/?l=linux-kernel&m=125978157926853&w=2

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 include/trace/events/napi.h |   23 +++++++++++++++++++++--
 1 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/napi.h b/include/trace/events/napi.h
index 188deca..512a057 100644
--- a/include/trace/events/napi.h
+++ b/include/trace/events/napi.h
@@ -6,10 +6,29 @@
 
 #include <linux/netdevice.h>
 #include <linux/tracepoint.h>
+#include <linux/ftrace.h>
+
+#define NO_DEV "(no_device)"
+
+TRACE_EVENT(napi_poll,
 
-DECLARE_TRACE(napi_poll,
 	TP_PROTO(struct napi_struct *napi),
-	TP_ARGS(napi));
+
+	TP_ARGS(napi),
+
+	TP_STRUCT__entry(
+		__field(	struct napi_struct *,	napi)
+		__string(	dev_name, napi->dev ? napi->dev->name : NO_DEV)
+	),
+
+	TP_fast_assign(
+		__entry->napi = napi;
+		__assign_str(dev_name, napi->dev ? napi->dev->name : NO_DEV);
+	),
+
+	TP_printk("napi poll on napi struct %p for device %s",
+		__entry->napi, __get_str(dev_name))
+);
 
 #endif /* _TRACE_NAPI_H_ */
 


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
  2010-06-24  8:16 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
  2010-06-24  8:17 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
@ 2010-06-24  8:18 ` Koki Sanagi
  2010-06-24  8:19 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-24  8:18 UTC (permalink / raw)
  To: netdev; +Cc: davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku

This patch adds tracepoint to dev_queue_xmit, dev_hard_start_xmit and
netif_receive_skb. These tracepints help you to monitor network driver's
input/output.

            sshd-4445  [001] 241367.066046: net_dev_queue: dev=eth3 skbaddr=dd6b2538 len=114
            sshd-4445  [001] 241367.066047: net_dev_xmit: dev=eth3 skbaddr=dd6b2538 len=114 rc=0
          <idle>-0     [001] 241367.067472: net_dev_receive: dev=eth3 skbaddr=f5e59000 len=52

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 include/trace/events/net.h |   83 ++++++++++++++++++++++++++++++++++++++++++++
 net/core/dev.c             |    5 +++
 net/core/net-traces.c      |    1 +
 3 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/include/trace/events/net.h b/include/trace/events/net.h
new file mode 100644
index 0000000..ee10970
--- /dev/null
+++ b/include/trace/events/net.h
@@ -0,0 +1,83 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM net
+
+#if !defined(_TRACE_NET_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NET_H
+
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/ip.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(net_dev_xmit,
+
+	TP_PROTO(struct sk_buff *skb,
+		 int rc),
+
+	TP_ARGS(skb, rc),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+		__field(	unsigned int,	len		)
+		__field(	int,		rc		)
+		__string(	name,		skb->dev->name	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = skb->len;
+		__entry->rc = rc;
+		__assign_str(name, skb->dev->name);
+	),
+
+	TP_printk("dev=%s skbaddr=%p len=%u rc=%d",
+		__get_str(name), __entry->skbaddr, __entry->len, __entry->rc)
+);
+
+TRACE_EVENT(net_dev_queue,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+		__field(	unsigned int,	len		)
+		__string(	name,		skb->dev->name	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = skb->len;
+		__assign_str(name, skb->dev->name);
+	),
+
+	TP_printk("dev=%s skbaddr=%p len=%u",
+		__get_str(name), __entry->skbaddr, __entry->len)
+);
+
+TRACE_EVENT(net_dev_receive,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+		__field(	unsigned int,	len		)
+		__string(	name,		skb->dev->name	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = skb->len;
+		__assign_str(name, skb->dev->name);
+	),
+
+	TP_printk("dev=%s skbaddr=%p len=%u",
+		__get_str(name), __entry->skbaddr, __entry->len)
+);
+#endif /* _TRACE_NET_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/net/core/dev.c b/net/core/dev.c
index 5902426..4b64b21 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -130,6 +130,7 @@
 #include <linux/jhash.h>
 #include <linux/random.h>
 #include <trace/events/napi.h>
+#include <trace/events/net.h>
 #include <linux/pci.h>
 
 #include "net-sysfs.h"
@@ -1922,6 +1923,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 		}
 
 		rc = ops->ndo_start_xmit(skb, dev);
+		trace_net_dev_xmit(skb, rc);
 		if (rc == NETDEV_TX_OK)
 			txq_trans_update(txq);
 		return rc;
@@ -1942,6 +1944,7 @@ gso:
 			skb_dst_drop(nskb);
 
 		rc = ops->ndo_start_xmit(nskb, dev);
+		trace_net_dev_xmit(nskb, rc);
 		if (unlikely(rc != NETDEV_TX_OK)) {
 			if (rc & ~NETDEV_TX_MASK)
 				goto out_kfree_gso_skb;
@@ -2156,6 +2159,7 @@ int dev_queue_xmit(struct sk_buff *skb)
 	}
 
 gso:
+	trace_net_dev_queue(skb);
 	/* Disable soft irqs for various locks below. Also
 	 * stops preemption for RCU.
 	 */
@@ -2942,6 +2946,7 @@ int netif_receive_skb(struct sk_buff *skb)
 	if (netdev_tstamp_prequeue)
 		net_timestamp_check(skb);
 
+	trace_net_dev_receive(skb);
 #ifdef CONFIG_RPS
 	{
 		struct rps_dev_flow voidflow, *rflow = &voidflow;
diff --git a/net/core/net-traces.c b/net/core/net-traces.c
index afa6380..7f1bb2a 100644
--- a/net/core/net-traces.c
+++ b/net/core/net-traces.c
@@ -26,6 +26,7 @@
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/skb.h>
+#include <trace/events/net.h>
 #include <trace/events/napi.h>
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
                   ` (2 preceding siblings ...)
  2010-06-24  8:18 ` [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer Koki Sanagi
@ 2010-06-24  8:19 ` Koki Sanagi
  2010-06-24  8:21 ` [RFC PATCH v2 5/5] perf:add a script shows a process of packet Koki Sanagi
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-24  8:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku

 This patch adds tracepoint to consume_skb and dev_kfree_skb_irq.
Combinating with tracepoint on dev_hard_start_xmit, we can check how long it
takes to free transmited packets. And using it, we can calculate how many
packets driver had at that time. It is useful when a drop of transmited packet
is a problem.

          <idle>-0     [001] 241409.218333: consume_skb: skbaddr=dd6b2fb8
          <idle>-0     [001] 241409.490555: dev_kfree_skb_irq: skbaddr=f5e29840

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
include/trace/events/skb.h |   36 ++++++++++++++++++++++++++++++++++++
 net/core/dev.c             |    2 ++
 net/core/skbuff.c          |    1 +
 3 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 4b2be6d..6ab5b34 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -35,6 +35,42 @@ TRACE_EVENT(kfree_skb,
 		__entry->skbaddr, __entry->protocol, __entry->location)
 );
 
+TRACE_EVENT(consume_skb,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,	skbaddr	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+	),
+
+	TP_printk("skbaddr=%p",
+		__entry->skbaddr)
+);
+
+TRACE_EVENT(dev_kfree_skb_irq,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,	skbaddr	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+	),
+
+	TP_printk("skbaddr=%p",
+		__entry->skbaddr)
+);
+
 TRACE_EVENT(skb_copy_datagram_iovec,
 
 	TP_PROTO(const struct sk_buff *skb, int len),
diff --git a/net/core/dev.c b/net/core/dev.c
index 4b64b21..807b1ca 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -131,6 +131,7 @@
 #include <linux/random.h>
 #include <trace/events/napi.h>
 #include <trace/events/net.h>
+#include <trace/events/skb.h>
 #include <linux/pci.h>
 
 #include "net-sysfs.h"
@@ -1580,6 +1581,7 @@ void dev_kfree_skb_irq(struct sk_buff *skb)
 		struct softnet_data *sd;
 		unsigned long flags;
 
+		trace_dev_kfree_skb_irq(skb);
 		local_irq_save(flags);
 		sd = &__get_cpu_var(softnet_data);
 		skb->next = sd->completion_queue;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 34432b4..a7b4036 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -466,6 +466,7 @@ void consume_skb(struct sk_buff *skb)
 		smp_rmb();
 	else if (likely(!atomic_dec_and_test(&skb->users)))
 		return;
+	trace_consume_skb(skb);
 	__kfree_skb(skb);
 }
 EXPORT_SYMBOL(consume_skb);


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 5/5] perf:add a script shows a process of packet
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
                   ` (3 preceding siblings ...)
  2010-06-24  8:19 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
@ 2010-06-24  8:21 ` Koki Sanagi
  2010-06-25  0:57 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-24  8:21 UTC (permalink / raw)
  To: netdev; +Cc: davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku

This perf script shows a time-chart of process of packet.
Patch 0/5 shows an output of this.
If you want to use it, install perf and record perf.data like following.

#perf trace record netdev-times [script]

If you set script, perf gathers records until it ends.
If not, you must Ctrl-C to stop recording.

And if you want a report from record,

#perf trace report netdev-times [options]

If you use some options, you can limit an output.
Option is below.

tx: show only process of tx packets
rx: show only process of rx packets
dev=: show a process specified with this option

In the future, I want src/dst IP(v6) address filter option. 
It's now under consideration/construction.

For example, if you want to show a process of received packets associated
with eth3,

#perf trace report netdev-times rx dev=eth3
79074.756672832sec cpu=1
irq_entry(+0.000000msec,irq=77:eth3)
         |------------softirq_raise(+0.001277msec)
irq_exit (+0.002278msec)     |
                             |
                      softirq_entry(+0.003562msec)
                             |
                             |---netif_receive_skb(+0.006279msec,len=100)
                             |            |
                             |   skb_copy_datagram_iovec(+0.038778msec, 2285:sshd)
                             |
                      napi_poll_exit(+0.017160msec, eth3)
                             |
                      softirq_exit(+0.018248msec)


This perf script helps us to analyze a process time of transmit/receive
sequence.

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 tools/perf/scripts/python/bin/netdev-times-record |    7 +
 tools/perf/scripts/python/bin/netdev-times-report |    5 +
 tools/perf/scripts/python/netdev-times.py         |  495 +++++++++++++++++++++
 3 files changed, 507 insertions(+), 0 deletions(-)

diff --git a/tools/perf/scripts/python/bin/netdev-times-record b/tools/perf/scripts/python/bin/netdev-times-record
new file mode 100644
index 0000000..1dfa8d5
--- /dev/null
+++ b/tools/perf/scripts/python/bin/netdev-times-record
@@ -0,0 +1,7 @@
+#!/bin/bash
+perf record -c 1 -f -R -a -e net:net_dev_xmit -e net:net_dev_queue	\
+		-e net:net_dev_receive -e skb:consume_skb		\
+		-e skb:dev_kfree_skb_irq -e napi:napi_poll		\
+		-e irq:irq_handler_entry -e irq:irq_handler_exit	\
+		-e irq:softirq_entry -e irq:softirq_exit		\
+		-e irq:softirq_raise -e skb:skb_copy_datagram_iovec $@
diff --git a/tools/perf/scripts/python/bin/netdev-times-report b/tools/perf/scripts/python/bin/netdev-times-report
new file mode 100644
index 0000000..ecc8122
--- /dev/null
+++ b/tools/perf/scripts/python/bin/netdev-times-report
@@ -0,0 +1,5 @@
+#!/bin/bash
+# description: displayi a process of packet and processing time
+# args: tx rx dev src dst
+
+perf trace -s ~/libexec/perf-core/scripts/python/netdev-times.py $@
diff --git a/tools/perf/scripts/python/netdev-times.py b/tools/perf/scripts/python/netdev-times.py
new file mode 100644
index 0000000..5e68be4
--- /dev/null
+++ b/tools/perf/scripts/python/netdev-times.py
@@ -0,0 +1,495 @@
+# Display process of packets and processed time.
+# It helps you to investigate networking.
+
+import os
+import sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+	'/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+all_event_list = []; # insert all tracepoint event related with this script
+irq_dic = {}; # key is cpu and value is a list which stacks irqs
+              # which raise NET_RX softirq
+net_rx_dic = {}; # key is cpu and value include time of NET_RX softirq-entry
+		 # and a list which stacks receive
+receive_hunk_list = []; # a list which include a sequence of receive events
+receive_skb_list = []; # received packet list for matching
+		       # skb_copy_datagram_iovec
+
+queue_list = []; # list of packets which pass through dev_queue_xmit
+xmit_list = [];  # list of packets which pass through dev_hard_start_xmit
+free_list = [];  # list of packets which is freed
+
+show_tx = 0;
+show_rx = 0;
+dev = 0; # store a name of device specified by option "dev="
+
+# Calculate a time interval(msec) from src(nsec) to dst(nsec)
+def diff_msec(src, dst):
+	return (dst - src) / 1000000.0
+
+# Display a process of transmitting a packet
+def print_transmit(hunk):
+	if dev != 0 and hunk['dev'].find(dev) < 0:
+		return
+	print "%7s %5d %6d.%09dsec %12.6fmsec      %12.6fmsec" % \
+		(hunk['dev'], hunk['len'],
+		nsecs_secs(hunk['queue_t']),
+		nsecs_nsecs(hunk['queue_t']),
+		diff_msec(hunk['queue_t'], hunk['xmit_t']),
+		diff_msec(hunk['xmit_t'], hunk['free_t']))
+
+# Display a process of received packets and interrputs associated with
+# a NET_RX softirq
+def print_receive(hunk):
+	show_hunk = 0
+	if 'irq_list' not in hunk.keys() \
+	or len(hunk['irq_list']) == 0:
+		return
+	irq_list = hunk['irq_list']
+	cpu = irq_list[0]['cpu']
+	base_t = irq_list[0]['irq_ent_t']
+	# check if this hunk should be showed
+	if dev != 0:
+		for i in range(len(irq_list)):
+			if irq_list[i]['name'].find(dev) >= 0:
+				show_hunk = 1
+				break
+	else:
+		show_hunk = 1
+	if show_hunk == 0:
+		return
+
+	print "%d.%09dsec cpu=%d" % \
+		(nsecs_secs(base_t), nsecs_nsecs(base_t), cpu)
+	for i in range(len(irq_list)):
+		print "irq_entry(+%fmsec,irq=%d:%s)" % \
+			(diff_msec(base_t, irq_list[i]['irq_ent_t']),
+			irq_list[i]['irq'], irq_list[i]['name'])
+
+		if 'sirq_raise_t' in irq_list[i].keys():
+			print "         |------------" \
+			      "softirq_raise(+%fmsec)" % \
+				diff_msec(base_t, irq_list[i]['sirq_raise_t'])
+
+		if 'irq_ext_t' in irq_list[i].keys():
+			print "irq_exit (+%fmsec)     |" % \
+				diff_msec(base_t, irq_list[i]['irq_ext_t'])
+
+		print "                             |"
+
+	if 'sirq_ent_t' not in hunk.keys():
+		print 'maybe softirq_entry is dropped'
+		return
+	print "                      " \
+		"softirq_entry(+%fmsec)\n" \
+		"                      " \
+		"       |" % \
+		diff_msec(base_t, hunk['sirq_ent_t'])
+	event_list = hunk['event_list']
+	for i in range(len(event_list)):
+		event = event_list[i]
+		if event['event_name'] == 'napi_poll':
+			print "                      " \
+			      "napi_poll_exit(+%fmsec, %s)" % \
+			(diff_msec(base_t, event['event_t']), event['dev'])
+			print "                      " \
+			      "       |"
+		elif 'comm' in event.keys():
+			print "                      " \
+				"       |---netif_receive_skb" \
+				"(+%fmsec,len=%d)\n" \
+				"                      " \
+				"       |            |\n" \
+				"                      " \
+				"       |   skb_copy_datagram_iovec" \
+				"(+%fmsec, %d:%s)\n" \
+				"                      " \
+				"       |" % \
+			(diff_msec(base_t, event['event_t']),
+			event['len'],
+			diff_msec(base_t, event['comm_t']),
+			event['pid'], event['comm'])
+		else:
+			print "                      " \
+				"       |---netif_receive_skb" \
+				"(+%fmsec,len=%d)\n" \
+				"                      " \
+				"       |" % \
+				(diff_msec(base_t, event['event_t']),
+					event['len'])
+
+	print "                      " \
+	      "softirq_exit(+%fmsec)\n" % \
+		 diff_msec(base_t, hunk['sirq_ext_t'])
+
+def trace_begin():
+	global show_tx
+	global show_rx
+	global dev
+
+	for i in range(len(sys.argv)):
+		if i == 0:
+			continue
+		arg = sys.argv[i]
+		if arg == 'tx':
+			show_tx = 1
+		elif arg =='rx':
+			show_rx = 1
+		elif arg.find('dev=',0, 4) >= 0:
+			dev = arg[4:]
+	if show_tx == 0  and show_rx == 0:
+		show_tx = 1
+		show_rx = 1
+
+def trace_end():
+	global show_tx
+	global show_rx
+	# order all events in time
+	all_event_list.sort(lambda a,b :cmp(a['time'], b['time']))
+	# process all events
+	for i in range(len(all_event_list)):
+		event = all_event_list[i]
+		event_name = event['event_name']
+		if event_name == 'irq__softirq_exit':
+			handle_irq_softirq_exit(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['vec'])
+		elif event_name == 'irq__softirq_entry':
+			handle_irq_softirq_entry(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'],event['vec'])
+		elif event_name == 'irq__softirq_raise':
+			handle_irq_softirq_raise(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['vec'])
+		elif event_name == 'irq__irq_handler_entry':
+			handle_irq_handler_entry(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['irq'], event['name'])
+		elif event_name == 'irq__irq_handler_exit':
+			handle_irq_handler_exit(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['irq'], event['ret'])
+		elif event_name == 'napi__napi_poll':
+			handle_napi_poll(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['napi'],
+				event['dev_name'])
+		elif event_name == 'net__net_dev_receive':
+			handle_net_dev_receive(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'], event['name'])
+		elif event_name == 'skb__skb_copy_datagram_iovec':
+			handle_skb_copy_datagram_iovec(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'])
+		elif event_name == 'net__net_dev_queue':
+			handle_net_dev_queue(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'], event['name'])
+		elif event_name == 'net__net_dev_xmit':
+			handle_net_dev_xmit(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'], event['rc'], event['name'])
+		elif event_name == 'skb__dev_kfree_skb_irq':
+			handle_dev_kfree_skb_irq(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'])
+		elif event_name == 'skb__consume_skb':
+			handle_consume_skb(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'])
+	# display receive hunks
+	if show_rx == 1:
+		for i in range(len(receive_hunk_list)):
+			print_receive(receive_hunk_list[i])
+	# display transmit hunks
+	if show_tx == 1:
+		print "   dev    len      dev_queue_xmit|----------|" \
+			"dev_hard_start_xmit|-----|free_skb"
+		print "                         |             |" \
+			"                           |"
+		for i in range(len(free_list)):
+			print_transmit(free_list[i])
+
+def irq__softirq_exit(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	vec):
+	if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
+		return
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'vec':vec}
+	all_event_list.append(event_data)
+
+def handle_irq_softirq_exit(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	vec):
+	rec_data = {'sirq_ext_t':time}
+	if common_cpu in irq_dic.keys():
+		rec_data.update({'irq_list':irq_dic[common_cpu]})
+		del irq_dic[common_cpu]
+	if common_cpu in net_rx_dic.keys():
+		rec_data.update({
+		    'event_list':net_rx_dic[common_cpu]['event_list'],
+		    'sirq_ent_t':net_rx_dic[common_cpu]['sirq_ent_t']})
+		del net_rx_dic[common_cpu]
+	# merge information realted to a NET_RX softirq
+	receive_hunk_list.append(rec_data)
+
+def irq__softirq_entry(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	vec):
+	if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
+		return
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'vec':vec}
+	all_event_list.append(event_data)
+
+def handle_irq_softirq_entry(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	vec):
+		net_rx_dic[common_cpu] = {'event_list':[],
+					  'sirq_ent_t':time}
+
+def irq__softirq_raise(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	vec):
+	if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
+		return
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'vec':vec}
+	all_event_list.append(event_data)
+
+def handle_irq_softirq_raise(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	vec):
+	if common_cpu not in irq_dic.keys() \
+	or len(irq_dic[common_cpu]) == 0:
+		return
+	irq = irq_dic[common_cpu].pop()
+	# put a time to prev irq on the same cpu
+	irq.update({'sirq_raise_t':time})
+	irq_dic[common_cpu].append(irq)
+
+def irq__irq_handler_entry(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	irq, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'irq':irq, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_irq_handler_entry(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	irq, name):
+	if common_cpu not in irq_dic.keys():
+		irq_dic[common_cpu] = []
+	irq_record = {'irq':irq,
+		      'name':name,
+		      'cpu':common_cpu,
+		      'irq_ent_t':time}
+	irq_dic[common_cpu].append(irq_record)
+
+def irq__irq_handler_exit(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	irq, ret):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'irq':irq, 'ret':ret}
+	all_event_list.append(event_data)
+
+def handle_irq_handler_exit(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	irq, ret):
+	if common_cpu not in irq_dic.keys():
+		return
+	irq_record = irq_dic[common_cpu].pop()
+	irq_record.update({'irq_ext_t':time})
+	# if an irq doesn't include NET_RX softirq, drop.
+	if 'sirq_raise_t' in irq_record.keys():
+		irq_dic[common_cpu].append(irq_record)
+
+def napi__napi_poll(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	napi, dev_name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'napi':napi, 'dev_name':dev_name}
+	all_event_list.append(event_data)
+
+def handle_napi_poll(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	napi, dev_name):
+	if common_cpu in net_rx_dic.keys():
+		event_list = net_rx_dic[common_cpu]['event_list']
+		rec_data = {'event_name':'napi_poll',
+			    'dev':dev_name,
+			    'event_t':time}
+		event_list.append(rec_data)
+
+def net__net_dev_receive(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr,skblen, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_net_dev_receive(event_name, context, common_cpu,
+	ccommon_pid, common_comm, time,
+	skbaddr, skblen, name):
+	if common_cpu in net_rx_dic.keys():
+		rec_data = {'event_name':'netif_receive_skb',
+			    'event_t':time,
+			    'skbaddr':skbaddr,
+			    'len':skblen}
+		event_list = net_rx_dic[common_cpu]['event_list']
+		event_list.append(rec_data)
+		receive_skb_list.insert(0, rec_data)
+
+def skb__skb_copy_datagram_iovec(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr, skblen):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen}
+	all_event_list.append(event_data)
+
+def handle_skb_copy_datagram_iovec(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr, skblen):
+	for i in range(len(receive_skb_list)):
+		rec_data = receive_skb_list[i]
+		if skbaddr == rec_data['skbaddr'] and \
+			'comm' not in rec_data.keys():
+			rec_data.update({'comm':common_comm,
+					 'pid':common_pid,
+					 'comm_t':time})
+			del receive_skb_list[i]
+			break
+
+def net__net_dev_queue(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr, skblen, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_net_dev_queue(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr, skblen, name):
+	skb = {'dev':name,
+	       'skbaddr':skbaddr,
+	       'len':skblen,
+	       'queue_t':time}
+	xmit_list.insert(0, skb)
+
+def net__net_dev_xmit(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr, skblen, rc, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen, 'rc':rc, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_net_dev_xmit(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr, skblen, rc, name):
+	if rc == 0: # NETDEV_TX_OK
+		for i in range(len(xmit_list)):
+			skb = xmit_list[i]
+			if skb['skbaddr'] == skbaddr:
+				skb['xmit_t'] = time
+				queue_list.insert(0, skb)
+				del xmit_list[i]
+				break
+
+def free_skb(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr):
+	for i in range(len(queue_list)):
+		skb = queue_list[i]
+		if skb['skbaddr'] ==skbaddr:
+			skb['free_t'] = time
+			free_list.append(skb)
+			del queue_list[i]
+			break
+
+def skb__dev_kfree_skb_irq(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr}
+	all_event_list.append(event_data)
+
+def handle_dev_kfree_skb_irq(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr):
+	free_skb(event_name, context, common_cpu,
+		common_pid, common_comm, time,
+		skbaddr)
+
+def skb__consume_skb(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr}
+	all_event_list.append(event_data)
+
+def handle_consume_skb(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr):
+	free_skb(event_name, context, common_cpu,
+		common_pid, common_comm, time,
+		skbaddr)


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
  2010-06-24  8:16 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
@ 2010-06-24 19:15   ` Steven Rostedt
  2010-06-25  0:49     ` Koki Sanagi
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2010-06-24 19:15 UTC (permalink / raw)
  To: Koki Sanagi
  Cc: netdev, davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku,
	linux-kernel

Hi Koki,

Your subject says 1/5 but I do not see any other patches.


On Thu, 2010-06-24 at 17:16 +0900, Koki Sanagi wrote:
> This patch adds a tracepoint to raising of softirq.
> This is useful if you want to detect which hard interrupt raise softirq
> and lets you know a time between raising softirq and performing softirq.
> Combinating with other tracepoint, it lets us know a process of packets
> (See patch 0/5).
> 
>           <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>           <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>           <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
> 
> This is a same patch Lai Jiangshan submitted.
> http://marc.info/?l=linux-kernel&m=126026122728732&w=2
> 
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> ---
>  include/linux/interrupt.h  |    8 +++++++-
>  include/trace/events/irq.h |   34 +++++++++++++++++++++++++++++++---
>  2 files changed, 38 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index c233113..1cb5726 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -18,6 +18,7 @@
>  #include <asm/atomic.h>
>  #include <asm/ptrace.h>
>  #include <asm/system.h>
> +#include <trace/events/irq.h>
>  
>  /*
>   * These correspond to the IORESOURCE_IRQ_* defines in
> @@ -402,7 +403,12 @@ asmlinkage void do_softirq(void);
>  asmlinkage void __do_softirq(void);
>  extern void open_softirq(int nr, void (*action)(struct softirq_action *));
>  extern void softirq_init(void);
> -#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
> +static inline void __raise_softirq_irqoff(unsigned int nr)
> +{
> +	trace_softirq_raise(nr);
> +	or_softirq_pending(1UL << nr);
> +}
> +
>  extern void raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq(unsigned int nr);
>  extern void wakeup_softirqd(void);
> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> index 0e4cfb6..7cb7435 100644
> --- a/include/trace/events/irq.h
> +++ b/include/trace/events/irq.h
> @@ -5,7 +5,9 @@
>  #define _TRACE_IRQ_H
>  
>  #include <linux/tracepoint.h>
> -#include <linux/interrupt.h>
> +
> +struct irqaction;
> +struct softirq_action;
>  
>  #define softirq_name(sirq) { sirq##_SOFTIRQ, #sirq }
>  #define show_softirq_name(val)				\
> @@ -82,6 +84,32 @@ TRACE_EVENT(irq_handler_exit,
>  		  __entry->irq, __entry->ret ? "handled" : "unhandled")
>  );
>  
> +/**
> + * softirq_raise - called immediately when a softirq is raised
> + * @nr: softirq vector number
> + *
> + * Tracepoint for tracing when softirq action is raised.
> + * Also, when used in combination with the softirq_entry tracepoint
> + * we can determine the softirq raise latency.
> + */
> +TRACE_EVENT(softirq_raise,
> +
> +	TP_PROTO(unsigned int nr),
> +
> +	TP_ARGS(nr),
> +
> +	TP_STRUCT__entry(
> +		__field(	unsigned int,	vec	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->vec	= nr;
> +	),
> +
> +	TP_printk("vec=%d [action=%s]", __entry->vec,
> +		show_softirq_name(__entry->vec))

Hmm, is there a way to reuse a DECLARE_EVENT_CLASS here?

> +);
> +
>  DECLARE_EVENT_CLASS(softirq,
>  
>  	TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
> @@ -89,11 +117,11 @@ DECLARE_EVENT_CLASS(softirq,
>  	TP_ARGS(h, vec),
>  
>  	TP_STRUCT__entry(
> -		__field(	int,	vec			)
> +		__field(	unsigned int,	vec	)
>  	),
>  
>  	TP_fast_assign(
> -		__entry->vec = (int)(h - vec);
> +		__entry->vec = (unsigned int)(h - vec);

Just curious, did you see the original as a bug?

-- Steve

>  	),
>  
>  	TP_printk("vec=%d [action=%s]", __entry->vec,



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
  2010-06-24 19:15   ` Steven Rostedt
@ 2010-06-25  0:49     ` Koki Sanagi
  0 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  0:49 UTC (permalink / raw)
  To: rostedt
  Cc: netdev, davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku,
	linux-kernel

Hi Steven,

Sorry, I submitted other patches to only netdev.
I'll resubmit soon all patches to linux-kernel.

(2010/06/25 4:15), Steven Rostedt wrote:
> Hi Koki,
> 
> Your subject says 1/5 but I do not see any other patches.
> 
> 
> On Thu, 2010-06-24 at 17:16 +0900, Koki Sanagi wrote:
>> This patch adds a tracepoint to raising of softirq.
>> This is useful if you want to detect which hard interrupt raise softirq
>> and lets you know a time between raising softirq and performing softirq.
>> Combinating with other tracepoint, it lets us know a process of packets
>> (See patch 0/5).
>>
>>           <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>>           <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
>>
>> This is a same patch Lai Jiangshan submitted.
>> http://marc.info/?l=linux-kernel&m=126026122728732&w=2
>>
>> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
>> ---
>>  include/linux/interrupt.h  |    8 +++++++-
>>  include/trace/events/irq.h |   34 +++++++++++++++++++++++++++++++---
>>  2 files changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
>> index c233113..1cb5726 100644
>> --- a/include/linux/interrupt.h
>> +++ b/include/linux/interrupt.h
>> @@ -18,6 +18,7 @@
>>  #include <asm/atomic.h>
>>  #include <asm/ptrace.h>
>>  #include <asm/system.h>
>> +#include <trace/events/irq.h>
>>  
>>  /*
>>   * These correspond to the IORESOURCE_IRQ_* defines in
>> @@ -402,7 +403,12 @@ asmlinkage void do_softirq(void);
>>  asmlinkage void __do_softirq(void);
>>  extern void open_softirq(int nr, void (*action)(struct softirq_action *));
>>  extern void softirq_init(void);
>> -#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
>> +static inline void __raise_softirq_irqoff(unsigned int nr)
>> +{
>> +	trace_softirq_raise(nr);
>> +	or_softirq_pending(1UL << nr);
>> +}
>> +
>>  extern void raise_softirq_irqoff(unsigned int nr);
>>  extern void raise_softirq(unsigned int nr);
>>  extern void wakeup_softirqd(void);
>> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
>> index 0e4cfb6..7cb7435 100644
>> --- a/include/trace/events/irq.h
>> +++ b/include/trace/events/irq.h
>> @@ -5,7 +5,9 @@
>>  #define _TRACE_IRQ_H
>>  
>>  #include <linux/tracepoint.h>
>> -#include <linux/interrupt.h>
>> +
>> +struct irqaction;
>> +struct softirq_action;
>>  
>>  #define softirq_name(sirq) { sirq##_SOFTIRQ, #sirq }
>>  #define show_softirq_name(val)				\
>> @@ -82,6 +84,32 @@ TRACE_EVENT(irq_handler_exit,
>>  		  __entry->irq, __entry->ret ? "handled" : "unhandled")
>>  );
>>  
>> +/**
>> + * softirq_raise - called immediately when a softirq is raised
>> + * @nr: softirq vector number
>> + *
>> + * Tracepoint for tracing when softirq action is raised.
>> + * Also, when used in combination with the softirq_entry tracepoint
>> + * we can determine the softirq raise latency.
>> + */
>> +TRACE_EVENT(softirq_raise,
>> +
>> +	TP_PROTO(unsigned int nr),
>> +
>> +	TP_ARGS(nr),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	unsigned int,	vec	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->vec	= nr;
>> +	),
>> +
>> +	TP_printk("vec=%d [action=%s]", __entry->vec,
>> +		show_softirq_name(__entry->vec))
> 
> Hmm, is there a way to reuse a DECLARE_EVENT_CLASS here?
> 
>> +);
>> +
>>  DECLARE_EVENT_CLASS(softirq,
>>  
>>  	TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
>> @@ -89,11 +117,11 @@ DECLARE_EVENT_CLASS(softirq,
>>  	TP_ARGS(h, vec),
>>  
>>  	TP_STRUCT__entry(
>> -		__field(	int,	vec			)
>> +		__field(	unsigned int,	vec	)
>>  	),
>>  
>>  	TP_fast_assign(
>> -		__entry->vec = (int)(h - vec);
>> +		__entry->vec = (unsigned int)(h - vec);
> 
> Just curious, did you see the original as a bug?
> 
> -- Steve
> 
>>  	),
>>  
>>  	TP_printk("vec=%d [action=%s]", __entry->vec,
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
                   ` (4 preceding siblings ...)
  2010-06-24  8:21 ` [RFC PATCH v2 5/5] perf:add a script shows a process of packet Koki Sanagi
@ 2010-06-25  0:57 ` Koki Sanagi
  2010-06-25  1:10   ` KOSAKI Motohiro
  2010-06-25  0:58 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  0:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: kaneshige.kenji, izumi.taku

This patch adds a tracepoint to raising of softirq.
This is useful if you want to detect which hard interrupt raise softirq
and lets you know a time between raising softirq and performing softirq.
Combinating with other tracepoint, it lets us know a process of packets
(See patch 0/5).

          <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
          <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
          <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]

This is a same patch Lai Jiangshan submitted.
http://marc.info/?l=linux-kernel&m=126026122728732&w=2

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 include/linux/interrupt.h  |    8 +++++++-
 include/trace/events/irq.h |   34 +++++++++++++++++++++++++++++++---
 2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index c233113..1cb5726 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -18,6 +18,7 @@
 #include <asm/atomic.h>
 #include <asm/ptrace.h>
 #include <asm/system.h>
+#include <trace/events/irq.h>
 
 /*
  * These correspond to the IORESOURCE_IRQ_* defines in
@@ -402,7 +403,12 @@ asmlinkage void do_softirq(void);
 asmlinkage void __do_softirq(void);
 extern void open_softirq(int nr, void (*action)(struct softirq_action *));
 extern void softirq_init(void);
-#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
+static inline void __raise_softirq_irqoff(unsigned int nr)
+{
+	trace_softirq_raise(nr);
+	or_softirq_pending(1UL << nr);
+}
+
 extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
 extern void wakeup_softirqd(void);
diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
index 0e4cfb6..7cb7435 100644
--- a/include/trace/events/irq.h
+++ b/include/trace/events/irq.h
@@ -5,7 +5,9 @@
 #define _TRACE_IRQ_H
 
 #include <linux/tracepoint.h>
-#include <linux/interrupt.h>
+
+struct irqaction;
+struct softirq_action;
 
 #define softirq_name(sirq) { sirq##_SOFTIRQ, #sirq }
 #define show_softirq_name(val)				\
@@ -82,6 +84,32 @@ TRACE_EVENT(irq_handler_exit,
 		  __entry->irq, __entry->ret ? "handled" : "unhandled")
 );
 
+/**
+ * softirq_raise - called immediately when a softirq is raised
+ * @nr: softirq vector number
+ *
+ * Tracepoint for tracing when softirq action is raised.
+ * Also, when used in combination with the softirq_entry tracepoint
+ * we can determine the softirq raise latency.
+ */
+TRACE_EVENT(softirq_raise,
+
+	TP_PROTO(unsigned int nr),
+
+	TP_ARGS(nr),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	vec	)
+	),
+
+	TP_fast_assign(
+		__entry->vec	= nr;
+	),
+
+	TP_printk("vec=%d [action=%s]", __entry->vec,
+		show_softirq_name(__entry->vec))
+);
+
 DECLARE_EVENT_CLASS(softirq,
 
 	TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
@@ -89,11 +117,11 @@ DECLARE_EVENT_CLASS(softirq,
 	TP_ARGS(h, vec),
 
 	TP_STRUCT__entry(
-		__field(	int,	vec			)
+		__field(	unsigned int,	vec	)
 	),
 
 	TP_fast_assign(
-		__entry->vec = (int)(h - vec);
+		__entry->vec = (unsigned int)(h - vec);
 	),
 
 	TP_printk("vec=%d [action=%s]", __entry->vec,



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
                   ` (5 preceding siblings ...)
  2010-06-25  0:57 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
@ 2010-06-25  0:58 ` Koki Sanagi
  2010-06-25  1:11   ` KOSAKI Motohiro
  2010-06-25  7:57   ` Frederic Weisbecker
  2010-06-25  0:58 ` [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer Koki Sanagi
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  0:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: kaneshige.kenji, izumi.taku

This patch converts trace_napi_poll from DECLARE_EVENT to TRACE_EVENT.
This lets you know how long napi_poll takes.

          <idle>-0     [001] 241302.750777: napi_poll: napi poll on napi struct f6acc480 for device eth3
          <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d70c for device eth1
          <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d20c for device eth1

This is a same patch Neil Horman submitted.
http://marc.info/?l=linux-kernel&m=125978157926853&w=2

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 include/trace/events/napi.h |   23 +++++++++++++++++++++--
 1 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/napi.h b/include/trace/events/napi.h
index 188deca..512a057 100644
--- a/include/trace/events/napi.h
+++ b/include/trace/events/napi.h
@@ -6,10 +6,29 @@
 
 #include <linux/netdevice.h>
 #include <linux/tracepoint.h>
+#include <linux/ftrace.h>
+
+#define NO_DEV "(no_device)"
+
+TRACE_EVENT(napi_poll,
 
-DECLARE_TRACE(napi_poll,
 	TP_PROTO(struct napi_struct *napi),
-	TP_ARGS(napi));
+
+	TP_ARGS(napi),
+
+	TP_STRUCT__entry(
+		__field(	struct napi_struct *,	napi)
+		__string(	dev_name, napi->dev ? napi->dev->name : NO_DEV)
+	),
+
+	TP_fast_assign(
+		__entry->napi = napi;
+		__assign_str(dev_name, napi->dev ? napi->dev->name : NO_DEV);
+	),
+
+	TP_printk("napi poll on napi struct %p for device %s",
+		__entry->napi, __get_str(dev_name))
+);
 
 #endif /* _TRACE_NAPI_H_ */
 


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
                   ` (6 preceding siblings ...)
  2010-06-25  0:58 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
@ 2010-06-25  0:58 ` Koki Sanagi
  2010-06-28 18:09   ` Steven Rostedt
  2010-06-25  0:59 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
  2010-06-25  0:59 ` [RFC PATCH v2 5/5] perf:add a script shows a process of packet Koki Sanagi
  9 siblings, 1 reply; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  0:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: kaneshige.kenji, izumi.taku

This patch adds tracepoint to dev_queue_xmit, dev_hard_start_xmit and
netif_receive_skb. These tracepints help you to monitor network driver's
input/output.

            sshd-4445  [001] 241367.066046: net_dev_queue: dev=eth3 skbaddr=dd6b2538 len=114
            sshd-4445  [001] 241367.066047: net_dev_xmit: dev=eth3 skbaddr=dd6b2538 len=114 rc=0
          <idle>-0     [001] 241367.067472: net_dev_receive: dev=eth3 skbaddr=f5e59000 len=52

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 include/trace/events/net.h |   83 ++++++++++++++++++++++++++++++++++++++++++++
 net/core/dev.c             |    5 +++
 net/core/net-traces.c      |    1 +
 3 files changed, 89 insertions(+), 0 deletions(-)

diff --git a/include/trace/events/net.h b/include/trace/events/net.h
new file mode 100644
index 0000000..ee10970
--- /dev/null
+++ b/include/trace/events/net.h
@@ -0,0 +1,83 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM net
+
+#if !defined(_TRACE_NET_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NET_H
+
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/ip.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(net_dev_xmit,
+
+	TP_PROTO(struct sk_buff *skb,
+		 int rc),
+
+	TP_ARGS(skb, rc),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+		__field(	unsigned int,	len		)
+		__field(	int,		rc		)
+		__string(	name,		skb->dev->name	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = skb->len;
+		__entry->rc = rc;
+		__assign_str(name, skb->dev->name);
+	),
+
+	TP_printk("dev=%s skbaddr=%p len=%u rc=%d",
+		__get_str(name), __entry->skbaddr, __entry->len, __entry->rc)
+);
+
+TRACE_EVENT(net_dev_queue,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+		__field(	unsigned int,	len		)
+		__string(	name,		skb->dev->name	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = skb->len;
+		__assign_str(name, skb->dev->name);
+	),
+
+	TP_printk("dev=%s skbaddr=%p len=%u",
+		__get_str(name), __entry->skbaddr, __entry->len)
+);
+
+TRACE_EVENT(net_dev_receive,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,		skbaddr		)
+		__field(	unsigned int,	len		)
+		__string(	name,		skb->dev->name	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = skb->len;
+		__assign_str(name, skb->dev->name);
+	),
+
+	TP_printk("dev=%s skbaddr=%p len=%u",
+		__get_str(name), __entry->skbaddr, __entry->len)
+);
+#endif /* _TRACE_NET_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/net/core/dev.c b/net/core/dev.c
index 5902426..4b64b21 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -130,6 +130,7 @@
 #include <linux/jhash.h>
 #include <linux/random.h>
 #include <trace/events/napi.h>
+#include <trace/events/net.h>
 #include <linux/pci.h>
 
 #include "net-sysfs.h"
@@ -1922,6 +1923,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 		}
 
 		rc = ops->ndo_start_xmit(skb, dev);
+		trace_net_dev_xmit(skb, rc);
 		if (rc == NETDEV_TX_OK)
 			txq_trans_update(txq);
 		return rc;
@@ -1942,6 +1944,7 @@ gso:
 			skb_dst_drop(nskb);
 
 		rc = ops->ndo_start_xmit(nskb, dev);
+		trace_net_dev_xmit(nskb, rc);
 		if (unlikely(rc != NETDEV_TX_OK)) {
 			if (rc & ~NETDEV_TX_MASK)
 				goto out_kfree_gso_skb;
@@ -2156,6 +2159,7 @@ int dev_queue_xmit(struct sk_buff *skb)
 	}
 
 gso:
+	trace_net_dev_queue(skb);
 	/* Disable soft irqs for various locks below. Also
 	 * stops preemption for RCU.
 	 */
@@ -2942,6 +2946,7 @@ int netif_receive_skb(struct sk_buff *skb)
 	if (netdev_tstamp_prequeue)
 		net_timestamp_check(skb);
 
+	trace_net_dev_receive(skb);
 #ifdef CONFIG_RPS
 	{
 		struct rps_dev_flow voidflow, *rflow = &voidflow;
diff --git a/net/core/net-traces.c b/net/core/net-traces.c
index afa6380..7f1bb2a 100644
--- a/net/core/net-traces.c
+++ b/net/core/net-traces.c
@@ -26,6 +26,7 @@
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/skb.h>
+#include <trace/events/net.h>
 #include <trace/events/napi.h>
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
                   ` (7 preceding siblings ...)
  2010-06-25  0:58 ` [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer Koki Sanagi
@ 2010-06-25  0:59 ` Koki Sanagi
  2010-06-25  4:55   ` Eric Dumazet
  2010-06-28 18:12   ` Steven Rostedt
  2010-06-25  0:59 ` [RFC PATCH v2 5/5] perf:add a script shows a process of packet Koki Sanagi
  9 siblings, 2 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  0:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: kaneshige.kenji, izumi.taku

 This patch adds tracepoint to consume_skb and dev_kfree_skb_irq.
Combinating with tracepoint on dev_hard_start_xmit, we can check how long it
takes to free transmited packets. And using it, we can calculate how many
packets driver had at that time. It is useful when a drop of transmited packet
is a problem.

          <idle>-0     [001] 241409.218333: consume_skb: skbaddr=dd6b2fb8
          <idle>-0     [001] 241409.490555: dev_kfree_skb_irq: skbaddr=f5e29840

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
include/trace/events/skb.h |   36 ++++++++++++++++++++++++++++++++++++
 net/core/dev.c             |    2 ++
 net/core/skbuff.c          |    1 +
 3 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 4b2be6d..6ab5b34 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -35,6 +35,42 @@ TRACE_EVENT(kfree_skb,
 		__entry->skbaddr, __entry->protocol, __entry->location)
 );
 
+TRACE_EVENT(consume_skb,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,	skbaddr	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+	),
+
+	TP_printk("skbaddr=%p",
+		__entry->skbaddr)
+);
+
+TRACE_EVENT(dev_kfree_skb_irq,
+
+	TP_PROTO(struct sk_buff *skb),
+
+	TP_ARGS(skb),
+
+	TP_STRUCT__entry(
+		__field(	void *,	skbaddr	)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+	),
+
+	TP_printk("skbaddr=%p",
+		__entry->skbaddr)
+);
+
 TRACE_EVENT(skb_copy_datagram_iovec,
 
 	TP_PROTO(const struct sk_buff *skb, int len),
diff --git a/net/core/dev.c b/net/core/dev.c
index 4b64b21..807b1ca 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -131,6 +131,7 @@
 #include <linux/random.h>
 #include <trace/events/napi.h>
 #include <trace/events/net.h>
+#include <trace/events/skb.h>
 #include <linux/pci.h>
 
 #include "net-sysfs.h"
@@ -1580,6 +1581,7 @@ void dev_kfree_skb_irq(struct sk_buff *skb)
 		struct softnet_data *sd;
 		unsigned long flags;
 
+		trace_dev_kfree_skb_irq(skb);
 		local_irq_save(flags);
 		sd = &__get_cpu_var(softnet_data);
 		skb->next = sd->completion_queue;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 34432b4..a7b4036 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -466,6 +466,7 @@ void consume_skb(struct sk_buff *skb)
 		smp_rmb();
 	else if (likely(!atomic_dec_and_test(&skb->users)))
 		return;
+	trace_consume_skb(skb);
 	__kfree_skb(skb);
 }
 EXPORT_SYMBOL(consume_skb);


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC PATCH v2 5/5] perf:add a script shows a process of packet
  2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
                   ` (8 preceding siblings ...)
  2010-06-25  0:59 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
@ 2010-06-25  0:59 ` Koki Sanagi
  9 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  0:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: kaneshige.kenji, izumi.taku

This perf script shows a time-chart of process of packet.
Patch 0/5 shows an output of this.
If you want to use it, install perf and record perf.data like following.

#perf trace record netdev-times [script]

If you set script, perf gathers records until it ends.
If not, you must Ctrl-C to stop recording.

And if you want a report from record,

#perf trace report netdev-times [options]

If you use some options, you can limit an output.
Option is below.

tx: show only process of tx packets
rx: show only process of rx packets
dev=: show a process specified with this option

In the future, I want src/dst IP(v6) address filter option. 
It's now under consideration/construction.

For example, if you want to show a process of received packets associated
with eth3,

#perf trace report netdev-times rx dev=eth3
79074.756672832sec cpu=1
irq_entry(+0.000000msec,irq=77:eth3)
         |------------softirq_raise(+0.001277msec)
irq_exit (+0.002278msec)     |
                             |
                      softirq_entry(+0.003562msec)
                             |
                             |---netif_receive_skb(+0.006279msec,len=100)
                             |            |
                             |   skb_copy_datagram_iovec(+0.038778msec, 2285:sshd)
                             |
                      napi_poll_exit(+0.017160msec, eth3)
                             |
                      softirq_exit(+0.018248msec)


This perf script helps us to analyze a process time of transmit/receive
sequence.

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 tools/perf/scripts/python/bin/netdev-times-record |    7 +
 tools/perf/scripts/python/bin/netdev-times-report |    5 +
 tools/perf/scripts/python/netdev-times.py         |  495 +++++++++++++++++++++
 3 files changed, 507 insertions(+), 0 deletions(-)

diff --git a/tools/perf/scripts/python/bin/netdev-times-record b/tools/perf/scripts/python/bin/netdev-times-record
new file mode 100644
index 0000000..1dfa8d5
--- /dev/null
+++ b/tools/perf/scripts/python/bin/netdev-times-record
@@ -0,0 +1,7 @@
+#!/bin/bash
+perf record -c 1 -f -R -a -e net:net_dev_xmit -e net:net_dev_queue	\
+		-e net:net_dev_receive -e skb:consume_skb		\
+		-e skb:dev_kfree_skb_irq -e napi:napi_poll		\
+		-e irq:irq_handler_entry -e irq:irq_handler_exit	\
+		-e irq:softirq_entry -e irq:softirq_exit		\
+		-e irq:softirq_raise -e skb:skb_copy_datagram_iovec $@
diff --git a/tools/perf/scripts/python/bin/netdev-times-report b/tools/perf/scripts/python/bin/netdev-times-report
new file mode 100644
index 0000000..ecc8122
--- /dev/null
+++ b/tools/perf/scripts/python/bin/netdev-times-report
@@ -0,0 +1,5 @@
+#!/bin/bash
+# description: displayi a process of packet and processing time
+# args: tx rx dev src dst
+
+perf trace -s ~/libexec/perf-core/scripts/python/netdev-times.py $@
diff --git a/tools/perf/scripts/python/netdev-times.py b/tools/perf/scripts/python/netdev-times.py
new file mode 100644
index 0000000..5e68be4
--- /dev/null
+++ b/tools/perf/scripts/python/netdev-times.py
@@ -0,0 +1,495 @@
+# Display process of packets and processed time.
+# It helps you to investigate networking.
+
+import os
+import sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+	'/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+all_event_list = []; # insert all tracepoint event related with this script
+irq_dic = {}; # key is cpu and value is a list which stacks irqs
+              # which raise NET_RX softirq
+net_rx_dic = {}; # key is cpu and value include time of NET_RX softirq-entry
+		 # and a list which stacks receive
+receive_hunk_list = []; # a list which include a sequence of receive events
+receive_skb_list = []; # received packet list for matching
+		       # skb_copy_datagram_iovec
+
+queue_list = []; # list of packets which pass through dev_queue_xmit
+xmit_list = [];  # list of packets which pass through dev_hard_start_xmit
+free_list = [];  # list of packets which is freed
+
+show_tx = 0;
+show_rx = 0;
+dev = 0; # store a name of device specified by option "dev="
+
+# Calculate a time interval(msec) from src(nsec) to dst(nsec)
+def diff_msec(src, dst):
+	return (dst - src) / 1000000.0
+
+# Display a process of transmitting a packet
+def print_transmit(hunk):
+	if dev != 0 and hunk['dev'].find(dev) < 0:
+		return
+	print "%7s %5d %6d.%09dsec %12.6fmsec      %12.6fmsec" % \
+		(hunk['dev'], hunk['len'],
+		nsecs_secs(hunk['queue_t']),
+		nsecs_nsecs(hunk['queue_t']),
+		diff_msec(hunk['queue_t'], hunk['xmit_t']),
+		diff_msec(hunk['xmit_t'], hunk['free_t']))
+
+# Display a process of received packets and interrputs associated with
+# a NET_RX softirq
+def print_receive(hunk):
+	show_hunk = 0
+	if 'irq_list' not in hunk.keys() \
+	or len(hunk['irq_list']) == 0:
+		return
+	irq_list = hunk['irq_list']
+	cpu = irq_list[0]['cpu']
+	base_t = irq_list[0]['irq_ent_t']
+	# check if this hunk should be showed
+	if dev != 0:
+		for i in range(len(irq_list)):
+			if irq_list[i]['name'].find(dev) >= 0:
+				show_hunk = 1
+				break
+	else:
+		show_hunk = 1
+	if show_hunk == 0:
+		return
+
+	print "%d.%09dsec cpu=%d" % \
+		(nsecs_secs(base_t), nsecs_nsecs(base_t), cpu)
+	for i in range(len(irq_list)):
+		print "irq_entry(+%fmsec,irq=%d:%s)" % \
+			(diff_msec(base_t, irq_list[i]['irq_ent_t']),
+			irq_list[i]['irq'], irq_list[i]['name'])
+
+		if 'sirq_raise_t' in irq_list[i].keys():
+			print "         |------------" \
+			      "softirq_raise(+%fmsec)" % \
+				diff_msec(base_t, irq_list[i]['sirq_raise_t'])
+
+		if 'irq_ext_t' in irq_list[i].keys():
+			print "irq_exit (+%fmsec)     |" % \
+				diff_msec(base_t, irq_list[i]['irq_ext_t'])
+
+		print "                             |"
+
+	if 'sirq_ent_t' not in hunk.keys():
+		print 'maybe softirq_entry is dropped'
+		return
+	print "                      " \
+		"softirq_entry(+%fmsec)\n" \
+		"                      " \
+		"       |" % \
+		diff_msec(base_t, hunk['sirq_ent_t'])
+	event_list = hunk['event_list']
+	for i in range(len(event_list)):
+		event = event_list[i]
+		if event['event_name'] == 'napi_poll':
+			print "                      " \
+			      "napi_poll_exit(+%fmsec, %s)" % \
+			(diff_msec(base_t, event['event_t']), event['dev'])
+			print "                      " \
+			      "       |"
+		elif 'comm' in event.keys():
+			print "                      " \
+				"       |---netif_receive_skb" \
+				"(+%fmsec,len=%d)\n" \
+				"                      " \
+				"       |            |\n" \
+				"                      " \
+				"       |   skb_copy_datagram_iovec" \
+				"(+%fmsec, %d:%s)\n" \
+				"                      " \
+				"       |" % \
+			(diff_msec(base_t, event['event_t']),
+			event['len'],
+			diff_msec(base_t, event['comm_t']),
+			event['pid'], event['comm'])
+		else:
+			print "                      " \
+				"       |---netif_receive_skb" \
+				"(+%fmsec,len=%d)\n" \
+				"                      " \
+				"       |" % \
+				(diff_msec(base_t, event['event_t']),
+					event['len'])
+
+	print "                      " \
+	      "softirq_exit(+%fmsec)\n" % \
+		 diff_msec(base_t, hunk['sirq_ext_t'])
+
+def trace_begin():
+	global show_tx
+	global show_rx
+	global dev
+
+	for i in range(len(sys.argv)):
+		if i == 0:
+			continue
+		arg = sys.argv[i]
+		if arg == 'tx':
+			show_tx = 1
+		elif arg =='rx':
+			show_rx = 1
+		elif arg.find('dev=',0, 4) >= 0:
+			dev = arg[4:]
+	if show_tx == 0  and show_rx == 0:
+		show_tx = 1
+		show_rx = 1
+
+def trace_end():
+	global show_tx
+	global show_rx
+	# order all events in time
+	all_event_list.sort(lambda a,b :cmp(a['time'], b['time']))
+	# process all events
+	for i in range(len(all_event_list)):
+		event = all_event_list[i]
+		event_name = event['event_name']
+		if event_name == 'irq__softirq_exit':
+			handle_irq_softirq_exit(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['vec'])
+		elif event_name == 'irq__softirq_entry':
+			handle_irq_softirq_entry(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'],event['vec'])
+		elif event_name == 'irq__softirq_raise':
+			handle_irq_softirq_raise(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['vec'])
+		elif event_name == 'irq__irq_handler_entry':
+			handle_irq_handler_entry(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['irq'], event['name'])
+		elif event_name == 'irq__irq_handler_exit':
+			handle_irq_handler_exit(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['irq'], event['ret'])
+		elif event_name == 'napi__napi_poll':
+			handle_napi_poll(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['napi'],
+				event['dev_name'])
+		elif event_name == 'net__net_dev_receive':
+			handle_net_dev_receive(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'], event['name'])
+		elif event_name == 'skb__skb_copy_datagram_iovec':
+			handle_skb_copy_datagram_iovec(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'])
+		elif event_name == 'net__net_dev_queue':
+			handle_net_dev_queue(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'], event['name'])
+		elif event_name == 'net__net_dev_xmit':
+			handle_net_dev_xmit(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'],
+				event['skblen'], event['rc'], event['name'])
+		elif event_name == 'skb__dev_kfree_skb_irq':
+			handle_dev_kfree_skb_irq(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'])
+		elif event_name == 'skb__consume_skb':
+			handle_consume_skb(event['event_name'],
+				event['context'], event['common_cpu'],
+				event['common_pid'], event['common_comm'],
+				event['time'], event['skbaddr'])
+	# display receive hunks
+	if show_rx == 1:
+		for i in range(len(receive_hunk_list)):
+			print_receive(receive_hunk_list[i])
+	# display transmit hunks
+	if show_tx == 1:
+		print "   dev    len      dev_queue_xmit|----------|" \
+			"dev_hard_start_xmit|-----|free_skb"
+		print "                         |             |" \
+			"                           |"
+		for i in range(len(free_list)):
+			print_transmit(free_list[i])
+
+def irq__softirq_exit(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	vec):
+	if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
+		return
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'vec':vec}
+	all_event_list.append(event_data)
+
+def handle_irq_softirq_exit(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	vec):
+	rec_data = {'sirq_ext_t':time}
+	if common_cpu in irq_dic.keys():
+		rec_data.update({'irq_list':irq_dic[common_cpu]})
+		del irq_dic[common_cpu]
+	if common_cpu in net_rx_dic.keys():
+		rec_data.update({
+		    'event_list':net_rx_dic[common_cpu]['event_list'],
+		    'sirq_ent_t':net_rx_dic[common_cpu]['sirq_ent_t']})
+		del net_rx_dic[common_cpu]
+	# merge information realted to a NET_RX softirq
+	receive_hunk_list.append(rec_data)
+
+def irq__softirq_entry(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	vec):
+	if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
+		return
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'vec':vec}
+	all_event_list.append(event_data)
+
+def handle_irq_softirq_entry(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	vec):
+		net_rx_dic[common_cpu] = {'event_list':[],
+					  'sirq_ent_t':time}
+
+def irq__softirq_raise(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	vec):
+	if symbol_str("irq__softirq_entry", "vec", vec) != "NET_RX":
+		return
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'vec':vec}
+	all_event_list.append(event_data)
+
+def handle_irq_softirq_raise(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	vec):
+	if common_cpu not in irq_dic.keys() \
+	or len(irq_dic[common_cpu]) == 0:
+		return
+	irq = irq_dic[common_cpu].pop()
+	# put a time to prev irq on the same cpu
+	irq.update({'sirq_raise_t':time})
+	irq_dic[common_cpu].append(irq)
+
+def irq__irq_handler_entry(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	irq, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'irq':irq, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_irq_handler_entry(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	irq, name):
+	if common_cpu not in irq_dic.keys():
+		irq_dic[common_cpu] = []
+	irq_record = {'irq':irq,
+		      'name':name,
+		      'cpu':common_cpu,
+		      'irq_ent_t':time}
+	irq_dic[common_cpu].append(irq_record)
+
+def irq__irq_handler_exit(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	irq, ret):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'irq':irq, 'ret':ret}
+	all_event_list.append(event_data)
+
+def handle_irq_handler_exit(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	irq, ret):
+	if common_cpu not in irq_dic.keys():
+		return
+	irq_record = irq_dic[common_cpu].pop()
+	irq_record.update({'irq_ext_t':time})
+	# if an irq doesn't include NET_RX softirq, drop.
+	if 'sirq_raise_t' in irq_record.keys():
+		irq_dic[common_cpu].append(irq_record)
+
+def napi__napi_poll(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	napi, dev_name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'napi':napi, 'dev_name':dev_name}
+	all_event_list.append(event_data)
+
+def handle_napi_poll(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	napi, dev_name):
+	if common_cpu in net_rx_dic.keys():
+		event_list = net_rx_dic[common_cpu]['event_list']
+		rec_data = {'event_name':'napi_poll',
+			    'dev':dev_name,
+			    'event_t':time}
+		event_list.append(rec_data)
+
+def net__net_dev_receive(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr,skblen, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_net_dev_receive(event_name, context, common_cpu,
+	ccommon_pid, common_comm, time,
+	skbaddr, skblen, name):
+	if common_cpu in net_rx_dic.keys():
+		rec_data = {'event_name':'netif_receive_skb',
+			    'event_t':time,
+			    'skbaddr':skbaddr,
+			    'len':skblen}
+		event_list = net_rx_dic[common_cpu]['event_list']
+		event_list.append(rec_data)
+		receive_skb_list.insert(0, rec_data)
+
+def skb__skb_copy_datagram_iovec(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr, skblen):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen}
+	all_event_list.append(event_data)
+
+def handle_skb_copy_datagram_iovec(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr, skblen):
+	for i in range(len(receive_skb_list)):
+		rec_data = receive_skb_list[i]
+		if skbaddr == rec_data['skbaddr'] and \
+			'comm' not in rec_data.keys():
+			rec_data.update({'comm':common_comm,
+					 'pid':common_pid,
+					 'comm_t':time})
+			del receive_skb_list[i]
+			break
+
+def net__net_dev_queue(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr, skblen, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_net_dev_queue(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr, skblen, name):
+	skb = {'dev':name,
+	       'skbaddr':skbaddr,
+	       'len':skblen,
+	       'queue_t':time}
+	xmit_list.insert(0, skb)
+
+def net__net_dev_xmit(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr, skblen, rc, name):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr, 'skblen':skblen, 'rc':rc, 'name':name}
+	all_event_list.append(event_data)
+
+def handle_net_dev_xmit(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr, skblen, rc, name):
+	if rc == 0: # NETDEV_TX_OK
+		for i in range(len(xmit_list)):
+			skb = xmit_list[i]
+			if skb['skbaddr'] == skbaddr:
+				skb['xmit_t'] = time
+				queue_list.insert(0, skb)
+				del xmit_list[i]
+				break
+
+def free_skb(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr):
+	for i in range(len(queue_list)):
+		skb = queue_list[i]
+		if skb['skbaddr'] ==skbaddr:
+			skb['free_t'] = time
+			free_list.append(skb)
+			del queue_list[i]
+			break
+
+def skb__dev_kfree_skb_irq(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr}
+	all_event_list.append(event_data)
+
+def handle_dev_kfree_skb_irq(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr):
+	free_skb(event_name, context, common_cpu,
+		common_pid, common_comm, time,
+		skbaddr)
+
+def skb__consume_skb(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	skbaddr):
+	event_data = {'event_name':event_name, 'context':context,
+		'common_cpu':common_cpu, 'common_pid':common_pid,
+		'common_comm':common_comm,'time':nsecs(common_secs,
+							common_nsecs),
+		'skbaddr':skbaddr}
+	all_event_list.append(event_data)
+
+def handle_consume_skb(event_name, context, common_cpu,
+	common_pid, common_comm, time,
+	skbaddr):
+	free_skb(event_name, context, common_cpu,
+		common_pid, common_comm, time,
+		skbaddr)


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
  2010-06-25  0:57 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
@ 2010-06-25  1:10   ` KOSAKI Motohiro
  2010-06-25  7:46     ` Frederic Weisbecker
  2010-06-25  7:51     ` Koki Sanagi
  0 siblings, 2 replies; 26+ messages in thread
From: KOSAKI Motohiro @ 2010-06-25  1:10 UTC (permalink / raw)
  To: Koki Sanagi
  Cc: kosaki.motohiro, linux-kernel, kaneshige.kenji, izumi.taku,
	Lai Jiangshan

(cc to Lai)

> This patch adds a tracepoint to raising of softirq.
> This is useful if you want to detect which hard interrupt raise softirq
> and lets you know a time between raising softirq and performing softirq.
> Combinating with other tracepoint, it lets us know a process of packets
> (See patch 0/5).
> 
>           <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>           <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>           <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
> 
> This is a same patch Lai Jiangshan submitted.
> http://marc.info/?l=linux-kernel&m=126026122728732&w=2

If so, you shouldn't remove author's info. please fix the description
as following.

-------------------------------------------------------------------------
From: Lai Jiangshan <laijs@cn.fujitsu.com>

This patch adds a tracepoint to raising of softirq.
This is useful if you want to detect which hard interrupt raise softirq
and lets you know a time between raising softirq and performing softirq.
Combinating with other tracepoint, it lets us know a process of packets
(See patch 0/5).

          <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
          <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
          <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]

This is a same below.
http://marc.info/?l=linux-kernel&m=126026122728732&w=2

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
--------------------------------------------------------------------------------



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT
  2010-06-25  0:58 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
@ 2010-06-25  1:11   ` KOSAKI Motohiro
  2010-06-25  7:57   ` Frederic Weisbecker
  1 sibling, 0 replies; 26+ messages in thread
From: KOSAKI Motohiro @ 2010-06-25  1:11 UTC (permalink / raw)
  To: Koki Sanagi; +Cc: kosaki.motohiro, linux-kernel, kaneshige.kenji, izumi.taku

> This patch converts trace_napi_poll from DECLARE_EVENT to TRACE_EVENT.
> This lets you know how long napi_poll takes.
> 
>           <idle>-0     [001] 241302.750777: napi_poll: napi poll on napi struct f6acc480 for device eth3
>           <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d70c for device eth1
>           <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d20c for device eth1
> 
> This is a same patch Neil Horman submitted.
> http://marc.info/?l=linux-kernel&m=125978157926853&w=2

Ditto.
Please don't remove Neil's S-O-B.




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-25  0:59 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
@ 2010-06-25  4:55   ` Eric Dumazet
  2010-06-25  7:12     ` Koki Sanagi
  2010-06-28 18:12   ` Steven Rostedt
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2010-06-25  4:55 UTC (permalink / raw)
  To: Koki Sanagi; +Cc: linux-kernel, kaneshige.kenji, izumi.taku

Le vendredi 25 juin 2010 à 09:59 +0900, Koki Sanagi a écrit :
> This patch adds tracepoint to consume_skb and dev_kfree_skb_irq.
> Combinating with tracepoint on dev_hard_start_xmit, we can check how long it
> takes to free transmited packets. And using it, we can calculate how many
> packets driver had at that time. It is useful when a drop of transmited packet
> is a problem.
> 
>           <idle>-0     [001] 241409.218333: consume_skb: skbaddr=dd6b2fb8
>           <idle>-0     [001] 241409.490555: dev_kfree_skb_irq: skbaddr=f5e29840
> 
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> ---
> include/trace/events/skb.h |   36 ++++++++++++++++++++++++++++++++++++
>  net/core/dev.c             |    2 ++
>  net/core/skbuff.c          |    1 +
>  3 files changed, 39 insertions(+), 0 deletions(-)
> 
> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> index 4b2be6d..6ab5b34 100644
> --- a/include/trace/events/skb.h
> +++ b/include/trace/events/skb.h
> @@ -35,6 +35,42 @@ TRACE_EVENT(kfree_skb,
>  		__entry->skbaddr, __entry->protocol, __entry->location)
>  );
>  
> +TRACE_EVENT(consume_skb,
> +
> +	TP_PROTO(struct sk_buff *skb),
> +
> +	TP_ARGS(skb),
> +
> +	TP_STRUCT__entry(
> +		__field(	void *,	skbaddr	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->skbaddr = skb;
> +	),
> +
> +	TP_printk("skbaddr=%p",
> +		__entry->skbaddr)
> +);
> +
> +TRACE_EVENT(dev_kfree_skb_irq,
> +
> +	TP_PROTO(struct sk_buff *skb),
> +
> +	TP_ARGS(skb),
> +
> +	TP_STRUCT__entry(
> +		__field(	void *,	skbaddr	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->skbaddr = skb;
> +	),
> +
> +	TP_printk("skbaddr=%p",
> +		__entry->skbaddr)
> +);
> +
>  TRACE_EVENT(skb_copy_datagram_iovec,
>  
>  	TP_PROTO(const struct sk_buff *skb, int len),
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 4b64b21..807b1ca 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -131,6 +131,7 @@
>  #include <linux/random.h>
>  #include <trace/events/napi.h>
>  #include <trace/events/net.h>
> +#include <trace/events/skb.h>
>  #include <linux/pci.h>
>  
>  #include "net-sysfs.h"
> @@ -1580,6 +1581,7 @@ void dev_kfree_skb_irq(struct sk_buff *skb)
>  		struct softnet_data *sd;
>  		unsigned long flags;
>  
> +		trace_dev_kfree_skb_irq(skb);
>  		local_irq_save(flags);
>  		sd = &__get_cpu_var(softnet_data);
>  		skb->next = sd->completion_queue;
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 34432b4..a7b4036 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -466,6 +466,7 @@ void consume_skb(struct sk_buff *skb)
>  		smp_rmb();
>  	else if (likely(!atomic_dec_and_test(&skb->users)))
>  		return;
> +	trace_consume_skb(skb);
>  	__kfree_skb(skb);
>  }
>  EXPORT_SYMBOL(consume_skb);
> 


You might add a trace point to skb_free_datagram_locked() too, since it
contains an inlined consume_skb()




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-25  4:55   ` Eric Dumazet
@ 2010-06-25  7:12     ` Koki Sanagi
  2010-06-25 20:05       ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  7:12 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, kaneshige.kenji, izumi.taku

(2010/06/25 13:55), Eric Dumazet wrote:
> Le vendredi 25 juin 2010 à 09:59 +0900, Koki Sanagi a écrit :
>> This patch adds tracepoint to consume_skb and dev_kfree_skb_irq.
>> Combinating with tracepoint on dev_hard_start_xmit, we can check how long it
>> takes to free transmited packets. And using it, we can calculate how many
>> packets driver had at that time. It is useful when a drop of transmited packet
>> is a problem.
>>
>>           <idle>-0     [001] 241409.218333: consume_skb: skbaddr=dd6b2fb8
>>           <idle>-0     [001] 241409.490555: dev_kfree_skb_irq: skbaddr=f5e29840
>>
>> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
>> ---
>> include/trace/events/skb.h |   36 ++++++++++++++++++++++++++++++++++++
>>  net/core/dev.c             |    2 ++
>>  net/core/skbuff.c          |    1 +
>>  3 files changed, 39 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
>> index 4b2be6d..6ab5b34 100644
>> --- a/include/trace/events/skb.h
>> +++ b/include/trace/events/skb.h
>> @@ -35,6 +35,42 @@ TRACE_EVENT(kfree_skb,
>>  		__entry->skbaddr, __entry->protocol, __entry->location)
>>  );
>>  
>> +TRACE_EVENT(consume_skb,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	void *,	skbaddr	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +	),
>> +
>> +	TP_printk("skbaddr=%p",
>> +		__entry->skbaddr)
>> +);
>> +
>> +TRACE_EVENT(dev_kfree_skb_irq,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	void *,	skbaddr	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +	),
>> +
>> +	TP_printk("skbaddr=%p",
>> +		__entry->skbaddr)
>> +);
>> +
>>  TRACE_EVENT(skb_copy_datagram_iovec,
>>  
>>  	TP_PROTO(const struct sk_buff *skb, int len),
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 4b64b21..807b1ca 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -131,6 +131,7 @@
>>  #include <linux/random.h>
>>  #include <trace/events/napi.h>
>>  #include <trace/events/net.h>
>> +#include <trace/events/skb.h>
>>  #include <linux/pci.h>
>>  
>>  #include "net-sysfs.h"
>> @@ -1580,6 +1581,7 @@ void dev_kfree_skb_irq(struct sk_buff *skb)
>>  		struct softnet_data *sd;
>>  		unsigned long flags;
>>  
>> +		trace_dev_kfree_skb_irq(skb);
>>  		local_irq_save(flags);
>>  		sd = &__get_cpu_var(softnet_data);
>>  		skb->next = sd->completion_queue;
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 34432b4..a7b4036 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -466,6 +466,7 @@ void consume_skb(struct sk_buff *skb)
>>  		smp_rmb();
>>  	else if (likely(!atomic_dec_and_test(&skb->users)))
>>  		return;
>> +	trace_consume_skb(skb);
>>  	__kfree_skb(skb);
>>  }
>>  EXPORT_SYMBOL(consume_skb);
>>
> 
> 
> You might add a trace point to skb_free_datagram_locked() too, since it
> contains an inlined consume_skb()
> 

I think it is contrary.
skb_free_datagram_locked() contains consume_skb(), so tracepoint isn't needed.
Because skb_free_datagram_locked() can be traced by trace_consume_skb().




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
  2010-06-25  1:10   ` KOSAKI Motohiro
@ 2010-06-25  7:46     ` Frederic Weisbecker
  2010-06-25  7:51     ` Koki Sanagi
  1 sibling, 0 replies; 26+ messages in thread
From: Frederic Weisbecker @ 2010-06-25  7:46 UTC (permalink / raw)
  To: KOSAKI Motohiro, Koki Sanagi
  Cc: linux-kernel, kaneshige.kenji, izumi.taku, Lai Jiangshan

2010/6/25 KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>:
> (cc to Lai)
>
>> This patch adds a tracepoint to raising of softirq.
>> This is useful if you want to detect which hard interrupt raise softirq
>> and lets you know a time between raising softirq and performing softirq.
>> Combinating with other tracepoint, it lets us know a process of packets
>> (See patch 0/5).
>>
>>           <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>>           <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
>>
>> This is a same patch Lai Jiangshan submitted.
>> http://marc.info/?l=linux-kernel&m=126026122728732&w=2
>
> If so, you shouldn't remove author's info. please fix the description
> as following.
>
> -------------------------------------------------------------------------
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>
> This patch adds a tracepoint to raising of softirq.
> This is useful if you want to detect which hard interrupt raise softirq
> and lets you know a time between raising softirq and performing softirq.
> Combinating with other tracepoint, it lets us know a process of packets
> (See patch 0/5).
>
>          <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>          <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>          <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
>
> This is a same below.
> http://marc.info/?l=linux-kernel&m=126026122728732&w=2
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Acked-by: Frederic Weisbecker <fweisbec@gmail.com>


Right. And if you do some changes in the original patch, you can just
change the description, or add a simple line before your Signed-off-by, that
says:

[ factorize softirq events with DECLARE_EVENT_CLASS ]

Because indeed, as Steve said, the softirq_raise tracepoint should
belong to the softirq event class as it perfectly matches it. This
helps the kernel
saving some space.

Thanks.


> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> --------------------------------------------------------------------------------
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
  2010-06-25  1:10   ` KOSAKI Motohiro
  2010-06-25  7:46     ` Frederic Weisbecker
@ 2010-06-25  7:51     ` Koki Sanagi
  1 sibling, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-25  7:51 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-kernel, kaneshige.kenji, izumi.taku, Lai Jiangshan

Thanks Kosaki,

I'll fix a description.

(2010/06/25 10:10), KOSAKI Motohiro wrote:
> (cc to Lai)
> 
>> This patch adds a tracepoint to raising of softirq.
>> This is useful if you want to detect which hard interrupt raise softirq
>> and lets you know a time between raising softirq and performing softirq.
>> Combinating with other tracepoint, it lets us know a process of packets
>> (See patch 0/5).
>>
>>           <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>>           <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
>>
>> This is a same patch Lai Jiangshan submitted.
>> http://marc.info/?l=linux-kernel&m=126026122728732&w=2
> 
> If so, you shouldn't remove author's info. please fix the description
> as following.
> 
> -------------------------------------------------------------------------
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> 
> This patch adds a tracepoint to raising of softirq.
> This is useful if you want to detect which hard interrupt raise softirq
> and lets you know a time between raising softirq and performing softirq.
> Combinating with other tracepoint, it lets us know a process of packets
> (See patch 0/5).
> 
>           <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>           <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>           <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
> 
> This is a same below.
> http://marc.info/?l=linux-kernel&m=126026122728732&w=2
> 
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> --------------------------------------------------------------------------------
> 
> 
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT
  2010-06-25  0:58 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
  2010-06-25  1:11   ` KOSAKI Motohiro
@ 2010-06-25  7:57   ` Frederic Weisbecker
  1 sibling, 0 replies; 26+ messages in thread
From: Frederic Weisbecker @ 2010-06-25  7:57 UTC (permalink / raw)
  To: Koki Sanagi; +Cc: linux-kernel, kaneshige.kenji, izumi.taku, KOSAKI Motohiro

2010/6/25 Koki Sanagi <sanagi.koki@jp.fujitsu.com>:
> This patch converts trace_napi_poll from DECLARE_EVENT to TRACE_EVENT.


>From DECLARE_TRACE to TRACE_EVENT.

Or more generally: this expands the napi polling tracepoint to a trace
event, to improve the
napi polling tracepoint usability.


> This lets you know how long napi_poll takes.
>
>          <idle>-0     [001] 241302.750777: napi_poll: napi poll on napi struct f6acc480 for device eth3
>          <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d70c for device eth1
>          <idle>-0     [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d20c for device eth1
>
> This is a same patch Neil Horman submitted.
> http://marc.info/?l=linux-kernel&m=125978157926853&w=2

Hmm, nope, doesn't seem to be the same patch.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-25  7:12     ` Koki Sanagi
@ 2010-06-25 20:05       ` Eric Dumazet
  2010-06-28  0:25         ` Koki Sanagi
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2010-06-25 20:05 UTC (permalink / raw)
  To: Koki Sanagi; +Cc: linux-kernel, kaneshige.kenji, izumi.taku, netdev

Le vendredi 25 juin 2010 à 16:12 +0900, Koki Sanagi a écrit :

> > You might add a trace point to skb_free_datagram_locked() too, since it
> > contains an inlined consume_skb()
> > 
> 
> I think it is contrary.

I think you are _very_ wrong.

> skb_free_datagram_locked() contains consume_skb(), so tracepoint isn't needed.
> Because skb_free_datagram_locked() can be traced by trace_consume_skb().
> 
> 
> 

Koki, it would be good if you worked on net-next-2.6, so that my comment
applies.

Also, not sending this kind of patches on netdev is not going to help
very much.

Who is supposed to review them on lkml and Ack them ? 

Nobody.

Please build your future network related patches against net-next-2.6,
and send them to nedev and David Miller, the official network
maintainer, as stated in MAINTAINERS file.

NETWORKING [GENERAL]
M:      "David S. Miller" <davem@davemloft.net>
L:      netdev@vger.kernel.org
W:      http://www.linuxfoundation.org/en/Net
W:      http://patchwork.ozlabs.org/project/netdev/list/
T:      git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.git
T:      git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6.git
S:      Maintained
F:      net/
F:      include/net/
F:      include/linux/in.h
F:      include/linux/net.h
F:      include/linux/netdevice.h



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-25 20:05       ` Eric Dumazet
@ 2010-06-28  0:25         ` Koki Sanagi
  0 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-28  0:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, kaneshige.kenji, izumi.taku, netdev

(2010/06/26 5:05), Eric Dumazet wrote:
> Le vendredi 25 juin 2010 à 16:12 +0900, Koki Sanagi a écrit :
> 
>>> You might add a trace point to skb_free_datagram_locked() too, since it
>>> contains an inlined consume_skb()
>>>
>>
>> I think it is contrary.
> 
> I think you are _very_ wrong.
> 
>> skb_free_datagram_locked() contains consume_skb(), so tracepoint isn't needed.
>> Because skb_free_datagram_locked() can be traced by trace_consume_skb().
>>
>>
>>
> 
> Koki, it would be good if you worked on net-next-2.6, so that my comment
> applies.

Yes, I am very wrong... I saw skb_free_datagram_locked() at linux-2.6.34.
It would be good to add tracepoint to it.
 
> Also, not sending this kind of patches on netdev is not going to help
> very much.
> 
> Who is supposed to review them on lkml and Ack them ? 
> 
> Nobody.
> 
> Please build your future network related patches against net-next-2.6,
> and send them to nedev and David Miller, the official network
> maintainer, as stated in MAINTAINERS file.
> 
> NETWORKING [GENERAL]
> M:      "David S. Miller" <davem@davemloft.net>
> L:      netdev@vger.kernel.org
> W:      http://www.linuxfoundation.org/en/Net
> W:      http://patchwork.ozlabs.org/project/netdev/list/
> T:      git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.git
> T:      git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6.git
> S:      Maintained
> F:      net/
> F:      include/net/
> F:      include/linux/in.h
> F:      include/linux/net.h
> F:      include/linux/netdevice.h

Yes, my submitting this time has many improper things(address, format and description...).
I'll check and modify them next time.

Thanks,
Koki Sanagi.
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer
  2010-06-25  0:58 ` [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer Koki Sanagi
@ 2010-06-28 18:09   ` Steven Rostedt
  2010-06-29  0:27     ` Koki Sanagi
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2010-06-28 18:09 UTC (permalink / raw)
  To: Koki Sanagi; +Cc: linux-kernel, kaneshige.kenji, izumi.taku

On Fri, 2010-06-25 at 09:58 +0900, Koki Sanagi wrote:
> This patch adds tracepoint to dev_queue_xmit, dev_hard_start_xmit and
> netif_receive_skb. These tracepints help you to monitor network driver's
> input/output.
> 
>             sshd-4445  [001] 241367.066046: net_dev_queue: dev=eth3 skbaddr=dd6b2538 len=114
>             sshd-4445  [001] 241367.066047: net_dev_xmit: dev=eth3 skbaddr=dd6b2538 len=114 rc=0
>           <idle>-0     [001] 241367.067472: net_dev_receive: dev=eth3 skbaddr=f5e59000 len=52
> 
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> ---
>  include/trace/events/net.h |   83 ++++++++++++++++++++++++++++++++++++++++++++
>  net/core/dev.c             |    5 +++
>  net/core/net-traces.c      |    1 +
>  3 files changed, 89 insertions(+), 0 deletions(-)
> 
> diff --git a/include/trace/events/net.h b/include/trace/events/net.h
> new file mode 100644
> index 0000000..ee10970
> --- /dev/null
> +++ b/include/trace/events/net.h
> @@ -0,0 +1,83 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM net
> +
> +#if !defined(_TRACE_NET_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_NET_H
> +
> +#include <linux/skbuff.h>
> +#include <linux/netdevice.h>
> +#include <linux/ip.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(net_dev_xmit,
> +
> +	TP_PROTO(struct sk_buff *skb,
> +		 int rc),
> +
> +	TP_ARGS(skb, rc),
> +
> +	TP_STRUCT__entry(
> +		__field(	void *,		skbaddr		)
> +		__field(	unsigned int,	len		)
> +		__field(	int,		rc		)
> +		__string(	name,		skb->dev->name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->skbaddr = skb;
> +		__entry->len = skb->len;
> +		__entry->rc = rc;
> +		__assign_str(name, skb->dev->name);
> +	),
> +
> +	TP_printk("dev=%s skbaddr=%p len=%u rc=%d",
> +		__get_str(name), __entry->skbaddr, __entry->len, __entry->rc)
> +);
> +
> +TRACE_EVENT(net_dev_queue,
> +
> +	TP_PROTO(struct sk_buff *skb),
> +
> +	TP_ARGS(skb),
> +
> +	TP_STRUCT__entry(
> +		__field(	void *,		skbaddr		)
> +		__field(	unsigned int,	len		)
> +		__string(	name,		skb->dev->name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->skbaddr = skb;
> +		__entry->len = skb->len;
> +		__assign_str(name, skb->dev->name);
> +	),
> +
> +	TP_printk("dev=%s skbaddr=%p len=%u",
> +		__get_str(name), __entry->skbaddr, __entry->len)
> +);
> +
> +TRACE_EVENT(net_dev_receive,

This and net_dev_queue are identical. Please use DECLARE_EVENT_CLASS()
and DEFINE_EVENT() to create these, otherwise it bloats the kernel.

-- Steve

> +
> +	TP_PROTO(struct sk_buff *skb),
> +
> +	TP_ARGS(skb),
> +
> +	TP_STRUCT__entry(
> +		__field(	void *,		skbaddr		)
> +		__field(	unsigned int,	len		)
> +		__string(	name,		skb->dev->name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->skbaddr = skb;
> +		__entry->len = skb->len;
> +		__assign_str(name, skb->dev->name);
> +	),
> +
> +	TP_printk("dev=%s skbaddr=%p len=%u",
> +		__get_str(name), __entry->skbaddr, __entry->len)
> +);
> +#endif /* _TRACE_NET_H */
> +



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-25  0:59 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
  2010-06-25  4:55   ` Eric Dumazet
@ 2010-06-28 18:12   ` Steven Rostedt
  2010-06-29  0:28     ` Koki Sanagi
  1 sibling, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2010-06-28 18:12 UTC (permalink / raw)
  To: Koki Sanagi; +Cc: linux-kernel, kaneshige.kenji, izumi.taku

On Fri, 2010-06-25 at 09:59 +0900, Koki Sanagi wrote:
> This patch adds tracepoint to consume_skb and dev_kfree_skb_irq.
> Combinating with tracepoint on dev_hard_start_xmit, we can check how long it
> takes to free transmited packets. And using it, we can calculate how many
> packets driver had at that time. It is useful when a drop of transmited packet
> is a problem.
> 
>           <idle>-0     [001] 241409.218333: consume_skb: skbaddr=dd6b2fb8
>           <idle>-0     [001] 241409.490555: dev_kfree_skb_irq: skbaddr=f5e29840
> 
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> ---
> include/trace/events/skb.h |   36 ++++++++++++++++++++++++++++++++++++
>  net/core/dev.c             |    2 ++
>  net/core/skbuff.c          |    1 +
>  3 files changed, 39 insertions(+), 0 deletions(-)
> 
> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> index 4b2be6d..6ab5b34 100644
> --- a/include/trace/events/skb.h
> +++ b/include/trace/events/skb.h
> @@ -35,6 +35,42 @@ TRACE_EVENT(kfree_skb,
>  		__entry->skbaddr, __entry->protocol, __entry->location)
>  );
>  
> +TRACE_EVENT(consume_skb,
> +
> +	TP_PROTO(struct sk_buff *skb),
> +
> +	TP_ARGS(skb),
> +
> +	TP_STRUCT__entry(
> +		__field(	void *,	skbaddr	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->skbaddr = skb;
> +	),
> +
> +	TP_printk("skbaddr=%p",
> +		__entry->skbaddr)
> +);
> +
> +TRACE_EVENT(dev_kfree_skb_irq,
> +
> +	TP_PROTO(struct sk_buff *skb),
> +
> +	TP_ARGS(skb),
> +
> +	TP_STRUCT__entry(
> +		__field(	void *,	skbaddr	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->skbaddr = skb;
> +	),
> +
> +	TP_printk("skbaddr=%p",
> +		__entry->skbaddr)
> +);

These two tracepoints are also identical in body. Please use
DECLARE_EVENT_CLASS() and DEFINE_EVENT() instead.

-- Steve

> +
>  TRACE_EVENT(skb_copy_datagram_iovec,
>  
>  	TP_PROTO(const struct sk_buff *skb, int len),
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 4b64b21..807b1ca 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -131,6 +131,7 @@
>  #include <linux/random.h>
>  #include <trace/events/napi.h>
>  #include <trace/events/net.h>
> +#include <trace/events/skb.h>
>  #include <linux/pci.h>
>  
>  #include "net-sysfs.h"
> @@ -1580,6 +1581,7 @@ void dev_kfree_skb_irq(struct sk_buff *skb)
>  		struct softnet_data *sd;
>  		unsigned long flags;
>  
> +		trace_dev_kfree_skb_irq(skb);
>  		local_irq_save(flags);
>  		sd = &__get_cpu_var(softnet_data);
>  		skb->next = sd->completion_queue;
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 34432b4..a7b4036 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -466,6 +466,7 @@ void consume_skb(struct sk_buff *skb)
>  		smp_rmb();
>  	else if (likely(!atomic_dec_and_test(&skb->users)))
>  		return;
> +	trace_consume_skb(skb);
>  	__kfree_skb(skb);
>  }
>  EXPORT_SYMBOL(consume_skb);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer
  2010-06-28 18:09   ` Steven Rostedt
@ 2010-06-29  0:27     ` Koki Sanagi
  0 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-29  0:27 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, kaneshige.kenji, izumi.taku

(2010/06/29 3:09), Steven Rostedt wrote:
> On Fri, 2010-06-25 at 09:58 +0900, Koki Sanagi wrote:
>> This patch adds tracepoint to dev_queue_xmit, dev_hard_start_xmit and
>> netif_receive_skb. These tracepints help you to monitor network driver's
>> input/output.
>>
>>             sshd-4445  [001] 241367.066046: net_dev_queue: dev=eth3 skbaddr=dd6b2538 len=114
>>             sshd-4445  [001] 241367.066047: net_dev_xmit: dev=eth3 skbaddr=dd6b2538 len=114 rc=0
>>           <idle>-0     [001] 241367.067472: net_dev_receive: dev=eth3 skbaddr=f5e59000 len=52
>>
>> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
>> ---
>>  include/trace/events/net.h |   83 ++++++++++++++++++++++++++++++++++++++++++++
>>  net/core/dev.c             |    5 +++
>>  net/core/net-traces.c      |    1 +
>>  3 files changed, 89 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/trace/events/net.h b/include/trace/events/net.h
>> new file mode 100644
>> index 0000000..ee10970
>> --- /dev/null
>> +++ b/include/trace/events/net.h
>> @@ -0,0 +1,83 @@
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM net
>> +
>> +#if !defined(_TRACE_NET_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_NET_H
>> +
>> +#include <linux/skbuff.h>
>> +#include <linux/netdevice.h>
>> +#include <linux/ip.h>
>> +#include <linux/tracepoint.h>
>> +
>> +TRACE_EVENT(net_dev_xmit,
>> +
>> +	TP_PROTO(struct sk_buff *skb,
>> +		 int rc),
>> +
>> +	TP_ARGS(skb, rc),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	void *,		skbaddr		)
>> +		__field(	unsigned int,	len		)
>> +		__field(	int,		rc		)
>> +		__string(	name,		skb->dev->name	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +		__entry->len = skb->len;
>> +		__entry->rc = rc;
>> +		__assign_str(name, skb->dev->name);
>> +	),
>> +
>> +	TP_printk("dev=%s skbaddr=%p len=%u rc=%d",
>> +		__get_str(name), __entry->skbaddr, __entry->len, __entry->rc)
>> +);
>> +
>> +TRACE_EVENT(net_dev_queue,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	void *,		skbaddr		)
>> +		__field(	unsigned int,	len		)
>> +		__string(	name,		skb->dev->name	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +		__entry->len = skb->len;
>> +		__assign_str(name, skb->dev->name);
>> +	),
>> +
>> +	TP_printk("dev=%s skbaddr=%p len=%u",
>> +		__get_str(name), __entry->skbaddr, __entry->len)
>> +);
>> +
>> +TRACE_EVENT(net_dev_receive,
> 
> This and net_dev_queue are identical. Please use DECLARE_EVENT_CLASS()
> and DEFINE_EVENT() to create these, otherwise it bloats the kernel.
> 
> -- Steve

OK, I'll do that.

Thanks,
Koki Sanagi.
> 
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	void *,		skbaddr		)
>> +		__field(	unsigned int,	len		)
>> +		__string(	name,		skb->dev->name	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +		__entry->len = skb->len;
>> +		__assign_str(name, skb->dev->name);
>> +	),
>> +
>> +	TP_printk("dev=%s skbaddr=%p len=%u",
>> +		__get_str(name), __entry->skbaddr, __entry->len)
>> +);
>> +#endif /* _TRACE_NET_H */
>> +
> 
> 
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb
  2010-06-28 18:12   ` Steven Rostedt
@ 2010-06-29  0:28     ` Koki Sanagi
  0 siblings, 0 replies; 26+ messages in thread
From: Koki Sanagi @ 2010-06-29  0:28 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, kaneshige.kenji, izumi.taku

(2010/06/29 3:12), Steven Rostedt wrote:
> On Fri, 2010-06-25 at 09:59 +0900, Koki Sanagi wrote:
>> This patch adds tracepoint to consume_skb and dev_kfree_skb_irq.
>> Combinating with tracepoint on dev_hard_start_xmit, we can check how long it
>> takes to free transmited packets. And using it, we can calculate how many
>> packets driver had at that time. It is useful when a drop of transmited packet
>> is a problem.
>>
>>           <idle>-0     [001] 241409.218333: consume_skb: skbaddr=dd6b2fb8
>>           <idle>-0     [001] 241409.490555: dev_kfree_skb_irq: skbaddr=f5e29840
>>
>> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
>> ---
>> include/trace/events/skb.h |   36 ++++++++++++++++++++++++++++++++++++
>>  net/core/dev.c             |    2 ++
>>  net/core/skbuff.c          |    1 +
>>  3 files changed, 39 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
>> index 4b2be6d..6ab5b34 100644
>> --- a/include/trace/events/skb.h
>> +++ b/include/trace/events/skb.h
>> @@ -35,6 +35,42 @@ TRACE_EVENT(kfree_skb,
>>  		__entry->skbaddr, __entry->protocol, __entry->location)
>>  );
>>  
>> +TRACE_EVENT(consume_skb,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	void *,	skbaddr	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +	),
>> +
>> +	TP_printk("skbaddr=%p",
>> +		__entry->skbaddr)
>> +);
>> +
>> +TRACE_EVENT(dev_kfree_skb_irq,
>> +
>> +	TP_PROTO(struct sk_buff *skb),
>> +
>> +	TP_ARGS(skb),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	void *,	skbaddr	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->skbaddr = skb;
>> +	),
>> +
>> +	TP_printk("skbaddr=%p",
>> +		__entry->skbaddr)
>> +);
> 
> These two tracepoints are also identical in body. Please use
> DECLARE_EVENT_CLASS() and DEFINE_EVENT() instead.
> 
> -- Steve
> 

OK, I'll do that.

Thanks,
Koki Sanagi.

>> +
>>  TRACE_EVENT(skb_copy_datagram_iovec,
>>  
>>  	TP_PROTO(const struct sk_buff *skb, int len),
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 4b64b21..807b1ca 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -131,6 +131,7 @@
>>  #include <linux/random.h>
>>  #include <trace/events/napi.h>
>>  #include <trace/events/net.h>
>> +#include <trace/events/skb.h>
>>  #include <linux/pci.h>
>>  
>>  #include "net-sysfs.h"
>> @@ -1580,6 +1581,7 @@ void dev_kfree_skb_irq(struct sk_buff *skb)
>>  		struct softnet_data *sd;
>>  		unsigned long flags;
>>  
>> +		trace_dev_kfree_skb_irq(skb);
>>  		local_irq_save(flags);
>>  		sd = &__get_cpu_var(softnet_data);
>>  		skb->next = sd->completion_queue;
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 34432b4..a7b4036 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -466,6 +466,7 @@ void consume_skb(struct sk_buff *skb)
>>  		smp_rmb();
>>  	else if (likely(!atomic_dec_and_test(&skb->users)))
>>  		return;
>> +	trace_consume_skb(skb);
>>  	__kfree_skb(skb);
>>  }
>>  EXPORT_SYMBOL(consume_skb);
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2010-06-29  0:28 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-24  8:09 [RFC PATCH v2 0/5] netdev: show a process of packets Koki Sanagi
2010-06-24  8:16 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
2010-06-24 19:15   ` Steven Rostedt
2010-06-25  0:49     ` Koki Sanagi
2010-06-24  8:17 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
2010-06-24  8:18 ` [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer Koki Sanagi
2010-06-24  8:19 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
2010-06-24  8:21 ` [RFC PATCH v2 5/5] perf:add a script shows a process of packet Koki Sanagi
2010-06-25  0:57 ` [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise Koki Sanagi
2010-06-25  1:10   ` KOSAKI Motohiro
2010-06-25  7:46     ` Frederic Weisbecker
2010-06-25  7:51     ` Koki Sanagi
2010-06-25  0:58 ` [RFC PATCH v2 2/5] napi: convert trace_napi_poll to TRACE_EVENT Koki Sanagi
2010-06-25  1:11   ` KOSAKI Motohiro
2010-06-25  7:57   ` Frederic Weisbecker
2010-06-25  0:58 ` [RFC PATCH v2 3/5] netdev: add tracepoints to netdev layer Koki Sanagi
2010-06-28 18:09   ` Steven Rostedt
2010-06-29  0:27     ` Koki Sanagi
2010-06-25  0:59 ` [RFC PATCH v2 4/5] skb: add tracepoints to freeing skb Koki Sanagi
2010-06-25  4:55   ` Eric Dumazet
2010-06-25  7:12     ` Koki Sanagi
2010-06-25 20:05       ` Eric Dumazet
2010-06-28  0:25         ` Koki Sanagi
2010-06-28 18:12   ` Steven Rostedt
2010-06-29  0:28     ` Koki Sanagi
2010-06-25  0:59 ` [RFC PATCH v2 5/5] perf:add a script shows a process of packet Koki Sanagi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.