* [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations
@ 2009-08-07 20:21 Neil Horman
  2009-08-07 20:28 ` [PATCH 1/3] " Neil Horman
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-07 20:21 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt, nhorman

Hey all-
	I put out an RFC about this a while ago and didn't get any loud screams,
so I've gone ahead and implemented it.

	Currently, our network infrastructure allows net device drivers to
allocate skbs based on the numa node the device itself is local to.  This of
course cuts down on cross numa chatter when the device is DMA-ing network
traffic to the driver.  Unfortunately no such corresponding infrastructure
exists at the process level.  The scheduler has no insight into the numa
locality of incoming data packets for a given process (and arguably it
shouldn't), and so there is every chance that a process will run on a different
numa node than the one the packets it is receiving live on, creating cross numa
node traffic.

	This patch aims to provide userspace with the opportunity to optimize
that scheduling.  It consists of a tracepoint and an ftrace module which exports
a history of the packets each process receives, along with the numa node each
packet was received on, as well as the numa node the process was running on when
it copied the buffer to user space.  With this information, exported via the
ftrace infrastructure to user space, a sysadmin can identify high priority
processes, and optimize their scheduling so that they are more likely to run on
the same node that they are primarily receiving data on, thereby cutting down
cross numa node traffic.
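
As an illustration of the userspace side of that optimization, here is a minimal
sketch of pinning a process to the CPUs of one numa node.  It is not part of the
patch set.  It assumes the CPU list for the node is obtained separately, for
example from /sys/devices/system/node/nodeN/cpulist, and the helper names
cpulist_to_set() and pin_pid_to_cpulist() are made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for cpu_set_t, CPU_SET() and sched_setaffinity() */
#endif
#include <assert.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Parse a kernel-style cpulist such as "0-3,8" into a cpu_set_t.
 * Returns the number of CPUs in the resulting set, or -1 on malformed input.
 * (Hypothetical helper, not taken from the patch.)
 */
static int cpulist_to_set(const char *list, cpu_set_t *set)
{
	assert(list != NULL && set != NULL);
	CPU_ZERO(set);

	while (*list != '\0' && *list != '\n') {
		char *end;
		long first, last;

		first = strtol(list, &end, 10);
		if (end == list || first < 0)
			return -1;
		last = first;
		if (*end == '-') {
			/* A range "first-last" */
			list = end + 1;
			last = strtol(list, &end, 10);
			if (end == list || last < first)
				return -1;
		}
		if (last >= CPU_SETSIZE)
			return -1;
		for (long cpu = first; cpu <= last; cpu++)
			CPU_SET((int)cpu, set);

		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		list = end;
	}
	return CPU_COUNT(set);
}

/*
 * Restrict pid to the CPUs named in cpulist.
 * Returns 0 on success, -1 on a bad list or if sched_setaffinity() fails.
 */
static int pin_pid_to_cpulist(pid_t pid, const char *cpulist)
{
	cpu_set_t set;

	if (cpulist_to_set(cpulist, &set) <= 0)
		return -1;
	return sched_setaffinity(pid, sizeof(set), &set);
}
```

A caller would pass the pid of the process that the trace shows reading mostly
from one node, together with that node's cpulist string, e.g.
pin_pid_to_cpulist(pid, "0-3").  Whether hard pinning is preferable to a softer
policy is a judgment call for the admin.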

Tested by me, working well, applies against the head of the net-next tree

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations
  2009-08-07 20:21 [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations Neil Horman
@ 2009-08-07 20:28 ` Neil Horman
  2009-08-07 20:30 ` [PATCH 2/3] " Neil Horman
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-07 20:28 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption tracer - Add consumption tracepoint

This patch adds a tracepoint to skb_copy_datagram_iovec, which is called each
time a userspace process copies a frame from a socket receive queue to a user
space buffer.  It allows us to hook in and examine each sk_buff that the system
receives on a per-socket basis, and can be used to compile a list of which skbs
were received by which processes.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 include/trace/events/skb.h |   20 ++++++++++++++++++++
 net/core/datagram.c        |    3 +++
 2 files changed, 23 insertions(+)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 1e8fabb..bdc5c1c 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -2,6 +2,7 @@
 #define _TRACE_SKB_H
 
 #include <linux/skbuff.h>
+#include <linux/netdevice.h>
 #include <linux/tracepoint.h>
 
 #undef TRACE_SYSTEM
@@ -34,6 +35,25 @@ TRACE_EVENT(kfree_skb,
 		__entry->skbaddr, __entry->protocol, __entry->location)
 );
 
+TRACE_EVENT(skb_copy_datagram_iovec,
+
+	TP_PROTO(const struct sk_buff *skb, int len),
+
+	TP_ARGS(skb, len),
+
+	TP_STRUCT__entry(
+		__field(	const void *,		skbaddr		)
+		__field(	int,			len		)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = len;
+	),
+
+	TP_printk("skbaddr=%p len=%d", __entry->skbaddr, __entry->len)
+);
+
 #endif /* _TRACE_SKB_H */
 
 /* This part must be outside protection */
diff --git a/net/core/datagram.c b/net/core/datagram.c
index b0fe692..1c6cf3a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -55,6 +55,7 @@
 #include <net/checksum.h>
 #include <net/sock.h>
 #include <net/tcp_states.h>
+#include <trace/events/skb.h>
 
 /*
  *	Is a socket 'connection oriented' ?
@@ -284,6 +285,8 @@ int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 
+	trace_skb_copy_datagram_iovec(skb, len);
+
 	/* Copy header. */
 	if (copy > 0) {
 		if (copy > len)

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations
  2009-08-07 20:21 [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations Neil Horman
  2009-08-07 20:28 ` [PATCH 1/3] " Neil Horman
@ 2009-08-07 20:30 ` Neil Horman
  2009-08-07 20:44 ` [PATCH 3/3] " Neil Horman
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-07 20:30 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption correlator - Add config option

This patch adds a Kconfig option to enable the addition of the skb source
tracer.
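
For context, once a kernel built with this option is running, a tracer is
normally selected by writing its name into the ftrace current_tracer file.  The
sketch below shows that step from C.  It is an illustration only: the function
name select_tracer() is made up, and the path of the control file depends on
where debugfs is mounted.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a tracer name into an ftrace "current_tracer" control file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 * (Hypothetical helper, not part of the patch.)
 */
static int select_tracer(const char *current_tracer_path, const char *name)
{
	FILE *f;
	int ok;

	assert(current_tracer_path != NULL && name != NULL);

	f = fopen(current_tracer_path, "w");
	if (!f)
		return -1;

	/* fputs() returns a non-negative value on success, EOF on error */
	ok = fputs(name, f) >= 0;

	/* The write may only be flushed (and fail) at close time */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Assuming debugfs is mounted at /sys/kernel/debug, the call would look like
select_tracer("/sys/kernel/debug/tracing/current_tracer", "skb_sources"), where
"skb_sources" is the tracer name registered in patch 3.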

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 Kconfig |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 1551f47..1aeec05 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -234,6 +234,16 @@ config BOOT_TRACER
 	  You must pass in ftrace=initcall to the kernel command line
 	  to enable this on bootup.
 
+config SKB_SOURCES_TRACER
+	bool "Trace skb source information"
+	select GENERIC_TRACER
+	help
+	   This tracer helps developers/sysadmins correlate skb allocation and
+	   consumption.  The idea is that some processes will primarily consume data
+	   that was allocated on certain numa nodes.  By being able to visualize which
+	   nodes the data was allocated on, a sysadmin or developer can optimize the
+	   scheduling of those processes to cut back on cross node chatter.
+
 config TRACE_BRANCH_PROFILING
 	bool
 	select GENERIC_TRACER

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations
  2009-08-07 20:21 [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations Neil Horman
  2009-08-07 20:28 ` [PATCH 1/3] " Neil Horman
  2009-08-07 20:30 ` [PATCH 2/3] " Neil Horman
@ 2009-08-07 20:44 ` Neil Horman
  2009-08-08 23:13 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2) Neil Horman
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-07 20:44 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption correlator

Add an ftracer module to the kernel that prints a list correlating a process id,
an skb it read, the numa node on which the process was running when the skb was
read, and the numa node on which the skb was allocated.
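
To show how that list could be consumed, here is a minimal userspace sketch that
parses one line of the tracer output and tallies local versus cross-node reads
for a given pid.  It assumes each line carries exactly the seven
whitespace-separated fields emitted by skb_source_print_line() below (PID, ANID,
CNID, IFC, RXQ, CCPU, LEN).  The struct and function names are made up for this
example.

```c
#include <assert.h>
#include <stdio.h>

#define IFNAME_MAX 16	/* same size as IFNAMSIZ in the kernel */

/* One line of tracer output, mirroring the fields of struct skb_record. */
struct skb_sample {
	int pid;			/* pid of the copying process */
	int anid;			/* node the skb was allocated on */
	int cnid;			/* node the process was running on */
	char ifname[IFNAME_MAX];	/* receiving interface */
	int rx_queue;
	int ccpu;
	int len;
};

/* Counts of reads where allocation and consumption nodes match or differ. */
struct node_tally {
	unsigned long local;
	unsigned long remote;
};

/* Parse one trace line; returns 1 on success, 0 for headers or bad lines. */
static int parse_skb_line(const char *line, struct skb_sample *out)
{
	assert(line != NULL && out != NULL);

	if (line[0] == '#')	/* header lines start with '#' */
		return 0;

	return sscanf(line, " %d %d %d %15s %d %d %d",
		      &out->pid, &out->anid, &out->cnid, out->ifname,
		      &out->rx_queue, &out->ccpu, &out->len) == 7;
}

/* Fold one line into the tally if it parses and belongs to pid. */
static void tally_line(const char *line, int pid, struct node_tally *t)
{
	struct skb_sample s;

	if (!parse_skb_line(line, &s) || s.pid != pid)
		return;

	if (s.anid == s.cnid)
		t->local++;
	else
		t->remote++;
}
```

A process whose remote count dominates its local count is a candidate for being
moved to the node named in the ANID column.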

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 kernel/trace/Makefile            |    1 
 kernel/trace/trace.h             |   19 ++++
 kernel/trace/trace_skb_sources.c |  154 +++++++++++++++++++++++++++++++++++++++
 net/core/datagram.c              |    3 
 4 files changed, 177 insertions(+)

diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 844164d..ee5e5b1 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -49,6 +49,7 @@ obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
 ifeq ($(CONFIG_BLOCK),y)
 obj-$(CONFIG_EVENT_TRACING) += blktrace.o
 endif
+obj-$(CONFIG_SKB_SOURCES_TRACER) += trace_skb_sources.o
 obj-$(CONFIG_EVENT_TRACING) += trace_events.o
 obj-$(CONFIG_EVENT_TRACING) += trace_export.o
 obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 3548ae5..8c1d458 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -11,6 +11,7 @@
 #include <trace/boot.h>
 #include <linux/kmemtrace.h>
 #include <trace/power.h>
+#include <trace/events/skb.h>
 
 #include <linux/trace_seq.h>
 #include <linux/ftrace_event.h>
@@ -40,6 +41,7 @@ enum trace_type {
 	TRACE_KMEM_FREE,
 	TRACE_POWER,
 	TRACE_BLK,
+	TRACE_SKB_SOURCE,
 
 	__TRACE_LAST_TYPE,
 };
@@ -171,6 +173,21 @@ struct trace_power {
 	struct power_trace	state_data;
 };
 
+struct skb_record {
+	pid_t pid;		/* pid of the copying process */
+	int anid;		/* node where skb was allocated */
+	int cnid;		/* node to which skb was copied in userspace */
+	char ifname[IFNAMSIZ];	/* Name of the receiving interface */
+	int rx_queue;		/* The rx queue the skb was received on */
+	int ccpu;		/* Cpu the application got this frame from */
+	int len;		/* length of the data copied */
+};
+
+struct trace_skb_event {
+	struct trace_entry	ent;
+	struct skb_record	event_data;
+};
+
 enum kmemtrace_type_id {
 	KMEMTRACE_TYPE_KMALLOC = 0,	/* kmalloc() or kfree(). */
 	KMEMTRACE_TYPE_CACHE,		/* kmem_cache_*(). */
@@ -323,6 +340,8 @@ extern void __ftrace_bad_type(void);
 			  TRACE_SYSCALL_ENTER);				\
 		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
 			  TRACE_SYSCALL_EXIT);				\
+		IF_ASSIGN(var, ent, struct trace_skb_event,		\
+			  TRACE_SKB_SOURCE);				\
 		__ftrace_bad_type();					\
 	} while (0)
 
diff --git a/kernel/trace/trace_skb_sources.c b/kernel/trace/trace_skb_sources.c
new file mode 100644
index 0000000..4ba3671
--- /dev/null
+++ b/kernel/trace/trace_skb_sources.c
@@ -0,0 +1,154 @@
+/*
+ * ring buffer based tracer for analyzing per-socket skb sources
+ *
+ * Neil Horman <nhorman@tuxdriver.com> 
+ * Copyright (C) 2009
+ *
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/debugfs.h>
+#include <trace/events/skb.h>
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/hardirq.h>
+#include <linux/netdevice.h>
+#include <net/sock.h>
+
+#include "trace.h"
+#include "trace_output.h"
+
+EXPORT_TRACEPOINT_SYMBOL_GPL(skb_copy_datagram_iovec);
+
+static struct trace_array *skb_trace;
+static int __read_mostly trace_skb_source_enabled;
+
+static void probe_skb_dequeue(const struct sk_buff *skb, int len)
+{
+	struct ring_buffer_event *event;
+	struct trace_skb_event *entry;
+	struct trace_array *tr = skb_trace;
+	struct net_device *dev;
+
+	if (!trace_skb_source_enabled)
+		return;
+
+	if (in_interrupt())
+		return;
+
+	event = trace_buffer_lock_reserve(tr, TRACE_SKB_SOURCE,
+					  sizeof(*entry), 0, 0);
+	if (!event)
+		return;
+	entry = ring_buffer_event_data(event);
+
+	entry->event_data.pid = current->pid;
+	entry->event_data.anid = page_to_nid(virt_to_page(skb->data));
+	entry->event_data.cnid = cpu_to_node(smp_processor_id());
+	entry->event_data.len = len;
+	entry->event_data.rx_queue = skb->queue_mapping;
+	entry->event_data.ccpu = smp_processor_id();
+
+	dev = dev_get_by_index(sock_net(skb->sk), skb->iif);
+	if (dev) {
+		memcpy(entry->event_data.ifname, dev->name, IFNAMSIZ);
+		dev_put(dev);
+	} else {
+		strcpy(entry->event_data.ifname, "Unknown");
+	}
+
+	trace_buffer_unlock_commit(tr, event, 0, 0);
+}
+
+static int tracing_skb_source_register(void)
+{
+	int ret;
+
+	ret = register_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
+	if (ret)
+		pr_info("skb source trace: Couldn't activate dequeue tracepoint\n");
+	
+	return ret;
+}
+
+static void start_skb_source_trace(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 1;
+}
+
+static void stop_skb_source_trace(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 0;
+}
+
+static void skb_source_trace_reset(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 0;
+	unregister_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
+}
+
+
+static int skb_source_trace_init(struct trace_array *tr)
+{
+	int cpu;
+	skb_trace = tr;
+
+	trace_skb_source_enabled = 1;
+	tracing_skb_source_register();
+
+	for_each_cpu(cpu, cpu_possible_mask)
+		tracing_reset(tr, cpu);
+	return 0;
+}
+
+static enum print_line_t skb_source_print_line(struct trace_iterator *iter)
+{
+	int ret = 0;
+	struct trace_entry *entry = iter->ent;
+	struct trace_skb_event *event;
+	struct skb_record *record;
+	struct trace_seq *s = &iter->seq;
+
+	trace_assign_type(event, entry);
+	record = &event->event_data;
+	if (entry->type != TRACE_SKB_SOURCE)
+		return TRACE_TYPE_UNHANDLED;
+
+	ret = trace_seq_printf(s, "	%d	%d	%d	%s	%d	%d	%d\n",
+			record->pid,
+			record->anid,
+			record->cnid,
+			record->ifname,
+			record->rx_queue,
+			record->ccpu,
+			record->len);
+
+	if (!ret)
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	return TRACE_TYPE_HANDLED;
+}
+
+static void skb_source_print_header(struct seq_file *s)
+{
+	seq_puts(s, "#	PID	ANID	CNID	IFC	RXQ	CCPU	LEN\n");
+	seq_puts(s, "#	 |	 |	 |	 |	 |	 |	 |\n");
+}
+
+static struct tracer skb_source_tracer __read_mostly =
+{
+	.name		= "skb_sources",
+	.init		= skb_source_trace_init,
+	.start		= start_skb_source_trace,
+	.stop		= stop_skb_source_trace,
+	.reset		= skb_source_trace_reset,
+	.print_line	= skb_source_print_line,
+	.print_header	= skb_source_print_header,
+};
+
+static int init_skb_source_trace(void)
+{
+	return register_tracer(&skb_source_tracer);
+}
+device_initcall(init_skb_source_trace);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index b0fe692..1c6cf3a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -55,6 +55,7 @@
 #include <net/checksum.h>
 #include <net/sock.h>
 #include <net/tcp_states.h>
+#include <trace/events/skb.h>
 
 /*
  *	Is a socket 'connection oriented' ?
@@ -284,6 +285,8 @@ int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 
+	trace_skb_copy_datagram_iovec(skb, len);
+
 	/* Copy header. */
 	if (copy > 0) {
 		if (copy > len)

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2)
  2009-08-07 20:21 [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations Neil Horman
                   ` (2 preceding siblings ...)
  2009-08-07 20:44 ` [PATCH 3/3] " Neil Horman
@ 2009-08-08 23:13 ` Neil Horman
  2009-08-08 23:14   ` [PATCH 1/3] " Neil Horman
                     ` (2 more replies)
  2009-08-13  5:15 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations David Miller
  2009-08-13 14:59 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v3) Neil Horman
  5 siblings, 3 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-08 23:13 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

On Fri, Aug 07, 2009 at 04:21:30PM -0400, Neil Horman wrote:
Hey all-
	I put out an RFC about this a while ago and didn't get any loud screams,
so I've gone ahead and implemented it.

	Currently, our network infrastructure allows net device drivers to
allocate skbs based on the numa node the device itself is local to.  This of
course cuts down on cross numa chatter when the device is DMA-ing network
traffic to the driver.  Unfortunately no such corresponding infrastructure
exists at the process level.  The scheduler has no insight into the numa
locality of incoming data packets for a given process (and arguably it
shouldn't), and so there is every chance that a process will run on a different
numa node than the one the packets it is receiving live on, creating cross numa
node traffic.

	This patch aims to provide userspace with the opportunity to optimize
that scheduling.  It consists of a tracepoint and an ftrace module which exports
a history of the packets each process receives, along with the numa node each
packet was received on, as well as the numa node the process was running on when
it copied the buffer to user space.  With this information, exported via the
ftrace infrastructure to user space, a sysadmin can identify high priority
processes, and optimize their scheduling so that they are more likely to run on
the same node that they are primarily receiving data on, thereby cutting down
cross numa node traffic.

Tested by me, working well, applies against the head of the net-next tree


Version 2 change notes:

I noticed that I did something stupid in patch 3: it added a duplicated chunk
which didn't apply.  This new series simply removes that; everything else is
the same.


Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2)
  2009-08-08 23:13 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2) Neil Horman
@ 2009-08-08 23:14   ` Neil Horman
  2009-08-08 23:15   ` [PATCH 2/3] " Neil Horman
  2009-08-08 23:15   ` [PATCH 3/3] " Neil Horman
  2 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-08 23:14 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption tracer - Add consumption tracepoint

This patch adds a tracepoint to skb_copy_datagram_iovec, which is called each
time a userspace process copies a frame from a socket receive queue to a user
space buffer.  It allows us to hook in and examine each sk_buff that the system
receives on a per-socket basis, and can be used to compile a list of which skbs
were received by which processes.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 include/trace/events/skb.h |   20 ++++++++++++++++++++
 net/core/datagram.c        |    3 +++
 2 files changed, 23 insertions(+)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index 1e8fabb..bdc5c1c 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -2,6 +2,7 @@
 #define _TRACE_SKB_H
 
 #include <linux/skbuff.h>
+#include <linux/netdevice.h>
 #include <linux/tracepoint.h>
 
 #undef TRACE_SYSTEM
@@ -34,6 +35,25 @@ TRACE_EVENT(kfree_skb,
 		__entry->skbaddr, __entry->protocol, __entry->location)
 );
 
+TRACE_EVENT(skb_copy_datagram_iovec,
+
+	TP_PROTO(const struct sk_buff *skb, int len),
+
+	TP_ARGS(skb, len),
+
+	TP_STRUCT__entry(
+		__field(	const void *,		skbaddr		)
+		__field(	int,			len		)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = len;
+	),
+
+	TP_printk("skbaddr=%p len=%d", __entry->skbaddr, __entry->len)
+);
+
 #endif /* _TRACE_SKB_H */
 
 /* This part must be outside protection */
diff --git a/net/core/datagram.c b/net/core/datagram.c
index b0fe692..1c6cf3a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -55,6 +55,7 @@
 #include <net/checksum.h>
 #include <net/sock.h>
 #include <net/tcp_states.h>
+#include <trace/events/skb.h>
 
 /*
  *	Is a socket 'connection oriented' ?
@@ -284,6 +285,8 @@ int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 
+	trace_skb_copy_datagram_iovec(skb, len);
+
 	/* Copy header. */
 	if (copy > 0) {
 		if (copy > len)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2)
  2009-08-08 23:13 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2) Neil Horman
  2009-08-08 23:14   ` [PATCH 1/3] " Neil Horman
@ 2009-08-08 23:15   ` Neil Horman
  2009-08-08 23:15   ` [PATCH 3/3] " Neil Horman
  2 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-08 23:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption correlator - Add config option

This patch adds a Kconfig option to enable the addition of the skb source
tracer.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 Kconfig |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 1551f47..1aeec05 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -234,6 +234,16 @@ config BOOT_TRACER
 	  You must pass in ftrace=initcall to the kernel command line
 	  to enable this on bootup.
 
+config SKB_SOURCES_TRACER
+	bool "Trace skb source information"
+	select GENERIC_TRACER
+	help
+	   This tracer helps developers/sysadmins correlate skb allocation and
+	   consumption.  The idea is that some processes will primarily consume data
+	   that was allocated on certain numa nodes.  By being able to visualize which
+	   nodes the data was allocated on, a sysadmin or developer can optimize the
+	   scheduling of those processes to cut back on cross node chatter.
+
 config TRACE_BRANCH_PROFILING
 	bool
 	select GENERIC_TRACER


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2)
  2009-08-08 23:13 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2) Neil Horman
  2009-08-08 23:14   ` [PATCH 1/3] " Neil Horman
  2009-08-08 23:15   ` [PATCH 2/3] " Neil Horman
@ 2009-08-08 23:15   ` Neil Horman
  2 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-08 23:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption correlator

Add an ftracer module to the kernel that prints a list correlating a process id,
an skb it read, the numa node on which the process was running when the skb was
read, and the numa node on which the skb was allocated.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 844164d..ee5e5b1 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -49,6 +49,7 @@ obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
 ifeq ($(CONFIG_BLOCK),y)
 obj-$(CONFIG_EVENT_TRACING) += blktrace.o
 endif
+obj-$(CONFIG_SKB_SOURCES_TRACER) += trace_skb_sources.o
 obj-$(CONFIG_EVENT_TRACING) += trace_events.o
 obj-$(CONFIG_EVENT_TRACING) += trace_export.o
 obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 3548ae5..8c1d458 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -11,6 +11,7 @@
 #include <trace/boot.h>
 #include <linux/kmemtrace.h>
 #include <trace/power.h>
+#include <trace/events/skb.h>
 
 #include <linux/trace_seq.h>
 #include <linux/ftrace_event.h>
@@ -40,6 +41,7 @@ enum trace_type {
 	TRACE_KMEM_FREE,
 	TRACE_POWER,
 	TRACE_BLK,
+	TRACE_SKB_SOURCE,
 
 	__TRACE_LAST_TYPE,
 };
@@ -171,6 +173,21 @@ struct trace_power {
 	struct power_trace	state_data;
 };
 
+struct skb_record {
+	pid_t pid;		/* pid of the copying process */
+	int anid;		/* node where skb was allocated */
+	int cnid;		/* node to which skb was copied in userspace */
+	char ifname[IFNAMSIZ];	/* Name of the receiving interface */
+	int rx_queue;		/* The rx queue the skb was received on */
+	int ccpu;		/* Cpu the application got this frame from */
+	int len;		/* length of the data copied */
+};
+
+struct trace_skb_event {
+	struct trace_entry	ent;
+	struct skb_record	event_data;
+};
+
 enum kmemtrace_type_id {
 	KMEMTRACE_TYPE_KMALLOC = 0,	/* kmalloc() or kfree(). */
 	KMEMTRACE_TYPE_CACHE,		/* kmem_cache_*(). */
@@ -323,6 +340,8 @@ extern void __ftrace_bad_type(void);
 			  TRACE_SYSCALL_ENTER);				\
 		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
 			  TRACE_SYSCALL_EXIT);				\
+		IF_ASSIGN(var, ent, struct trace_skb_event,		\
+			  TRACE_SKB_SOURCE);				\
 		__ftrace_bad_type();					\
 	} while (0)
 
diff --git a/kernel/trace/trace_skb_sources.c b/kernel/trace/trace_skb_sources.c
new file mode 100644
index 0000000..4ba3671
--- /dev/null
+++ b/kernel/trace/trace_skb_sources.c
@@ -0,0 +1,154 @@
+/*
+ * ring buffer based tracer for analyzing per-socket skb sources
+ *
+ * Neil Horman <nhorman@tuxdriver.com> 
+ * Copyright (C) 2009
+ *
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/debugfs.h>
+#include <trace/events/skb.h>
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/hardirq.h>
+#include <linux/netdevice.h>
+#include <net/sock.h>
+
+#include "trace.h"
+#include "trace_output.h"
+
+EXPORT_TRACEPOINT_SYMBOL_GPL(skb_copy_datagram_iovec);
+
+static struct trace_array *skb_trace;
+static int __read_mostly trace_skb_source_enabled;
+
+static void probe_skb_dequeue(const struct sk_buff *skb, int len)
+{
+	struct ring_buffer_event *event;
+	struct trace_skb_event *entry;
+	struct trace_array *tr = skb_trace;
+	struct net_device *dev;
+
+	if (!trace_skb_source_enabled)
+		return;
+
+	if (in_interrupt())
+		return;
+
+	event = trace_buffer_lock_reserve(tr, TRACE_SKB_SOURCE,
+					  sizeof(*entry), 0, 0);
+	if (!event)
+		return;
+	entry = ring_buffer_event_data(event);
+
+	entry->event_data.pid = current->pid;
+	entry->event_data.anid = page_to_nid(virt_to_page(skb->data));
+	entry->event_data.cnid = cpu_to_node(smp_processor_id());
+	entry->event_data.len = len;
+	entry->event_data.rx_queue = skb->queue_mapping;
+	entry->event_data.ccpu = smp_processor_id();
+
+	dev = dev_get_by_index(sock_net(skb->sk), skb->iif);
+	if (dev) {
+		memcpy(entry->event_data.ifname, dev->name, IFNAMSIZ);
+		dev_put(dev);
+	} else {
+		strcpy(entry->event_data.ifname, "Unknown");
+	}
+
+	trace_buffer_unlock_commit(tr, event, 0, 0);
+}
+
+static int tracing_skb_source_register(void)
+{
+	int ret;
+
+	ret = register_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
+	if (ret)
+		pr_info("skb source trace: Couldn't activate dequeue tracepoint\n");
+	
+	return ret;
+}
+
+static void start_skb_source_trace(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 1;
+}
+
+static void stop_skb_source_trace(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 0;
+}
+
+static void skb_source_trace_reset(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 0;
+	unregister_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
+}
+
+
+static int skb_source_trace_init(struct trace_array *tr)
+{
+	int cpu;
+	skb_trace = tr;
+
+	trace_skb_source_enabled = 1;
+	tracing_skb_source_register();
+
+	for_each_cpu(cpu, cpu_possible_mask)
+		tracing_reset(tr, cpu);
+	return 0;
+}
+
+static enum print_line_t skb_source_print_line(struct trace_iterator *iter)
+{
+	int ret = 0;
+	struct trace_entry *entry = iter->ent;
+	struct trace_skb_event *event;
+	struct skb_record *record;
+	struct trace_seq *s = &iter->seq;
+
+	trace_assign_type(event, entry);
+	record = &event->event_data;
+	if (entry->type != TRACE_SKB_SOURCE)
+		return TRACE_TYPE_UNHANDLED;
+
+	ret = trace_seq_printf(s, "	%d	%d	%d	%s	%d	%d	%d\n",
+			record->pid,
+			record->anid,
+			record->cnid,
+			record->ifname,
+			record->rx_queue,
+			record->ccpu,
+			record->len);
+
+	if (!ret)
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	return TRACE_TYPE_HANDLED;
+}
+
+static void skb_source_print_header(struct seq_file *s)
+{
+	seq_puts(s, "#	PID	ANID	CNID	IFC	RXQ	CCPU	LEN\n");
+	seq_puts(s, "#	 |	 |	 |	 |	 |	 |	 |\n");
+}
+
+static struct tracer skb_source_tracer __read_mostly =
+{
+	.name		= "skb_sources",
+	.init		= skb_source_trace_init,
+	.start		= start_skb_source_trace,
+	.stop		= stop_skb_source_trace,
+	.reset		= skb_source_trace_reset,
+	.print_line	= skb_source_print_line,
+	.print_header	= skb_source_print_header,
+};
+
+static int init_skb_source_trace(void)
+{
+	return register_tracer(&skb_source_tracer);
+}
+device_initcall(init_skb_source_trace);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations
  2009-08-07 20:21 [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations Neil Horman
                   ` (3 preceding siblings ...)
  2009-08-08 23:13 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2) Neil Horman
@ 2009-08-13  5:15 ` David Miller
  2009-08-13 10:42   ` Neil Horman
  2009-08-13 14:59 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v3) Neil Horman
  5 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2009-08-13  5:15 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, rostedt

From: Neil Horman <nhorman@tuxdriver.com>
Date: Fri, 7 Aug 2009 16:21:30 -0400

> Tested by me, working well, applies against the head of the net-next tree
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

For some reason they don't apply cleanly to net-next-2.6, probably
because of the net-2.6 merge I did recently.

Neil, could you please respin, and also could you please not
use the same subject line for all 3 patches?  That becomes the
commit message header, and it should be unique for each patch
since each patch does something different. :)



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations
  2009-08-13  5:15 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations David Miller
@ 2009-08-13 10:42   ` Neil Horman
  0 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-13 10:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, rostedt

On Wed, Aug 12, 2009 at 10:15:15PM -0700, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Fri, 7 Aug 2009 16:21:30 -0400
> 
> > Tested by me, working well, applies against the head of the net-next tree
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> 
> For some reason they don't apply cleanly to net-next-2.6, probably
> because of the net-2.6 merge I did recently.
> 
> Neil, could you please respin, and also could you please not
> use the same subject line for all 3 patches?  That becomes the
> commit message header, and it should be unique for each patch
> since each patch does something different. :)
> 
> 
> 
Sure, no problem, I'll respin them all today and repost.
Thanks!
Neil


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v3)
  2009-08-07 20:21 [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations Neil Horman
                   ` (4 preceding siblings ...)
  2009-08-13  5:15 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations David Miller
@ 2009-08-13 14:59 ` Neil Horman
  2009-08-13 15:19   ` [PATCH 1/3] net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3) Neil Horman
                     ` (2 more replies)
  5 siblings, 3 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-13 14:59 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

Hey all-
        I put out an RFC about this awhile ago and didn't get any loud screams,
so I've gone ahead and implemented it

        Currently, our network infrastructure allows net device drivers to
allocate skbs based on the numa node the device itself is local to.  This of
course cuts down on cross numa chatter when the device is DMA-ing network
traffic to the driver.  Unfortunately no such corresponding infrastructure
exists at the process level.  The scheduler has no insight into the numa
locality of incoming data packets for a given process (and arguably it
shouldn't), and so there is every chance that a process will run on a different
numa node than the one the packets it is receiving live on, creating cross numa
node traffic.

        This patch aims to provide userspace with the opportunity to optimize
that scheduling.  It consists of a tracepoint and an ftrace module which exports
a history of the packets each process receives, along with the numa node each
packet was received on, as well as the numa node the process was running on when
it copied the buffer to user space.  With this information, exported via the
ftrace infrastructure to user space, a sysadmin can identify high priority
processes, and optimize their scheduling so that they are more likely to run on
the same node that they are primarily receiving data on, thereby cutting down
cross numa node traffic.
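
To illustrate the kind of decision this data is meant to support, here is a
minimal userspace sketch in C.  It is not part of the patch set.  For a single
process it tallies how many frames were allocated on each numa node, and how
many were copied while the process ran on a different node than the one the
frame was allocated on.  The names node_tally, tally_add and
tally_dominant_node, and the MAX_NODES bound, are hypothetical and exist only
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: an arbitrary upper bound on numa node ids for this sketch. */
#define MAX_NODES 64

/* Per-process summary of where its received frames were allocated. */
struct node_tally {
	unsigned long per_node[MAX_NODES];	/* frames allocated on each node */
	unsigned long cross_node;		/* frames copied on another node */
	unsigned long total;			/* frames counted overall */
};

/*
 * Account for one frame: anid is the node the skb was allocated on,
 * cnid is the node the process was running on when it copied the data.
 * Records with an out-of-range anid are ignored.
 */
void tally_add(struct node_tally *t, int anid, int cnid)
{
	assert(t != NULL);

	if (anid < 0 || anid >= MAX_NODES)
		return;

	t->per_node[anid]++;
	t->total++;
	if (anid != cnid)
		t->cross_node++;
}

/*
 * Return the node on which most of the process's frames were allocated,
 * or -1 if nothing has been counted yet.  Ties go to the lowest node id.
 */
int tally_dominant_node(const struct node_tally *t)
{
	int best = -1;
	unsigned long best_count = 0;

	assert(t != NULL);

	for (int n = 0; n < MAX_NODES; n++) {
		if (t->per_node[n] > best_count) {
			best_count = t->per_node[n];
			best = n;
		}
	}
	return best;
}
```

A caller would zero-initialize one struct node_tally per pid of interest, feed
it every record for that pid, and treat a high cross_node / total ratio as a
hint that the process might be better placed on the node returned by
tally_dominant_node().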

Tested by me, working well, applies against the head of the net-next tree


Version 3 change notes:

Respun on davem's request to apply to the head of the net-next tree.  No other
changes.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3)
  2009-08-13 14:59 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v3) Neil Horman
@ 2009-08-13 15:19   ` Neil Horman
  2009-08-13 23:28     ` David Miller
  2009-08-13 15:20   ` [PATCH 2/3] net: skb ftracer - Add config option to enable new ftracer (v3) Neil Horman
  2009-08-13 15:23   ` [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3) Neil Horman
  2 siblings, 1 reply; 19+ messages in thread
From: Neil Horman @ 2009-08-13 15:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption tracer - Add consumption tracepoint

This patch adds a tracepoint to skb_copy_datagram_iovec, which is called each
time a userspace process copies a frame from a socket receive queue to a user
space buffer.  It allows us to hook in and examine each sk_buff that the system
receives on a per-socket basis, and can be used to compile a list of which skbs
were received by which processes.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 include/trace/events/skb.h |   20 ++++++++++++++++++++
 net/core/datagram.c        |    3 +++
 2 files changed, 23 insertions(+)

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index e499863..4b2be6d 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -5,6 +5,7 @@
 #define _TRACE_SKB_H
 
 #include <linux/skbuff.h>
+#include <linux/netdevice.h>
 #include <linux/tracepoint.h>
 
 /*
@@ -34,6 +35,25 @@ TRACE_EVENT(kfree_skb,
 		__entry->skbaddr, __entry->protocol, __entry->location)
 );
 
+TRACE_EVENT(skb_copy_datagram_iovec,
+
+	TP_PROTO(const struct sk_buff *skb, int len),
+
+	TP_ARGS(skb, len),
+
+	TP_STRUCT__entry(
+		__field(	const void *,		skbaddr		)
+		__field(	int,			len		)
+	),
+
+	TP_fast_assign(
+		__entry->skbaddr = skb;
+		__entry->len = len;
+	),
+
+	TP_printk("skbaddr=%p len=%d", __entry->skbaddr, __entry->len)
+);
+
 #endif /* _TRACE_SKB_H */
 
 /* This part must be outside protection */
diff --git a/net/core/datagram.c b/net/core/datagram.c
index b0fe692..1c6cf3a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -55,6 +55,7 @@
 #include <net/checksum.h>
 #include <net/sock.h>
 #include <net/tcp_states.h>
+#include <trace/events/skb.h>
 
 /*
  *	Is a socket 'connection oriented' ?
@@ -284,6 +285,8 @@ int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 
+	trace_skb_copy_datagram_iovec(skb, len);
+
 	/* Copy header. */
 	if (copy > 0) {
 		if (copy > len)

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] net: skb ftracer - Add config option to enable new ftracer (v3)
  2009-08-13 14:59 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v3) Neil Horman
  2009-08-13 15:19   ` [PATCH 1/3] net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3) Neil Horman
@ 2009-08-13 15:20   ` Neil Horman
  2009-08-13 23:28     ` David Miller
  2009-08-13 15:23   ` [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3) Neil Horman
  2 siblings, 1 reply; 19+ messages in thread
From: Neil Horman @ 2009-08-13 15:20 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption correlator - Add config option

This patch adds a Kconfig option to enable the addition of the skb source
tracer.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 Kconfig |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 019f380..dcb263d 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -234,6 +234,16 @@ config BOOT_TRACER
 	  You must pass in initcall_debug and ftrace=initcall to the kernel
 	  command line to enable this on bootup.
 
+config SKB_SOURCES_TRACER
+	bool "Trace skb source information"
+	select GENERIC_TRACER
+	help
+	   This tracer helps developers/sysadmins correlate skb allocation and
+	   consumption.  The idea being that some processes will primarily consume data
+	   that was allocated on certain numa nodes.  By being able to visualize which
+	   nodes the data was allocated on, a sysadmin or developer can optimize the
+	   scheduling of those processes to cut back on cross node chatter.
+
 config TRACE_BRANCH_PROFILING
 	bool
 	select GENERIC_TRACER

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3)
  2009-08-13 14:59 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v3) Neil Horman
  2009-08-13 15:19   ` [PATCH 1/3] net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3) Neil Horman
  2009-08-13 15:20   ` [PATCH 2/3] net: skb ftracer - Add config option to enable new ftracer (v3) Neil Horman
@ 2009-08-13 15:23   ` Neil Horman
  2009-08-13 23:28     ` David Miller
  2009-08-17 20:55     ` Steven Rostedt
  2 siblings, 2 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-13 15:23 UTC (permalink / raw)
  To: netdev; +Cc: davem, rostedt

skb allocation / consumption correlator

Add ftracer module to kernel to print out a list that correlates a process id,
an skb it read, the numa node on which the process was running when it read the
skb, and the numa node the skbuff was allocated on.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 Makefile            |    1 
 trace.h             |   19 ++++++
 trace_skb_sources.c |  154 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 174 insertions(+)

diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 844164d..ee5e5b1 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -49,6 +49,7 @@ obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
 ifeq ($(CONFIG_BLOCK),y)
 obj-$(CONFIG_EVENT_TRACING) += blktrace.o
 endif
+obj-$(CONFIG_SKB_SOURCES_TRACER) += trace_skb_sources.o
 obj-$(CONFIG_EVENT_TRACING) += trace_events.o
 obj-$(CONFIG_EVENT_TRACING) += trace_export.o
 obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 8b9f4f6..8a6281b 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -11,6 +11,7 @@
 #include <trace/boot.h>
 #include <linux/kmemtrace.h>
 #include <trace/power.h>
+#include <trace/events/skb.h>
 
 #include <linux/trace_seq.h>
 #include <linux/ftrace_event.h>
@@ -40,6 +41,7 @@ enum trace_type {
 	TRACE_KMEM_FREE,
 	TRACE_POWER,
 	TRACE_BLK,
+	TRACE_SKB_SOURCE,
 
 	__TRACE_LAST_TYPE,
 };
@@ -171,6 +173,21 @@ struct trace_power {
 	struct power_trace	state_data;
 };
 
+struct skb_record {
+	pid_t pid;		/* pid of the copying process */
+	int anid;		/* node where skb was allocated */
+	int cnid;		/* node to which skb was copied in userspace */
+	char ifname[IFNAMSIZ];	/* Name of the receiving interface */
+	int rx_queue;		/* The rx queue the skb was received on */
+	int ccpu;		/* Cpu the application got this frame from */
+	int len;		/* length of the data copied */
+};
+
+struct trace_skb_event {
+	struct trace_entry	ent;
+	struct skb_record	event_data;
+};
+
 enum kmemtrace_type_id {
 	KMEMTRACE_TYPE_KMALLOC = 0,	/* kmalloc() or kfree(). */
 	KMEMTRACE_TYPE_CACHE,		/* kmem_cache_*(). */
@@ -323,6 +340,8 @@ extern void __ftrace_bad_type(void);
 			  TRACE_SYSCALL_ENTER);				\
 		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
 			  TRACE_SYSCALL_EXIT);				\
+		IF_ASSIGN(var, ent, struct trace_skb_event,		\
+			  TRACE_SKB_SOURCE);				\
 		__ftrace_bad_type();					\
 	} while (0)
 
diff --git a/kernel/trace/trace_skb_sources.c b/kernel/trace/trace_skb_sources.c
new file mode 100644
index 0000000..4ba3671
--- /dev/null
+++ b/kernel/trace/trace_skb_sources.c
@@ -0,0 +1,154 @@
+/*
+ * ring buffer based tracer for analyzing per-socket skb sources
+ *
+ * Neil Horman <nhorman@tuxdriver.com> 
+ * Copyright (C) 2009
+ *
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/debugfs.h>
+#include <trace/events/skb.h>
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/hardirq.h>
+#include <linux/netdevice.h>
+#include <net/sock.h>
+
+#include "trace.h"
+#include "trace_output.h"
+
+EXPORT_TRACEPOINT_SYMBOL_GPL(skb_copy_datagram_iovec);
+
+static struct trace_array *skb_trace;
+static int __read_mostly trace_skb_source_enabled;
+
+static void probe_skb_dequeue(const struct sk_buff *skb, int len)
+{
+	struct ring_buffer_event *event;
+	struct trace_skb_event *entry;
+	struct trace_array *tr = skb_trace;
+	struct net_device *dev;
+
+	if (!trace_skb_source_enabled)
+		return;
+
+	if (in_interrupt())
+		return;
+
+	event = trace_buffer_lock_reserve(tr, TRACE_SKB_SOURCE,
+					  sizeof(*entry), 0, 0);
+	if (!event)
+		return;
+	entry = ring_buffer_event_data(event);
+
+	entry->event_data.pid = current->pid;
+	entry->event_data.anid = page_to_nid(virt_to_page(skb->data));
+	entry->event_data.cnid = cpu_to_node(smp_processor_id());
+	entry->event_data.len = len;
+	entry->event_data.rx_queue = skb->queue_mapping;
+	entry->event_data.ccpu = smp_processor_id();
+
+	dev = dev_get_by_index(sock_net(skb->sk), skb->iif);
+	if (dev) {
+		memcpy(entry->event_data.ifname, dev->name, IFNAMSIZ);
+		dev_put(dev);
+	} else {
+		strcpy(entry->event_data.ifname, "Unknown");
+	}
+
+	trace_buffer_unlock_commit(tr, event, 0, 0);
+}
+
+static int tracing_skb_source_register(void)
+{
+	int ret;
+
+	ret = register_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
+	if (ret)
+		pr_info("skb source trace: Couldn't activate dequeue tracepoint");
+	
+	return ret;
+}
+
+static void start_skb_source_trace(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 1;
+}
+
+static void stop_skb_source_trace(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 0;
+}
+
+static void skb_source_trace_reset(struct trace_array *tr)
+{
+	trace_skb_source_enabled = 0;
+	unregister_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
+}
+
+
+static int skb_source_trace_init(struct trace_array *tr)
+{
+	int cpu;
+	skb_trace = tr;
+
+	trace_skb_source_enabled = 1;
+	tracing_skb_source_register();
+
+	for_each_cpu(cpu, cpu_possible_mask)
+		tracing_reset(tr, cpu);
+	return 0;
+}
+
+static enum print_line_t skb_source_print_line(struct trace_iterator *iter)
+{
+	int ret = 0;
+	struct trace_entry *entry = iter->ent;
+	struct trace_skb_event *event;
+	struct skb_record *record;
+	struct trace_seq *s = &iter->seq;
+
+	trace_assign_type(event, entry);
+	record = &event->event_data;
+	if (entry->type != TRACE_SKB_SOURCE)
+		return TRACE_TYPE_UNHANDLED;
+
+	ret = trace_seq_printf(s, "	%d	%d	%d	%s	%d	%d	%d\n",
+			record->pid,
+			record->anid,
+			record->cnid,
+			record->ifname,
+			record->rx_queue,
+			record->ccpu,
+			record->len);
+
+	if (!ret)
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	return TRACE_TYPE_HANDLED;
+}
+
+static void skb_source_print_header(struct seq_file *s)
+{
+	seq_puts(s, "#	PID	ANID	CNID	IFC	RXQ	CCPU	LEN\n");
+	seq_puts(s, "#	 |	 |	 |	 |	 |	 |	 |\n");
+}
+
+static struct tracer skb_source_tracer __read_mostly =
+{
+	.name		= "skb_sources",
+	.init		= skb_source_trace_init,
+	.start		= start_skb_source_trace,
+	.stop		= stop_skb_source_trace,
+	.reset		= skb_source_trace_reset,
+	.print_line	= skb_source_print_line,
+	.print_header	= skb_source_print_header,
+};
+
+static int init_skb_source_trace(void)
+{
+	return register_tracer(&skb_source_tracer);
+}
+device_initcall(init_skb_source_trace);
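
A user space consumer of this tracer's output could parse each line roughly as
follows.  This is only a sketch, assuming the seven whitespace-separated
columns that skb_source_print_line above emits (PID, ANID, CNID, IFC, RXQ,
CCPU, LEN) and the '#' prefix used by skb_source_print_header.  The names
struct skb_line, parse_skb_line and IFNAME_MAX are hypothetical and not part of
the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the kernel's IFNAMSIZ (16 bytes including the terminating NUL). */
#define IFNAME_MAX 16

/* One parsed line of tracer output, in column order. */
struct skb_line {
	int pid;		/* PID:  process that copied the frame */
	int anid;		/* ANID: node the skb was allocated on */
	int cnid;		/* CNID: node the process ran on during the copy */
	char ifname[IFNAME_MAX];	/* IFC:  receiving interface name */
	int rx_queue;		/* RXQ:  rx queue the skb arrived on */
	int ccpu;		/* CCPU: cpu the process ran on during the copy */
	int len;		/* LEN:  number of bytes copied */
};

/*
 * Parse one line of output into *out.
 * Returns 0 on success, -1 for header lines (starting with '#') and for
 * anything that does not contain all seven fields.
 */
int parse_skb_line(const char *line, struct skb_line *out)
{
	int fields;

	assert(line != NULL);
	assert(out != NULL);

	if (line[0] == '#')
		return -1;

	/* %15s leaves room for the NUL in a 16-byte interface name buffer. */
	fields = sscanf(line, " %d %d %d %15s %d %d %d",
			&out->pid, &out->anid, &out->cnid, out->ifname,
			&out->rx_queue, &out->ccpu, &out->len);

	return fields == 7 ? 0 : -1;
}
```

A record with anid != cnid is one where the process copied the frame while
running on a different node than the one the frame was allocated on, which is
the cross node case the tracer is meant to expose.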

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/3] net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3)
  2009-08-13 15:19   ` [PATCH 1/3] net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3) Neil Horman
@ 2009-08-13 23:28     ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2009-08-13 23:28 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, rostedt

From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu, 13 Aug 2009 11:19:44 -0400

> skb allocation / consumption tracer - Add consumption tracepoint
> 
> This patch adds a tracepoint to skb_copy_datagram_iovec, which is called each
> time a userspace process copies a frame from a socket receive queue to a user
> space buffer.  It allows us to hook in and examine each sk_buff that the system
> receives on a per-socket basis, and can be used to compile a list of which skbs
> were received by which processes.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


Applied to net-next-2.6

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/3] net: skb ftracer - Add config option to enable new ftracer (v3)
  2009-08-13 15:20   ` [PATCH 2/3] net: skb ftracer - Add config option to enable new ftracer (v3) Neil Horman
@ 2009-08-13 23:28     ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2009-08-13 23:28 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, rostedt

From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu, 13 Aug 2009 11:20:45 -0400

> skb allocation / consumption correlator - Add config option
> 
> This patch adds a Kconfig option to enable the addition of the skb source
> tracer.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

Applied to net-next-2.6

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3)
  2009-08-13 15:23   ` [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3) Neil Horman
@ 2009-08-13 23:28     ` David Miller
  2009-08-17 20:55     ` Steven Rostedt
  1 sibling, 0 replies; 19+ messages in thread
From: David Miller @ 2009-08-13 23:28 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, rostedt

From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu, 13 Aug 2009 11:23:56 -0400

> skb allocation / consumption correlator
> 
> Add ftracer module to kernel to print out a list that correlates a process id,
> an skb it read, the numa node on which the process was running when it read the
> skb, and the numa node the skbuff was allocated on.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

Applied to net-next-2.6

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3)
  2009-08-13 15:23   ` [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3) Neil Horman
  2009-08-13 23:28     ` David Miller
@ 2009-08-17 20:55     ` Steven Rostedt
  2009-08-18 16:39       ` Neil Horman
  1 sibling, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2009-08-17 20:55 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, davem


Hi Neil!

Sorry for the late reply, I've been on vacation for the last week.

On Thu, 13 Aug 2009, Neil Horman wrote:

> skb allocation / consumption correlator
> 
> Add ftracer module to kernel to print out a list that correlates a process id,
> an skb it read, the numa node on which the process was running when it read the
> skb, and the numa node the skbuff was allocated on.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> 
> 
>  Makefile            |    1 
>  trace.h             |   19 ++++++
>  trace_skb_sources.c |  154 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 174 insertions(+)
> 
> diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> index 844164d..ee5e5b1 100644
> --- a/kernel/trace/Makefile
> +++ b/kernel/trace/Makefile
> @@ -49,6 +49,7 @@ obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
>  ifeq ($(CONFIG_BLOCK),y)
>  obj-$(CONFIG_EVENT_TRACING) += blktrace.o
>  endif
> +obj-$(CONFIG_SKB_SOURCES_TRACER) += trace_skb_sources.o
>  obj-$(CONFIG_EVENT_TRACING) += trace_events.o
>  obj-$(CONFIG_EVENT_TRACING) += trace_export.o
>  obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 8b9f4f6..8a6281b 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -11,6 +11,7 @@
>  #include <trace/boot.h>
>  #include <linux/kmemtrace.h>
>  #include <trace/power.h>
> +#include <trace/events/skb.h>
>  
>  #include <linux/trace_seq.h>
>  #include <linux/ftrace_event.h>
> @@ -40,6 +41,7 @@ enum trace_type {
>  	TRACE_KMEM_FREE,
>  	TRACE_POWER,
>  	TRACE_BLK,
> +	TRACE_SKB_SOURCE,
>  
>  	__TRACE_LAST_TYPE,
>  };
> @@ -171,6 +173,21 @@ struct trace_power {
>  	struct power_trace	state_data;
>  };
>  
> +struct skb_record {
> +	pid_t pid;		/* pid of the copying process */
> +	int anid;		/* node where skb was allocated */
> +	int cnid;		/* node to which skb was copied in userspace */
> +	char ifname[IFNAMSIZ];	/* Name of the receiving interface */
> +	int rx_queue;		/* The rx queue the skb was received on */
> +	int ccpu;		/* Cpu the application got this frame from */
> +	int len;		/* length of the data copied */
> +};
> +
> +struct trace_skb_event {
> +	struct trace_entry	ent;
> +	struct skb_record	event_data;
> +};
> +
>  enum kmemtrace_type_id {
>  	KMEMTRACE_TYPE_KMALLOC = 0,	/* kmalloc() or kfree(). */
>  	KMEMTRACE_TYPE_CACHE,		/* kmem_cache_*(). */
> @@ -323,6 +340,8 @@ extern void __ftrace_bad_type(void);
>  			  TRACE_SYSCALL_ENTER);				\
>  		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
>  			  TRACE_SYSCALL_EXIT);				\
> +		IF_ASSIGN(var, ent, struct trace_skb_event,		\
> +			  TRACE_SKB_SOURCE);				\
>  		__ftrace_bad_type();					\
>  	} while (0)
>  
> diff --git a/kernel/trace/trace_skb_sources.c b/kernel/trace/trace_skb_sources.c
> new file mode 100644
> index 0000000..4ba3671
> --- /dev/null
> +++ b/kernel/trace/trace_skb_sources.c
> @@ -0,0 +1,154 @@
> +/*
> + * ring buffer based tracer for analyzing per-socket skb sources
> + *
> + * Neil Horman <nhorman@tuxdriver.com> 
> + * Copyright (C) 2009
> + *
> + *
> + */
> +
> +#include <linux/init.h>
> +#include <linux/debugfs.h>
> +#include <trace/events/skb.h>
> +#include <linux/kallsyms.h>
> +#include <linux/module.h>
> +#include <linux/hardirq.h>
> +#include <linux/netdevice.h>
> +#include <net/sock.h>
> +
> +#include "trace.h"
> +#include "trace_output.h"
> +
> +EXPORT_TRACEPOINT_SYMBOL_GPL(skb_copy_datagram_iovec);
> +
> +static struct trace_array *skb_trace;
> +static int __read_mostly trace_skb_source_enabled;
> +
> +static void probe_skb_dequeue(const struct sk_buff *skb, int len)
> +{
> +	struct ring_buffer_event *event;
> +	struct trace_skb_event *entry;
> +	struct trace_array *tr = skb_trace;
> +	struct net_device *dev;
> +
> +	if (!trace_skb_source_enabled)
> +		return;
> +
> +	if (in_interrupt())
> +		return;

Is there a reason for not doing this in an interrupt?

> +
> +	event = trace_buffer_lock_reserve(tr, TRACE_SKB_SOURCE,
> +					  sizeof(*entry), 0, 0);
> +	if (!event)
> +		return;
> +	entry = ring_buffer_event_data(event);
> +
> +	entry->event_data.pid = current->pid;

Note, the trace_buffer_lock_reserve will record the current pid, thus you 
do not need to record it here.

> +	entry->event_data.anid = page_to_nid(virt_to_page(skb->data));
> +	entry->event_data.cnid = cpu_to_node(smp_processor_id());
> +	entry->event_data.len = len;
> +	entry->event_data.rx_queue = skb->queue_mapping;
> +	entry->event_data.ccpu = smp_processor_id();

Also, the cpu is recorded in the ring buffer. They are per cpu ring 
buffers and that determines the cpu it was recorded on.

> +
> +	dev = dev_get_by_index(sock_net(skb->sk), skb->iif);
> +	if (dev) {
> +		memcpy(entry->event_data.ifname, dev->name, IFNAMSIZ);
> +		dev_put(dev);
> +	} else {
> +		strcpy(entry->event_data.ifname, "Unknown");
> +	}
> +
> +	trace_buffer_unlock_commit(tr, event, 0, 0);
> +}
> +
> +static int tracing_skb_source_register(void)
> +{
> +	int ret;
> +
> +	ret = register_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
> +	if (ret)
> +		pr_info("skb source trace: Couldn't activate dequeue tracepoint");
> +	
> +	return ret;
> +}
> +
> +static void start_skb_source_trace(struct trace_array *tr)
> +{
> +	trace_skb_source_enabled = 1;
> +}
> +
> +static void stop_skb_source_trace(struct trace_array *tr)
> +{
> +	trace_skb_source_enabled = 0;
> +}
> +
> +static void skb_source_trace_reset(struct trace_array *tr)
> +{
> +	trace_skb_source_enabled = 0;
> +	unregister_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
> +}
> +
> +
> +static int skb_source_trace_init(struct trace_array *tr)
> +{
> +	int cpu;
> +	skb_trace = tr;
> +
> +	trace_skb_source_enabled = 1;
> +	tracing_skb_source_register();
> +
> +	for_each_cpu(cpu, cpu_possible_mask)
> +		tracing_reset(tr, cpu);
> +	return 0;
> +}
> +
> +static enum print_line_t skb_source_print_line(struct trace_iterator *iter)
> +{
> +	int ret = 0;
> +	struct trace_entry *entry = iter->ent;

iter->cpu has the cpu that trace was recorded on.
entry->pid has the pid of the process that did the recording.

> +	struct trace_skb_event *event;
> +	struct skb_record *record;
> +	struct trace_seq *s = &iter->seq;
> +
> +	trace_assign_type(event, entry);
> +	record = &event->event_data;
> +	if (entry->type != TRACE_SKB_SOURCE)
> +		return TRACE_TYPE_UNHANDLED;
> +
> +	ret = trace_seq_printf(s, "	%d	%d	%d	%s	%d	%d	%d\n",
> +			record->pid,
> +			record->anid,
> +			record->cnid,
> +			record->ifname,
> +			record->rx_queue,
> +			record->ccpu,
> +			record->len);
> +
> +	if (!ret)
> +		return TRACE_TYPE_PARTIAL_LINE;
> +
> +	return TRACE_TYPE_HANDLED;
> +}
> +
> +static void skb_source_print_header(struct seq_file *s)
> +{
> +	seq_puts(s, "#	PID	ANID	CNID	IFC	RXQ	CCPU	LEN\n");
> +	seq_puts(s, "#	 |	 |	 |	 |	 |	 |	 |\n");
> +}
> +
> +static struct tracer skb_source_tracer __read_mostly =
> +{
> +	.name		= "skb_sources",
> +	.init		= skb_source_trace_init,
> +	.start		= start_skb_source_trace,
> +	.stop		= stop_skb_source_trace,
> +	.reset		= skb_source_trace_reset,
> +	.print_line	= skb_source_print_line,
> +	.print_header	= skb_source_print_header,
> +};
> +
> +static int init_skb_source_trace(void)
> +{
> +	return register_tracer(&skb_source_tracer);
> +}
> +device_initcall(init_skb_source_trace);
> 

BTW, why not just do this as events? Or was this just an easy way to
communicate with the user space tools?

-- Steve


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3)
  2009-08-17 20:55     ` Steven Rostedt
@ 2009-08-18 16:39       ` Neil Horman
  0 siblings, 0 replies; 19+ messages in thread
From: Neil Horman @ 2009-08-18 16:39 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: netdev, davem

On Mon, Aug 17, 2009 at 04:55:38PM -0400, Steven Rostedt wrote:
> 
> Hi Neil!
> 
> Sorry for the late reply, I've been on vacation for the last week.
> 
> On Thu, 13 Aug 2009, Neil Horman wrote:
> 
> > skb allocation / consumption correlator
> > 
> > Add ftracer module to kernel to print out a list that correlates a process id,
> > an skb it read, the numa node on which the process was running when it read the
> > skb, and the numa node the skbuff was allocated on.
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > 
> > 
> >  Makefile            |    1 
> >  trace.h             |   19 ++++++
> >  trace_skb_sources.c |  154 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 174 insertions(+)
> > 
> > diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> > index 844164d..ee5e5b1 100644
> > --- a/kernel/trace/Makefile
> > +++ b/kernel/trace/Makefile
> > @@ -49,6 +49,7 @@ obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
> >  ifeq ($(CONFIG_BLOCK),y)
> >  obj-$(CONFIG_EVENT_TRACING) += blktrace.o
> >  endif
> > +obj-$(CONFIG_SKB_SOURCES_TRACER) += trace_skb_sources.o
> >  obj-$(CONFIG_EVENT_TRACING) += trace_events.o
> >  obj-$(CONFIG_EVENT_TRACING) += trace_export.o
> >  obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
> > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > index 8b9f4f6..8a6281b 100644
> > --- a/kernel/trace/trace.h
> > +++ b/kernel/trace/trace.h
> > @@ -11,6 +11,7 @@
> >  #include <trace/boot.h>
> >  #include <linux/kmemtrace.h>
> >  #include <trace/power.h>
> > +#include <trace/events/skb.h>
> >  
> >  #include <linux/trace_seq.h>
> >  #include <linux/ftrace_event.h>
> > @@ -40,6 +41,7 @@ enum trace_type {
> >  	TRACE_KMEM_FREE,
> >  	TRACE_POWER,
> >  	TRACE_BLK,
> > +	TRACE_SKB_SOURCE,
> >  
> >  	__TRACE_LAST_TYPE,
> >  };
> > @@ -171,6 +173,21 @@ struct trace_power {
> >  	struct power_trace	state_data;
> >  };
> >  
> > +struct skb_record {
> > +	pid_t pid;		/* pid of the copying process */
> > +	int anid;		/* node where skb was allocated */
> > +	int cnid;		/* node to which skb was copied in userspace */
> > +	char ifname[IFNAMSIZ];	/* Name of the receiving interface */
> > +	int rx_queue;		/* The rx queue the skb was received on */
> > +	int ccpu;		/* Cpu the application got this frame from */
> > +	int len;		/* length of the data copied */
> > +};
> > +
> > +struct trace_skb_event {
> > +	struct trace_entry	ent;
> > +	struct skb_record	event_data;
> > +};
> > +
> >  enum kmemtrace_type_id {
> >  	KMEMTRACE_TYPE_KMALLOC = 0,	/* kmalloc() or kfree(). */
> >  	KMEMTRACE_TYPE_CACHE,		/* kmem_cache_*(). */
> > @@ -323,6 +340,8 @@ extern void __ftrace_bad_type(void);
> >  			  TRACE_SYSCALL_ENTER);				\
> >  		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
> >  			  TRACE_SYSCALL_EXIT);				\
> > +		IF_ASSIGN(var, ent, struct trace_skb_event,		\
> > +			  TRACE_SKB_SOURCE);				\
> >  		__ftrace_bad_type();					\
> >  	} while (0)
> >  
> > diff --git a/kernel/trace/trace_skb_sources.c b/kernel/trace/trace_skb_sources.c
> > new file mode 100644
> > index 0000000..4ba3671
> > --- /dev/null
> > +++ b/kernel/trace/trace_skb_sources.c
> > @@ -0,0 +1,154 @@
> > +/*
> > + * ring buffer based tracer for analyzing per-socket skb sources
> > + *
> > + * Neil Horman <nhorman@tuxdriver.com> 
> > + * Copyright (C) 2009
> > + *
> > + *
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/debugfs.h>
> > +#include <trace/events/skb.h>
> > +#include <linux/kallsyms.h>
> > +#include <linux/module.h>
> > +#include <linux/hardirq.h>
> > +#include <linux/netdevice.h>
> > +#include <net/sock.h>
> > +
> > +#include "trace.h"
> > +#include "trace_output.h"
> > +
> > +EXPORT_TRACEPOINT_SYMBOL_GPL(skb_copy_datagram_iovec);
> > +
> > +static struct trace_array *skb_trace;
> > +static int __read_mostly trace_skb_source_enabled;
> > +
> > +static void probe_skb_dequeue(const struct sk_buff *skb, int len)
> > +{
> > +	struct ring_buffer_event *event;
> > +	struct trace_skb_event *entry;
> > +	struct trace_array *tr = skb_trace;
> > +	struct net_device *dev;
> > +
> > +	if (!trace_skb_source_enabled)
> > +		return;
> > +
> > +	if (in_interrupt())
> > +		return;
> 
> Is there a reason for not doing this in an interrupt?
> 
Because the idea is to correlate skb consumption to a process.  If we hit this
tracepoint in interrupt context, there is no consuming process to attribute the
copy to, so it doesn't make sense to record.


> > +
> > +	event = trace_buffer_lock_reserve(tr, TRACE_SKB_SOURCE,
> > +					  sizeof(*entry), 0, 0);
> > +	if (!event)
> > +		return;
> > +	entry = ring_buffer_event_data(event);
> > +
> > +	entry->event_data.pid = current->pid;
> 
> Note, the trace_buffer_lock_reserve will record the current pid, thus you 
> do not need to record it here.
> 
> > +	entry->event_data.anid = page_to_nid(virt_to_page(skb->data));
> > +	entry->event_data.cnid = cpu_to_node(smp_processor_id());
> > +	entry->event_data.len = len;
> > +	entry->event_data.rx_queue = skb->queue_mapping;
> > +	entry->event_data.ccpu = smp_processor_id();
> 
> Also, the cpu is recorded in the ring buffer. They are per cpu ring 
> buffers and that determines the cpu it was recorded on.
> 
> > +
> > +	dev = dev_get_by_index(sock_net(skb->sk), skb->iif);
> > +	if (dev) {
> > +		memcpy(entry->event_data.ifname, dev->name, IFNAMSIZ);
> > +		dev_put(dev);
> > +	} else {
> > +		strcpy(entry->event_data.ifname, "Unknown");
> > +	}
> > +
> > +	trace_buffer_unlock_commit(tr, event, 0, 0);
> > +}
> > +
> > +static int tracing_skb_source_register(void)
> > +{
> > +	int ret;
> > +
> > +	ret = register_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
> > +	if (ret)
> > +		pr_info("skb source trace: Couldn't activate dequeue tracepoint");
> > +	
> > +	return ret;
> > +}
> > +
> > +static void start_skb_source_trace(struct trace_array *tr)
> > +{
> > +	trace_skb_source_enabled = 1;
> > +}
> > +
> > +static void stop_skb_source_trace(struct trace_array *tr)
> > +{
> > +	trace_skb_source_enabled = 0;
> > +}
> > +
> > +static void skb_source_trace_reset(struct trace_array *tr)
> > +{
> > +	trace_skb_source_enabled = 0;
> > +	unregister_trace_skb_copy_datagram_iovec(probe_skb_dequeue);
> > +}
> > +
> > +
> > +static int skb_source_trace_init(struct trace_array *tr)
> > +{
> > +	int cpu;
> > +	skb_trace = tr;
> > +
> > +	trace_skb_source_enabled = 1;
> > +	tracing_skb_source_register();
> > +
> > +	for_each_cpu(cpu, cpu_possible_mask)
> > +		tracing_reset(tr, cpu);
> > +	return 0;
> > +}
> > +
> > +static enum print_line_t skb_source_print_line(struct trace_iterator *iter)
> > +{
> > +	int ret = 0;
> > +	struct trace_entry *entry = iter->ent;
> 
> iter->cpu has the cpu that trace was recorded on.
> entry->pid has the pid of the process that did the recording.
> 
OK, I'll clean this up in a subsequent patch, since davem has already applied
the series.

> > +	struct trace_skb_event *event;
> > +	struct skb_record *record;
> > +	struct trace_seq *s = &iter->seq;
> > +
> > +	trace_assign_type(event, entry);
> > +	record = &event->event_data;
> > +	if (entry->type != TRACE_SKB_SOURCE)
> > +		return TRACE_TYPE_UNHANDLED;
> > +
> > +	ret = trace_seq_printf(s, "	%d	%d	%d	%s	%d	%d	%d\n",
> > +			record->pid,
> > +			record->anid,
> > +			record->cnid,
> > +			record->ifname,
> > +			record->rx_queue,
> > +			record->ccpu,
> > +			record->len);
> > +
> > +	if (!ret)
> > +		return TRACE_TYPE_PARTIAL_LINE;
> > +
> > +	return TRACE_TYPE_HANDLED;
> > +}
> > +
> > +static void skb_source_print_header(struct seq_file *s)
> > +{
> > +	seq_puts(s, "#	PID	ANID	CNID	IFC	RXQ	CCPU	LEN\n");
> > +	seq_puts(s, "#	 |	 |	 |	 |	 |	 |	 |\n");
> > +}
> > +
> > +static struct tracer skb_source_tracer __read_mostly =
> > +{
> > +	.name		= "skb_sources",
> > +	.init		= skb_source_trace_init,
> > +	.start		= start_skb_source_trace,
> > +	.stop		= stop_skb_source_trace,
> > +	.reset		= skb_source_trace_reset,
> > +	.print_line	= skb_source_print_line,
> > +	.print_header	= skb_source_print_header,
> > +};
> > +
> > +static int init_skb_source_trace(void)
> > +{
> > +	return register_tracer(&skb_source_tracer);
> > +}
> > +device_initcall(init_skb_source_trace);
> > 
> 
> BTW, why not just do this as events? Or was this just an easy way to 
> communicate with the user space tools?
> 
That's exactly why I did it.  The idea is for me to now write a user space tool
that lets me analyze the events and adjust process scheduling to optimize the rx
path.
Neil
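
For illustration only, here is a minimal sketch of how such a user space tool
might parse one line of the skb_sources output, following the
"PID ANID CNID IFC RXQ CCPU LEN" header and the trace_seq_printf() format in
the quoted patch. The struct and function names (skb_sample, parse_skb_line,
is_cross_node) are hypothetical and not part of the patch, and the reading of
ANID as the node the packet was allocated on and CNID as the node the process
ran on when it copied the data is an assumption based on the cover letter.

```c
#include <assert.h>
#include <stdio.h>

#define IFNAME_MAX 16	/* assumed to match the kernel's IFNAMSIZ */

/* One parsed line of tracer output; fields mirror the header columns. */
struct skb_sample {
	int pid;		/* PID:  process that received the data */
	int anid;		/* ANID: assumed node the skb was allocated on */
	int cnid;		/* CNID: assumed node the process ran on at copy time */
	char ifname[IFNAME_MAX];	/* IFC:  receiving interface name */
	int rx_queue;		/* RXQ:  receive queue index */
	int ccpu;		/* CCPU: cpu column from the trace */
	int len;		/* LEN:  length column from the trace */
};

/*
 * Parse one whitespace-separated trace line into *s.
 * Returns 1 on success, 0 for header ('#') lines or malformed input.
 */
static int parse_skb_line(const char *line, struct skb_sample *s)
{
	assert(line != NULL && s != NULL);

	if (line[0] == '#')
		return 0;

	/* %15s bounds the interface name to IFNAME_MAX - 1 characters. */
	return sscanf(line, " %d %d %d %15s %d %d %d",
		      &s->pid, &s->anid, &s->cnid, s->ifname,
		      &s->rx_queue, &s->ccpu, &s->len) == 7;
}

/* Nonzero when the data was consumed on a different node than ANID. */
static int is_cross_node(const struct skb_sample *s)
{
	return s->anid != s->cnid;
}
```

A tool built on this could read the trace file line by line, count
is_cross_node() hits per PID, and use the dominant ANID as a hint for where
to pin that process (for example with sched_setaffinity()).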

> -- Steve
> 
> 


end of thread, other threads:[~2009-08-18 16:40 UTC | newest]

Thread overview: 19+ messages
2009-08-07 20:21 [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations Neil Horman
2009-08-07 20:28 ` [PATCH 1/3] " Neil Horman
2009-08-07 20:30 ` [PATCH 2/3] " Neil Horman
2009-08-07 20:44 ` [PATCH 3/3] " Neil Horman
2009-08-08 23:13 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v2) Neil Horman
2009-08-08 23:14   ` [PATCH 1/3] " Neil Horman
2009-08-08 23:15   ` [PATCH 2/3] " Neil Horman
2009-08-08 23:15   ` [PATCH 3/3] " Neil Horman
2009-08-13  5:15 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations David Miller
2009-08-13 10:42   ` Neil Horman
2009-08-13 14:59 ` [PATCH 0/3] net: Add ftracer to help optimize process scheduling based on incomming frame allocations (v3) Neil Horman
2009-08-13 15:19   ` [PATCH 1/3] net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3) Neil Horman
2009-08-13 23:28     ` David Miller
2009-08-13 15:20   ` [PATCH 2/3] net: skb ftracer - Add config option to enable new ftracer (v3) Neil Horman
2009-08-13 23:28     ` David Miller
2009-08-13 15:23   ` [PATCH 3/3] net: skb ftracer - Add actual ftrace code to kernel (v3) Neil Horman
2009-08-13 23:28     ` David Miller
2009-08-17 20:55     ` Steven Rostedt
2009-08-18 16:39       ` Neil Horman
