All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Add walltime support for ring-buffer
@ 2009-07-24 10:40 Zhaolei
  2009-07-24 10:42 ` [PATCH] " Zhaolei
                   ` (4 more replies)
  0 siblings, 5 replies; 46+ messages in thread
From: Zhaolei @ 2009-07-24 10:40 UTC (permalink / raw)
  To: Steven Rostedt, Frederic Weisbecker, Ingo Molnar; +Cc: LKML

Hello, Steven

It is a prototype code to make ftrace display walltime of events.

It need to applys on my patch of:
[PATCH v3 1/2] Add function to convert between calendar time and broken-down time for universal use

This code can run but need to be fixed, it only for demostrate what is result
looks like.

I think display walltime is valuable for ftrace, it can provide more
recognizable information to system manager that he can known "when"
event happened.
And it is also helpful for flight-recorder we are doing now, we can get
walltime of events after panic for analyse.

We can get following result on this patch:
 # cd /mnt/debugfs/tracing/
 # echo sched_switch > current_tracer
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             bash-2457  [000]   214.895446:   2457:120:S   + [000]  2457:120:S bash
             bash-2457  [000]   214.898298:   2457:120:R   + [000]     5:115:S events/0
             bash-2457  [000]   214.898684:   2457:120:R ==> [000]     5:115:R events/0
         events/0-5     [000]   214.899458:      5:115:R   + [000]  2455:120:S sshd
         events/0-5     [000]   214.899495:      5:115:S ==> [000]  2455:120:R sshd
 ...
 # echo 1 > options/walltime
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#                  TIMESTAMP  FUNCTION
 #              | |       |                        |         |
           <idle>-0     [000] 2009-07-25 18:01:00.848468:      0:140:R ==> [000]     5:115:R events/0
         events/0-5     [000] 2009-07-25 18:01:00.848523:      5:115:S ==> [000]     0:140:R <idle>
           <idle>-0     [000] 2009-07-25 18:01:01.613479:      0:140:R   + [000]  2455:120:S sshd
           <idle>-0     [000] 2009-07-25 18:01:01.613678:      0:140:R ==> [000]  2455:120:R sshd
             sshd-2455  [000] 2009-07-25 18:01:01.614015:   2455:120:S   + [000]  2455:120:S sshd
 ...

This patch is prototype because we have following things to do:
1: function to get walltime is not near rb_time_stamp()
   So walltime is not absolutely synchronize with trace time.
   But if we place code to get walltime near rb_time_stamp(), it will be called
   on every reserve-event and will make system slow.
2: We can't call getnstimeofday() freely in ring-buffer's code, because
   it is possibility already hold xtime_lock. Prototype code used a ugly way
   to get gid of this problem, maybe we need to ftrace_disable_cpu() before
   every write_seqlock() instead.
3: People maybe change walltime when tracing, but current ring-buffer page's
   walltime can not reflect this change. We need to add tracepoints to watch
   this change, and add a walltime-change event(already reserved in
   ring-buffer.c) to record this change.

I'll continue to finish remain steps, what is your opinion on it?

Thanks
Zhaolei



^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Add walltime support for ring-buffer
  2009-07-24 10:40 [PATCH 0/3] Add walltime support for ring-buffer Zhaolei
@ 2009-07-24 10:42 ` Zhaolei
  2009-07-24 10:47   ` Zhaolei
  2009-07-24 10:43 ` [RFC PATCH 1/3] " Zhaolei
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 46+ messages in thread
From: Zhaolei @ 2009-07-24 10:42 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Frederic Weisbecker, Ingo Molnar, LKML

Note:
It is only a prototype patch to for demostrate what is result
looks like.
There still have many works to do(bug to fix) before apply.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 drivers/oprofile/cpu_buffer.c        |    4 +-
 include/linux/ring_buffer.h          |   12 ++-
 kernel/trace/ring_buffer.c           |  159 ++++++++++++++++++++++++++++++----
 kernel/trace/ring_buffer_benchmark.c |    3 +-
 kernel/trace/trace.c                 |    8 +-
 kernel/trace/trace_functions_graph.c |    8 +-
 kernel/trace/trace_selftest.c        |    2 +-
 7 files changed, 161 insertions(+), 35 deletions(-)

diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
index a7aae24..1b9b36d 100644
--- a/drivers/oprofile/cpu_buffer.c
+++ b/drivers/oprofile/cpu_buffer.c
@@ -187,14 +187,14 @@ int op_cpu_buffer_write_commit(struct op_entry *entry)
 struct op_sample *op_cpu_buffer_read_entry(struct op_entry *entry, int cpu)
 {
 	struct ring_buffer_event *e;
-	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL);
+	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL, NULL);
 	if (e)
 		goto event;
 	if (ring_buffer_swap_cpu(op_ring_buffer_read,
 				 op_ring_buffer_write,
 				 cpu))
 		return NULL;
-	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL);
+	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL, NULL);
 	if (e)
 		goto event;
 	return NULL;
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 7fca716..eafffca 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -134,18 +134,22 @@ int ring_buffer_write(struct ring_buffer *buffer,
 		      unsigned long length, void *data);
 
 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts);
+ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts,
+		 struct timespec *timespec);
 struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts);
+ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts,
+		    struct timespec *timespec);
 
 struct ring_buffer_iter *
 ring_buffer_read_start(struct ring_buffer *buffer, int cpu);
 void ring_buffer_read_finish(struct ring_buffer_iter *iter);
 
 struct ring_buffer_event *
-ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts);
+ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts,
+		      struct timespec *timespec);
 struct ring_buffer_event *
-ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts);
+ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts,
+		 struct timespec *timespec);
 void ring_buffer_iter_reset(struct ring_buffer_iter *iter);
 int ring_buffer_iter_empty(struct ring_buffer_iter *iter);
 
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 51633d7..698e47f 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -318,6 +318,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
 
 struct buffer_data_page {
 	u64		 time_stamp;	/* page time stamp */
+	u32		 walltime[3];	/* page wall time */
 	local_t		 commit;	/* write committed index */
 	unsigned char	 data[];	/* data of buffer page */
 };
@@ -442,6 +443,8 @@ struct ring_buffer_per_cpu {
 	unsigned long			read;
 	u64				write_stamp;
 	u64				read_stamp;
+	u64				read_walltimestamp;
+	u32				read_walltime[3];
 	atomic_t			record_disabled;
 };
 
@@ -469,6 +472,8 @@ struct ring_buffer_iter {
 	unsigned long			head;
 	struct buffer_page		*head_page;
 	u64				read_stamp;
+	u64				read_walltimestamp;
+	u32				read_walltime[3];
 };
 
 /* buffer may be either ring_buffer or ring_buffer_per_cpu */
@@ -1499,6 +1504,14 @@ rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
 static void rb_reset_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	cpu_buffer->read_stamp = cpu_buffer->reader_page->page->time_stamp;
+	cpu_buffer->read_walltimestamp =
+		cpu_buffer->reader_page->page->time_stamp;
+	cpu_buffer->read_walltime[0] =
+		cpu_buffer->reader_page->page->walltime[0];
+	cpu_buffer->read_walltime[1] =
+		cpu_buffer->reader_page->page->walltime[1];
+	cpu_buffer->read_walltime[2] =
+		cpu_buffer->reader_page->page->walltime[2];
 	cpu_buffer->reader_page->read = 0;
 }
 
@@ -1518,6 +1531,10 @@ static void rb_inc_iter(struct ring_buffer_iter *iter)
 		rb_inc_page(cpu_buffer, &iter->head_page);
 
 	iter->read_stamp = iter->head_page->page->time_stamp;
+	iter->read_walltimestamp = iter->head_page->page->time_stamp;
+	iter->read_walltime[0] = iter->head_page->page->walltime[0];
+	iter->read_walltime[1] = iter->head_page->page->walltime[1];
+	iter->read_walltime[2] = iter->head_page->page->walltime[2];
 	iter->head = 0;
 }
 
@@ -1927,8 +1944,48 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 	 * If this is the first commit on the page, then update
 	 * its timestamp.
 	 */
-	if (!tail)
+	if (!tail) {
+		struct timespec timespec;
 		tail_page->page->time_stamp = *ts;
+		/*
+		 * Fixme:
+		 * 1: Get walltime close to get timestamp
+		 * 2: don't need to set timestamp in rb_move_tail(), timestamp
+		 *    will be rewrited here
+		 */
+		/*
+		 * Fixme:
+		 * We can't call getnstimeofday() here,
+		 * because when when a function called with xtime_lock holded,
+		 * kernel will run to here by function_trace, then
+		 * getnstimeofday() will run into infinite loop, fox ex:
+		 * tick_do_update_jiffies64()  * write_seqlock(xtime_lock) *
+		 * ->do_timer()
+		 * ->update_wall_time()
+		 * ->update_xtime_cache()
+		 * ->ftrace_call()  * here into function trace *
+		 * ->function_test_events_call()
+		 * ->trace_current_buffer_lock_reserve()
+		 * -> ...
+		 * -> here
+		 *
+		 * So we use get_seconds() when xtime_lock is holded,
+		 * it is only a temporary way just to get walltime function
+		 * work, we must fix it.
+		 *
+		 * Another way is call ftrace_disable_cpu() before
+		 * write_seqlock().
+		 */
+		if (xtime_lock.sequence & 1) {
+			timespec.tv_sec = get_seconds();
+			timespec.tv_nsec = 0;
+		} else {
+			getnstimeofday(&timespec);
+		}
+		tail_page->page->walltime[0] = timespec.tv_nsec;
+		*(__kernel_time_t *)(tail_page->page->walltime + 1) =
+			timespec.tv_sec;
+	}
 
 	return event;
 }
@@ -2674,10 +2731,19 @@ static void rb_iter_reset(struct ring_buffer_iter *iter)
 		iter->head_page = cpu_buffer->reader_page;
 		iter->head = cpu_buffer->reader_page->read;
 	}
-	if (iter->head)
+	if (iter->head) {
 		iter->read_stamp = cpu_buffer->read_stamp;
-	else
+		iter->read_walltimestamp = cpu_buffer->read_walltimestamp;
+		iter->read_walltime[0] = cpu_buffer->read_walltime[0];
+		iter->read_walltime[1] = cpu_buffer->read_walltime[1];
+		iter->read_walltime[2] = cpu_buffer->read_walltime[2];
+	} else {
 		iter->read_stamp = iter->head_page->page->time_stamp;
+		iter->read_walltimestamp = iter->head_page->page->time_stamp;
+		iter->read_walltime[0] = iter->head_page->page->walltime[0];
+		iter->read_walltime[1] = iter->head_page->page->walltime[1];
+		iter->read_walltime[2] = iter->head_page->page->walltime[2];
+	}
 }
 
 /**
@@ -2780,6 +2846,21 @@ rb_update_iter_read_stamp(struct ring_buffer_iter *iter,
 	return;
 }
 
+static void rb_cal_walltime(u32 *walltime, u64 ts, u32 *base_walltime,
+			    u64 base_ts)
+{
+	u64 sec;
+	u64 nsec;
+	sec = *(__kernel_time_t *)(base_walltime + 1);
+	nsec = base_walltime[0] + ts - base_ts;
+	while (nsec >= NSEC_PER_SEC) {
+		nsec -= NSEC_PER_SEC;
+		++sec;
+	}
+	walltime[0] = nsec;
+	*(__kernel_time_t *)(walltime + 1) = sec;
+}
+
 static struct buffer_page *
 rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 {
@@ -2953,12 +3034,14 @@ static void rb_advance_iter(struct ring_buffer_iter *iter)
 }
 
 static struct ring_buffer_event *
-rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts,
+	       struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
 	struct buffer_page *reader;
 	int nr_loops = 0;
+	u64 timestamp;
 
 	cpu_buffer = buffer->buffers[cpu];
 
@@ -3004,10 +3087,21 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 		goto again;
 
 	case RINGBUF_TYPE_DATA:
-		if (ts) {
-			*ts = cpu_buffer->read_stamp + event->time_delta;
+		if (ts || timespec) {
+			timestamp = cpu_buffer->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(buffer,
-							 cpu_buffer->cpu, ts);
+							 cpu_buffer->cpu,
+							 &timestamp);
+		}
+		if (ts)
+			*ts = timestamp;
+		if (timespec) {
+			u32 walltime[3];
+			rb_cal_walltime(walltime, timestamp,
+				cpu_buffer->read_walltime,
+				cpu_buffer->read_walltimestamp);
+			timespec->tv_nsec = walltime[0];
+			timespec->tv_sec = *(__kernel_time_t *)(walltime + 1);
 		}
 		return event;
 
@@ -3020,12 +3114,13 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 EXPORT_SYMBOL_GPL(ring_buffer_peek);
 
 static struct ring_buffer_event *
-rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
+rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts, struct timespec *timespec)
 {
 	struct ring_buffer *buffer;
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
 	int nr_loops = 0;
+	u64 timestamp;
 
 	if (ring_buffer_iter_empty(iter))
 		return NULL;
@@ -3070,10 +3165,20 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 		goto again;
 
 	case RINGBUF_TYPE_DATA:
-		if (ts) {
-			*ts = iter->read_stamp + event->time_delta;
+		if (ts || timespec) {
+			timestamp = iter->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(buffer,
-							 cpu_buffer->cpu, ts);
+							 cpu_buffer->cpu,
+							 &timestamp);
+		}
+		if (ts)
+			*ts = timestamp;
+		if (timespec) {
+			u32 walltime[3];
+			rb_cal_walltime(walltime, timestamp,
+				iter->read_walltime, iter->read_walltimestamp);
+			timespec->tv_nsec = walltime[0];
+			timespec->tv_sec = *(__kernel_time_t *)(walltime + 1);
 		}
 		return event;
 
@@ -3110,7 +3215,8 @@ static inline int rb_ok_to_lock(void)
  * not consume the data.
  */
 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts,
+		 struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu];
 	struct ring_buffer_event *event;
@@ -3125,7 +3231,8 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 	local_irq_save(flags);
 	if (dolock)
 		spin_lock(&cpu_buffer->reader_lock);
-	event = rb_buffer_peek(buffer, cpu, ts);
+	event = rb_buffer_peek(buffer, cpu, ts, timespec);
+
 	if (dolock)
 		spin_unlock(&cpu_buffer->reader_lock);
 	local_irq_restore(flags);
@@ -3147,7 +3254,8 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
  * not increment the iterator.
  */
 struct ring_buffer_event *
-ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
+ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts,
+		      struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer = iter->cpu_buffer;
 	struct ring_buffer_event *event;
@@ -3155,7 +3263,7 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 
  again:
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
+	event = rb_iter_peek(iter, ts, timespec);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	if (event && event->type_len == RINGBUF_TYPE_PADDING) {
@@ -3175,7 +3283,8 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
  * and eventually empty the ring buffer if the producer is slower.
  */
 struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
+ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts,
+		    struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event = NULL;
@@ -3196,7 +3305,7 @@ ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
 	if (dolock)
 		spin_lock(&cpu_buffer->reader_lock);
 
-	event = rb_buffer_peek(buffer, cpu, ts);
+	event = rb_buffer_peek(buffer, cpu, ts, timespec);
 	if (!event)
 		goto out_unlock;
 
@@ -3287,7 +3396,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_read_finish);
  * This reads the next event in the ring buffer and increments the iterator.
  */
 struct ring_buffer_event *
-ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts)
+ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts,
+		 struct timespec *timespec)
 {
 	struct ring_buffer_event *event;
 	struct ring_buffer_per_cpu *cpu_buffer = iter->cpu_buffer;
@@ -3295,7 +3405,7 @@ ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts)
 
  again:
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
+	event = rb_iter_peek(iter, ts, timespec);
 	if (!event)
 		goto out;
 
@@ -3353,6 +3463,10 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
 
 	cpu_buffer->write_stamp = 0;
 	cpu_buffer->read_stamp = 0;
+	cpu_buffer->read_walltimestamp = 0;
+	cpu_buffer->read_walltime[0] = 0;
+	cpu_buffer->read_walltime[1] = 0;
+	cpu_buffer->read_walltime[2] = 0;
 
 	rb_head_page_activate(cpu_buffer);
 }
@@ -3620,6 +3734,7 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 	unsigned int commit;
 	unsigned int read;
 	u64 save_timestamp;
+	u32 save_walltime[3];
 	int ret = -1;
 
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
@@ -3679,6 +3794,9 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 
 		/* save the current timestamp, since the user will need it */
 		save_timestamp = cpu_buffer->read_stamp;
+		rb_cal_walltime(save_walltime, cpu_buffer->read_stamp,
+				cpu_buffer->read_walltime,
+				cpu_buffer->read_walltimestamp);
 
 		/* Need to copy one event at a time */
 		do {
@@ -3697,6 +3815,9 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 		/* update bpage */
 		local_set(&bpage->commit, pos);
 		bpage->time_stamp = save_timestamp;
+		bpage->walltime[0] = save_walltime[0];
+		bpage->walltime[1] = save_walltime[1];
+		bpage->walltime[2] = save_walltime[2];
 
 		/* we copied everything to the beginning */
 		read = 0;
diff --git a/kernel/trace/ring_buffer_benchmark.c b/kernel/trace/ring_buffer_benchmark.c
index 573d3cc..059a728 100644
--- a/kernel/trace/ring_buffer_benchmark.c
+++ b/kernel/trace/ring_buffer_benchmark.c
@@ -57,8 +57,9 @@ static enum event_status read_event(int cpu)
 	struct ring_buffer_event *event;
 	int *entry;
 	u64 ts;
+	struct timespec timespec;
 
-	event = ring_buffer_consume(buffer, cpu, &ts);
+	event = ring_buffer_consume(buffer, cpu, &ts, &timespec);
 	if (!event)
 		return EVENT_DROPPED;
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index e30e6b1..22787b3 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1393,7 +1393,7 @@ static void trace_iterator_increment(struct trace_iterator *iter)
 
 	iter->idx++;
 	if (iter->buffer_iter[iter->cpu])
-		ring_buffer_read(iter->buffer_iter[iter->cpu], NULL);
+		ring_buffer_read(iter->buffer_iter[iter->cpu], NULL, NULL);
 
 	ftrace_enable_cpu();
 }
@@ -1408,9 +1408,9 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts)
 	ftrace_disable_cpu();
 
 	if (buf_iter)
-		event = ring_buffer_iter_peek(buf_iter, ts);
+		event = ring_buffer_iter_peek(buf_iter, ts, NULL);
 	else
-		event = ring_buffer_peek(iter->tr->buffer, cpu, ts);
+		event = ring_buffer_peek(iter->tr->buffer, cpu, ts, NULL);
 
 	ftrace_enable_cpu();
 
@@ -1489,7 +1489,7 @@ static void trace_consume(struct trace_iterator *iter)
 {
 	/* Don't allow ftrace to trace into the ring buffers */
 	ftrace_disable_cpu();
-	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts);
+	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts, NULL);
 	ftrace_enable_cpu();
 }
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index abf7c4a..03520a5 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -340,12 +340,12 @@ get_return_for_leaf(struct trace_iterator *iter,
 
 	/* First peek to compare current entry and the next one */
 	if (ring_iter)
-		event = ring_buffer_iter_peek(ring_iter, NULL);
+		event = ring_buffer_iter_peek(ring_iter, NULL, NULL);
 	else {
 	/* We need to consume the current entry to see the next one */
-		ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL);
+		ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL, NULL);
 		event = ring_buffer_peek(iter->tr->buffer, iter->cpu,
-					NULL);
+					NULL, NULL);
 	}
 
 	if (!event)
@@ -362,7 +362,7 @@ get_return_for_leaf(struct trace_iterator *iter,
 
 	/* this is a leaf, now advance the iterator */
 	if (ring_iter)
-		ring_buffer_read(ring_iter, NULL);
+		ring_buffer_read(ring_iter, NULL, NULL);
 
 	return next;
 }
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 71f2edb..e6fa0d1 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -29,7 +29,7 @@ static int trace_test_buffer_cpu(struct trace_array *tr, int cpu)
 	struct trace_entry *entry;
 	unsigned int loops = 0;
 
-	while ((event = ring_buffer_consume(tr->buffer, cpu, NULL))) {
+	while ((event = ring_buffer_consume(tr->buffer, cpu, NULL, NULL))) {
 		entry = ring_buffer_event_data(event);
 
 		/*
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [RFC PATCH 1/3] Add walltime support for ring-buffer
  2009-07-24 10:40 [PATCH 0/3] Add walltime support for ring-buffer Zhaolei
  2009-07-24 10:42 ` [PATCH] " Zhaolei
@ 2009-07-24 10:43 ` Zhaolei
  2009-07-24 10:43 ` [RFC PATCH 2/3] Apply walltime-supporting functions to trace system Zhaolei
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-07-24 10:43 UTC (permalink / raw)
  To: Zhaolei; +Cc: Steven Rostedt, Frederic Weisbecker, Ingo Molnar, LKML

Note:
It is only a prototype patch to for demostrate what is result
looks like.
There still have many works to do(bug to fix) before apply.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 drivers/oprofile/cpu_buffer.c        |    4 +-
 include/linux/ring_buffer.h          |   12 ++-
 kernel/trace/ring_buffer.c           |  159 ++++++++++++++++++++++++++++++----
 kernel/trace/ring_buffer_benchmark.c |    3 +-
 kernel/trace/trace.c                 |    8 +-
 kernel/trace/trace_functions_graph.c |    8 +-
 kernel/trace/trace_selftest.c        |    2 +-
 7 files changed, 161 insertions(+), 35 deletions(-)

diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
index a7aae24..1b9b36d 100644
--- a/drivers/oprofile/cpu_buffer.c
+++ b/drivers/oprofile/cpu_buffer.c
@@ -187,14 +187,14 @@ int op_cpu_buffer_write_commit(struct op_entry *entry)
 struct op_sample *op_cpu_buffer_read_entry(struct op_entry *entry, int cpu)
 {
 	struct ring_buffer_event *e;
-	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL);
+	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL, NULL);
 	if (e)
 		goto event;
 	if (ring_buffer_swap_cpu(op_ring_buffer_read,
 				 op_ring_buffer_write,
 				 cpu))
 		return NULL;
-	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL);
+	e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL, NULL);
 	if (e)
 		goto event;
 	return NULL;
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 7fca716..eafffca 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -134,18 +134,22 @@ int ring_buffer_write(struct ring_buffer *buffer,
 		      unsigned long length, void *data);
 
 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts);
+ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts,
+		 struct timespec *timespec);
 struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts);
+ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts,
+		    struct timespec *timespec);
 
 struct ring_buffer_iter *
 ring_buffer_read_start(struct ring_buffer *buffer, int cpu);
 void ring_buffer_read_finish(struct ring_buffer_iter *iter);
 
 struct ring_buffer_event *
-ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts);
+ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts,
+		      struct timespec *timespec);
 struct ring_buffer_event *
-ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts);
+ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts,
+		 struct timespec *timespec);
 void ring_buffer_iter_reset(struct ring_buffer_iter *iter);
 int ring_buffer_iter_empty(struct ring_buffer_iter *iter);
 
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 51633d7..698e47f 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -318,6 +318,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
 
 struct buffer_data_page {
 	u64		 time_stamp;	/* page time stamp */
+	u32		 walltime[3];	/* page wall time */
 	local_t		 commit;	/* write committed index */
 	unsigned char	 data[];	/* data of buffer page */
 };
@@ -442,6 +443,8 @@ struct ring_buffer_per_cpu {
 	unsigned long			read;
 	u64				write_stamp;
 	u64				read_stamp;
+	u64				read_walltimestamp;
+	u32				read_walltime[3];
 	atomic_t			record_disabled;
 };
 
@@ -469,6 +472,8 @@ struct ring_buffer_iter {
 	unsigned long			head;
 	struct buffer_page		*head_page;
 	u64				read_stamp;
+	u64				read_walltimestamp;
+	u32				read_walltime[3];
 };
 
 /* buffer may be either ring_buffer or ring_buffer_per_cpu */
@@ -1499,6 +1504,14 @@ rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
 static void rb_reset_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 {
 	cpu_buffer->read_stamp = cpu_buffer->reader_page->page->time_stamp;
+	cpu_buffer->read_walltimestamp =
+		cpu_buffer->reader_page->page->time_stamp;
+	cpu_buffer->read_walltime[0] =
+		cpu_buffer->reader_page->page->walltime[0];
+	cpu_buffer->read_walltime[1] =
+		cpu_buffer->reader_page->page->walltime[1];
+	cpu_buffer->read_walltime[2] =
+		cpu_buffer->reader_page->page->walltime[2];
 	cpu_buffer->reader_page->read = 0;
 }
 
@@ -1518,6 +1531,10 @@ static void rb_inc_iter(struct ring_buffer_iter *iter)
 		rb_inc_page(cpu_buffer, &iter->head_page);
 
 	iter->read_stamp = iter->head_page->page->time_stamp;
+	iter->read_walltimestamp = iter->head_page->page->time_stamp;
+	iter->read_walltime[0] = iter->head_page->page->walltime[0];
+	iter->read_walltime[1] = iter->head_page->page->walltime[1];
+	iter->read_walltime[2] = iter->head_page->page->walltime[2];
 	iter->head = 0;
 }
 
@@ -1927,8 +1944,48 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 	 * If this is the first commit on the page, then update
 	 * its timestamp.
 	 */
-	if (!tail)
+	if (!tail) {
+		struct timespec timespec;
 		tail_page->page->time_stamp = *ts;
+		/*
+		 * Fixme:
+		 * 1: Get walltime close to get timestamp
+		 * 2: don't need to set timestamp in rb_move_tail(), timestamp
+		 *    will be rewrited here
+		 */
+		/*
+		 * Fixme:
+		 * We can't call getnstimeofday() here,
+		 * because when when a function called with xtime_lock holded,
+		 * kernel will run to here by function_trace, then
+		 * getnstimeofday() will run into infinite loop, fox ex:
+		 * tick_do_update_jiffies64()  * write_seqlock(xtime_lock) *
+		 * ->do_timer()
+		 * ->update_wall_time()
+		 * ->update_xtime_cache()
+		 * ->ftrace_call()  * here into function trace *
+		 * ->function_test_events_call()
+		 * ->trace_current_buffer_lock_reserve()
+		 * -> ...
+		 * -> here
+		 *
+		 * So we use get_seconds() when xtime_lock is holded,
+		 * it is only a temporary way just to get walltime function
+		 * work, we must fix it.
+		 *
+		 * Another way is call ftrace_disable_cpu() before
+		 * write_seqlock().
+		 */
+		if (xtime_lock.sequence & 1) {
+			timespec.tv_sec = get_seconds();
+			timespec.tv_nsec = 0;
+		} else {
+			getnstimeofday(&timespec);
+		}
+		tail_page->page->walltime[0] = timespec.tv_nsec;
+		*(__kernel_time_t *)(tail_page->page->walltime + 1) =
+			timespec.tv_sec;
+	}
 
 	return event;
 }
@@ -2674,10 +2731,19 @@ static void rb_iter_reset(struct ring_buffer_iter *iter)
 		iter->head_page = cpu_buffer->reader_page;
 		iter->head = cpu_buffer->reader_page->read;
 	}
-	if (iter->head)
+	if (iter->head) {
 		iter->read_stamp = cpu_buffer->read_stamp;
-	else
+		iter->read_walltimestamp = cpu_buffer->read_walltimestamp;
+		iter->read_walltime[0] = cpu_buffer->read_walltime[0];
+		iter->read_walltime[1] = cpu_buffer->read_walltime[1];
+		iter->read_walltime[2] = cpu_buffer->read_walltime[2];
+	} else {
 		iter->read_stamp = iter->head_page->page->time_stamp;
+		iter->read_walltimestamp = iter->head_page->page->time_stamp;
+		iter->read_walltime[0] = iter->head_page->page->walltime[0];
+		iter->read_walltime[1] = iter->head_page->page->walltime[1];
+		iter->read_walltime[2] = iter->head_page->page->walltime[2];
+	}
 }
 
 /**
@@ -2780,6 +2846,21 @@ rb_update_iter_read_stamp(struct ring_buffer_iter *iter,
 	return;
 }
 
+static void rb_cal_walltime(u32 *walltime, u64 ts, u32 *base_walltime,
+			    u64 base_ts)
+{
+	u64 sec;
+	u64 nsec;
+	sec = *(__kernel_time_t *)(base_walltime + 1);
+	nsec = base_walltime[0] + ts - base_ts;
+	while (nsec >= NSEC_PER_SEC) {
+		nsec -= NSEC_PER_SEC;
+		++sec;
+	}
+	walltime[0] = nsec;
+	*(__kernel_time_t *)(walltime + 1) = sec;
+}
+
 static struct buffer_page *
 rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
 {
@@ -2953,12 +3034,14 @@ static void rb_advance_iter(struct ring_buffer_iter *iter)
 }
 
 static struct ring_buffer_event *
-rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts,
+	       struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
 	struct buffer_page *reader;
 	int nr_loops = 0;
+	u64 timestamp;
 
 	cpu_buffer = buffer->buffers[cpu];
 
@@ -3004,10 +3087,21 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 		goto again;
 
 	case RINGBUF_TYPE_DATA:
-		if (ts) {
-			*ts = cpu_buffer->read_stamp + event->time_delta;
+		if (ts || timespec) {
+			timestamp = cpu_buffer->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(buffer,
-							 cpu_buffer->cpu, ts);
+							 cpu_buffer->cpu,
+							 &timestamp);
+		}
+		if (ts)
+			*ts = timestamp;
+		if (timespec) {
+			u32 walltime[3];
+			rb_cal_walltime(walltime, timestamp,
+				cpu_buffer->read_walltime,
+				cpu_buffer->read_walltimestamp);
+			timespec->tv_nsec = walltime[0];
+			timespec->tv_sec = *(__kernel_time_t *)(walltime + 1);
 		}
 		return event;
 
@@ -3020,12 +3114,13 @@ rb_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 EXPORT_SYMBOL_GPL(ring_buffer_peek);
 
 static struct ring_buffer_event *
-rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
+rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts, struct timespec *timespec)
 {
 	struct ring_buffer *buffer;
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event;
 	int nr_loops = 0;
+	u64 timestamp;
 
 	if (ring_buffer_iter_empty(iter))
 		return NULL;
@@ -3070,10 +3165,20 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 		goto again;
 
 	case RINGBUF_TYPE_DATA:
-		if (ts) {
-			*ts = iter->read_stamp + event->time_delta;
+		if (ts || timespec) {
+			timestamp = iter->read_stamp + event->time_delta;
 			ring_buffer_normalize_time_stamp(buffer,
-							 cpu_buffer->cpu, ts);
+							 cpu_buffer->cpu,
+							 &timestamp);
+		}
+		if (ts)
+			*ts = timestamp;
+		if (timespec) {
+			u32 walltime[3];
+			rb_cal_walltime(walltime, timestamp,
+				iter->read_walltime, iter->read_walltimestamp);
+			timespec->tv_nsec = walltime[0];
+			timespec->tv_sec = *(__kernel_time_t *)(walltime + 1);
 		}
 		return event;
 
@@ -3110,7 +3215,8 @@ static inline int rb_ok_to_lock(void)
  * not consume the data.
  */
 struct ring_buffer_event *
-ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
+ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts,
+		 struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu];
 	struct ring_buffer_event *event;
@@ -3125,7 +3231,8 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
 	local_irq_save(flags);
 	if (dolock)
 		spin_lock(&cpu_buffer->reader_lock);
-	event = rb_buffer_peek(buffer, cpu, ts);
+	event = rb_buffer_peek(buffer, cpu, ts, timespec);
+
 	if (dolock)
 		spin_unlock(&cpu_buffer->reader_lock);
 	local_irq_restore(flags);
@@ -3147,7 +3254,8 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
  * not increment the iterator.
  */
 struct ring_buffer_event *
-ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
+ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts,
+		      struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer = iter->cpu_buffer;
 	struct ring_buffer_event *event;
@@ -3155,7 +3263,7 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 
  again:
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
+	event = rb_iter_peek(iter, ts, timespec);
 	spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
 
 	if (event && event->type_len == RINGBUF_TYPE_PADDING) {
@@ -3175,7 +3283,8 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
  * and eventually empty the ring buffer if the producer is slower.
  */
 struct ring_buffer_event *
-ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
+ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts,
+		    struct timespec *timespec)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
 	struct ring_buffer_event *event = NULL;
@@ -3196,7 +3305,7 @@ ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
 	if (dolock)
 		spin_lock(&cpu_buffer->reader_lock);
 
-	event = rb_buffer_peek(buffer, cpu, ts);
+	event = rb_buffer_peek(buffer, cpu, ts, timespec);
 	if (!event)
 		goto out_unlock;
 
@@ -3287,7 +3396,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_read_finish);
  * This reads the next event in the ring buffer and increments the iterator.
  */
 struct ring_buffer_event *
-ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts)
+ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts,
+		 struct timespec *timespec)
 {
 	struct ring_buffer_event *event;
 	struct ring_buffer_per_cpu *cpu_buffer = iter->cpu_buffer;
@@ -3295,7 +3405,7 @@ ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts)
 
  again:
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
-	event = rb_iter_peek(iter, ts);
+	event = rb_iter_peek(iter, ts, timespec);
 	if (!event)
 		goto out;
 
@@ -3353,6 +3463,10 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
 
 	cpu_buffer->write_stamp = 0;
 	cpu_buffer->read_stamp = 0;
+	cpu_buffer->read_walltimestamp = 0;
+	cpu_buffer->read_walltime[0] = 0;
+	cpu_buffer->read_walltime[1] = 0;
+	cpu_buffer->read_walltime[2] = 0;
 
 	rb_head_page_activate(cpu_buffer);
 }
@@ -3620,6 +3734,7 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 	unsigned int commit;
 	unsigned int read;
 	u64 save_timestamp;
+	u32 save_walltime[3];
 	int ret = -1;
 
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
@@ -3679,6 +3794,9 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 
 		/* save the current timestamp, since the user will need it */
 		save_timestamp = cpu_buffer->read_stamp;
+		rb_cal_walltime(save_walltime, cpu_buffer->read_stamp,
+				cpu_buffer->read_walltime,
+				cpu_buffer->read_walltimestamp);
 
 		/* Need to copy one event at a time */
 		do {
@@ -3697,6 +3815,9 @@ int ring_buffer_read_page(struct ring_buffer *buffer,
 		/* update bpage */
 		local_set(&bpage->commit, pos);
 		bpage->time_stamp = save_timestamp;
+		bpage->walltime[0] = save_walltime[0];
+		bpage->walltime[1] = save_walltime[1];
+		bpage->walltime[2] = save_walltime[2];
 
 		/* we copied everything to the beginning */
 		read = 0;
diff --git a/kernel/trace/ring_buffer_benchmark.c b/kernel/trace/ring_buffer_benchmark.c
index 573d3cc..059a728 100644
--- a/kernel/trace/ring_buffer_benchmark.c
+++ b/kernel/trace/ring_buffer_benchmark.c
@@ -57,8 +57,9 @@ static enum event_status read_event(int cpu)
 	struct ring_buffer_event *event;
 	int *entry;
 	u64 ts;
+	struct timespec timespec;
 
-	event = ring_buffer_consume(buffer, cpu, &ts);
+	event = ring_buffer_consume(buffer, cpu, &ts, &timespec);
 	if (!event)
 		return EVENT_DROPPED;
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index e30e6b1..22787b3 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1393,7 +1393,7 @@ static void trace_iterator_increment(struct trace_iterator *iter)
 
 	iter->idx++;
 	if (iter->buffer_iter[iter->cpu])
-		ring_buffer_read(iter->buffer_iter[iter->cpu], NULL);
+		ring_buffer_read(iter->buffer_iter[iter->cpu], NULL, NULL);
 
 	ftrace_enable_cpu();
 }
@@ -1408,9 +1408,9 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts)
 	ftrace_disable_cpu();
 
 	if (buf_iter)
-		event = ring_buffer_iter_peek(buf_iter, ts);
+		event = ring_buffer_iter_peek(buf_iter, ts, NULL);
 	else
-		event = ring_buffer_peek(iter->tr->buffer, cpu, ts);
+		event = ring_buffer_peek(iter->tr->buffer, cpu, ts, NULL);
 
 	ftrace_enable_cpu();
 
@@ -1489,7 +1489,7 @@ static void trace_consume(struct trace_iterator *iter)
 {
 	/* Don't allow ftrace to trace into the ring buffers */
 	ftrace_disable_cpu();
-	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts);
+	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts, NULL);
 	ftrace_enable_cpu();
 }
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index abf7c4a..03520a5 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -340,12 +340,12 @@ get_return_for_leaf(struct trace_iterator *iter,
 
 	/* First peek to compare current entry and the next one */
 	if (ring_iter)
-		event = ring_buffer_iter_peek(ring_iter, NULL);
+		event = ring_buffer_iter_peek(ring_iter, NULL, NULL);
 	else {
 	/* We need to consume the current entry to see the next one */
-		ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL);
+		ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL, NULL);
 		event = ring_buffer_peek(iter->tr->buffer, iter->cpu,
-					NULL);
+					NULL, NULL);
 	}
 
 	if (!event)
@@ -362,7 +362,7 @@ get_return_for_leaf(struct trace_iterator *iter,
 
 	/* this is a leaf, now advance the iterator */
 	if (ring_iter)
-		ring_buffer_read(ring_iter, NULL);
+		ring_buffer_read(ring_iter, NULL, NULL);
 
 	return next;
 }
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 71f2edb..e6fa0d1 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -29,7 +29,7 @@ static int trace_test_buffer_cpu(struct trace_array *tr, int cpu)
 	struct trace_entry *entry;
 	unsigned int loops = 0;
 
-	while ((event = ring_buffer_consume(tr->buffer, cpu, NULL))) {
+	while ((event = ring_buffer_consume(tr->buffer, cpu, NULL, NULL))) {
 		entry = ring_buffer_event_data(event);
 
 		/*
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [RFC PATCH 2/3] Apply walltime-supporting functions to trace system
  2009-07-24 10:40 [PATCH 0/3] Add walltime support for ring-buffer Zhaolei
  2009-07-24 10:42 ` [PATCH] " Zhaolei
  2009-07-24 10:43 ` [RFC PATCH 1/3] " Zhaolei
@ 2009-07-24 10:43 ` Zhaolei
  2009-07-24 10:44 ` [RFC PATCH 3/3] Make ftrace display walltime in output Zhaolei
  2009-07-24 13:05 ` [PATCH 0/3] Add walltime support for ring-buffer Steven Rostedt
  4 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-07-24 10:43 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Frederic Weisbecker, Ingo Molnar, LKML

We modified ring-buffer function's arguments to support walltime recording.
There function's caller site needs to be updated to support it.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/linux/ftrace_event.h |    1 +
 kernel/trace/trace.c         |   25 +++++++++++++++----------
 kernel/trace/trace.h         |    3 ++-
 kernel/trace/trace_output.c  |    2 +-
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 5c093ff..4ffd87a 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -58,6 +58,7 @@ struct trace_iterator {
 	struct trace_entry	*ent;
 	int			cpu;
 	u64			ts;
+	struct timespec		timespec;
 
 	loff_t			pos;
 	long			idx;
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 22787b3..af25bec 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1399,7 +1399,8 @@ static void trace_iterator_increment(struct trace_iterator *iter)
 }
 
 static struct trace_entry *
-peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts)
+peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts,
+		struct timespec *timespec)
 {
 	struct ring_buffer_event *event;
 	struct ring_buffer_iter *buf_iter = iter->buffer_iter[cpu];
@@ -1408,9 +1409,9 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts)
 	ftrace_disable_cpu();
 
 	if (buf_iter)
-		event = ring_buffer_iter_peek(buf_iter, ts, NULL);
+		event = ring_buffer_iter_peek(buf_iter, ts, timespec);
 	else
-		event = ring_buffer_peek(iter->tr->buffer, cpu, ts, NULL);
+		event = ring_buffer_peek(iter->tr->buffer, cpu, ts, timespec);
 
 	ftrace_enable_cpu();
 
@@ -1418,7 +1419,8 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts)
 }
 
 static struct trace_entry *
-__find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts)
+__find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts,
+		  struct timespec *timespec)
 {
 	struct ring_buffer *buffer = iter->tr->buffer;
 	struct trace_entry *ent, *next = NULL;
@@ -1434,7 +1436,7 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts)
 	if (cpu_file > TRACE_PIPE_ALL_CPU) {
 		if (ring_buffer_empty_cpu(buffer, cpu_file))
 			return NULL;
-		ent = peek_next_entry(iter, cpu_file, ent_ts);
+		ent = peek_next_entry(iter, cpu_file, ent_ts, timespec);
 		if (ent_cpu)
 			*ent_cpu = cpu_file;
 
@@ -1446,7 +1448,7 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts)
 		if (ring_buffer_empty_cpu(buffer, cpu))
 			continue;
 
-		ent = peek_next_entry(iter, cpu, &ts);
+		ent = peek_next_entry(iter, cpu, &ts, timespec);
 
 		/*
 		 * Pick the entry with the smallest timestamp:
@@ -1469,15 +1471,17 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts)
 
 /* Find the next real entry, without updating the iterator itself */
 struct trace_entry *trace_find_next_entry(struct trace_iterator *iter,
-					  int *ent_cpu, u64 *ent_ts)
+					  int *ent_cpu, u64 *ent_ts,
+					  struct timespec *timespec)
 {
-	return __find_next_entry(iter, ent_cpu, ent_ts);
+	return __find_next_entry(iter, ent_cpu, ent_ts, timespec);
 }
 
 /* Find the next real entry, and increment the iterator to the next entry */
 static void *find_next_entry_inc(struct trace_iterator *iter)
 {
-	iter->ent = __find_next_entry(iter, &iter->cpu, &iter->ts);
+	iter->ent = __find_next_entry(iter, &iter->cpu, &iter->ts,
+				      &iter->timespec);
 
 	if (iter->ent)
 		trace_iterator_increment(iter);
@@ -1489,7 +1493,8 @@ static void trace_consume(struct trace_iterator *iter)
 {
 	/* Don't allow ftrace to trace into the ring buffers */
 	ftrace_disable_cpu();
-	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts, NULL);
+	ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts,
+			    &iter->timespec);
 	ftrace_enable_cpu();
 }
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 06886a0..fb3dcc8 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -452,7 +452,8 @@ struct trace_entry *tracing_get_trace_entry(struct trace_array *tr,
 						struct trace_array_cpu *data);
 
 struct trace_entry *trace_find_next_entry(struct trace_iterator *iter,
-					  int *ent_cpu, u64 *ent_ts);
+					  int *ent_cpu, u64 *ent_ts,
+					  struct timespec *timtspec);
 
 void tracing_generic_entry_update(struct trace_entry *entry,
 				  unsigned long flags,
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index e0c2545..f5c7f0c 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -519,7 +519,7 @@ int trace_print_lat_context(struct trace_iterator *iter)
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent,
 			   *next_entry = trace_find_next_entry(iter, NULL,
-							       &next_ts);
+							       &next_ts, NULL);
 	unsigned long verbose = (trace_flags & TRACE_ITER_VERBOSE);
 	unsigned long abs_usecs = ns2usecs(iter->ts - iter->tr->time_start);
 	unsigned long rel_usecs;
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [RFC PATCH 3/3] Make ftrace display walltime in output
  2009-07-24 10:40 [PATCH 0/3] Add walltime support for ring-buffer Zhaolei
                   ` (2 preceding siblings ...)
  2009-07-24 10:43 ` [RFC PATCH 2/3] Apply walltime-supporting functions to trace system Zhaolei
@ 2009-07-24 10:44 ` Zhaolei
  2009-07-24 13:05 ` [PATCH 0/3] Add walltime support for ring-buffer Steven Rostedt
  4 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-07-24 10:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Frederic Weisbecker, Ingo Molnar, LKML

Now ring-buffer system have ability to support events' walltime record.
We can make ftrace display walltime in output as:
 # cd /mnt/debugfs/tracing/
 # echo sched_switch > current_tracer
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             bash-2457  [000]   214.895446:   2457:120:S   + [000]  2457:120:S bash
             bash-2457  [000]   214.898298:   2457:120:R   + [000]     5:115:S events/0
             bash-2457  [000]   214.898684:   2457:120:R ==> [000]     5:115:R events/0
         events/0-5     [000]   214.899458:      5:115:R   + [000]  2455:120:S sshd
         events/0-5     [000]   214.899495:      5:115:S ==> [000]  2455:120:R sshd
 ...
 # echo 1 > options/walltime
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#                  TIMESTAMP  FUNCTION
 #              | |       |                        |         |
           <idle>-0     [000] 2009-07-25 18:01:00.848468:      0:140:R ==> [000]     5:115:R events/0
         events/0-5     [000] 2009-07-25 18:01:00.848523:      5:115:S ==> [000]     0:140:R <idle>
           <idle>-0     [000] 2009-07-25 18:01:01.613479:      0:140:R   + [000]  2455:120:S sshd
           <idle>-0     [000] 2009-07-25 18:01:01.613678:      0:140:R ==> [000]  2455:120:R sshd
             sshd-2455  [000] 2009-07-25 18:01:01.614015:   2455:120:S   + [000]  2455:120:S sshd
 ...

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 kernel/trace/trace.c        |   14 ++++++++++++--
 kernel/trace/trace.h        |    1 +
 kernel/trace/trace_output.c |   21 +++++++++++++++++----
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index af25bec..4e360cf 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -326,6 +326,7 @@ static const char *trace_options[] = {
 	"global-clock",
 	"sleep-time",
 	"graph-time",
+	"walltime",
 	NULL
 };
 
@@ -1601,8 +1602,17 @@ static void print_lat_help_header(struct seq_file *m)
 
 static void print_func_help_header(struct seq_file *m)
 {
-	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
-	seq_puts(m, "#              | |       |          |         |\n");
+	if (trace_flags & TRACE_ITER_WALLTIME) {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"                 TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"                     |         |\n");
+	} else {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"   TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"       |         |\n");
+	}
 }
 
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index fb3dcc8..0860b3b 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -664,6 +664,7 @@ enum trace_iterator_flags {
 	TRACE_ITER_GLOBAL_CLK		= 0x80000,
 	TRACE_ITER_SLEEP_TIME		= 0x100000,
 	TRACE_ITER_GRAPH_TIME		= 0x200000,
+	TRACE_ITER_WALLTIME		= 0x400000, /* Print walltime */
 };
 
 /*
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index f5c7f0c..3a52bb5 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -501,15 +501,28 @@ int trace_print_context(struct trace_iterator *iter)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent;
-	unsigned long long t = ns2usecs(iter->ts);
-	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
-	unsigned long secs = (unsigned long)t;
 	char comm[TASK_COMM_LEN];
 
 	trace_find_cmdline(entry->pid, comm);
 
-	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
+	if (trace_flags & TRACE_ITER_WALLTIME) {
+		struct tm tm;
+		localtime_r(iter->timespec.tv_sec, &tm);
+		return trace_seq_printf(s, "%16s-%-5d [%03d] "
+					"%04u-%02u-%02u "
+					"%02u:%02u:%02u.%06llu: ",
+					comm, entry->pid, iter->cpu,
+					tm.tm_year + 1900, tm.tm_mon + 1,
+					tm.tm_mday, tm.tm_hour, tm.tm_min,
+					tm.tm_sec,
+					ns2usecs(iter->timespec.tv_nsec));
+	} else {
+		unsigned long long t = ns2usecs(iter->ts);
+		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
+		unsigned long secs = (unsigned long)t;
+		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
 				comm, entry->pid, iter->cpu, secs, usec_rem);
+	}
 }
 
 int trace_print_lat_context(struct trace_iterator *iter)
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH] Add walltime support for ring-buffer
  2009-07-24 10:42 ` [PATCH] " Zhaolei
@ 2009-07-24 10:47   ` Zhaolei
  0 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-07-24 10:47 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Frederic Weisbecker, Ingo Molnar, LKML

Zhaolei wrote:
> Note:
> It is only a prototype patch to for demostrate what is result
> looks like.
> There still have many works to do(bug to fix) before apply.
> 
Hello,

I forgot add [RFC 1/3] in title, and resend it with correct title.
Please ignore this mail, sorry.

Thanks
Zhaolei


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-07-24 10:40 [PATCH 0/3] Add walltime support for ring-buffer Zhaolei
                   ` (3 preceding siblings ...)
  2009-07-24 10:44 ` [RFC PATCH 3/3] Make ftrace display walltime in output Zhaolei
@ 2009-07-24 13:05 ` Steven Rostedt
  2009-07-28  1:43   ` KOSAKI Motohiro
  4 siblings, 1 reply; 46+ messages in thread
From: Steven Rostedt @ 2009-07-24 13:05 UTC (permalink / raw)
  To: Zhaolei; +Cc: Frederic Weisbecker, Ingo Molnar, LKML


On Fri, 24 Jul 2009, Zhaolei wrote:

> Hello, Steven
> 
> It is a prototype code to make ftrace display walltime of events.
> 
> It need to applys on my patch of:
> [PATCH v3 1/2] Add function to convert between calendar time and broken-down time for universal use
> 
> This code can run but need to be fixed, it only for demostrate what is result
> looks like.
> 
> I think display walltime is valuable for ftrace, it can provide more
> recognizable information to system manager that he can known "when"
> event happened.
> And it is also helpful for flight-recorder we are doing now, we can get
> walltime of events after panic for analyse.
> 
> We can get following result on this patch:
>  # cd /mnt/debugfs/tracing/
>  # echo sched_switch > current_tracer
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
>  #              | |       |          |         |
>              bash-2457  [000]   214.895446:   2457:120:S   + [000]  2457:120:S bash
>              bash-2457  [000]   214.898298:   2457:120:R   + [000]     5:115:S events/0
>              bash-2457  [000]   214.898684:   2457:120:R ==> [000]     5:115:R events/0
>          events/0-5     [000]   214.899458:      5:115:R   + [000]  2455:120:S sshd
>          events/0-5     [000]   214.899495:      5:115:S ==> [000]  2455:120:R sshd
>  ...
>  # echo 1 > options/walltime
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#                  TIMESTAMP  FUNCTION
>  #              | |       |                        |         |
>            <idle>-0     [000] 2009-07-25 18:01:00.848468:      0:140:R ==> [000]     5:115:R events/0
>          events/0-5     [000] 2009-07-25 18:01:00.848523:      5:115:S ==> [000]     0:140:R <idle>
>            <idle>-0     [000] 2009-07-25 18:01:01.613479:      0:140:R   + [000]  2455:120:S sshd
>            <idle>-0     [000] 2009-07-25 18:01:01.613678:      0:140:R ==> [000]  2455:120:R sshd
>              sshd-2455  [000] 2009-07-25 18:01:01.614015:   2455:120:S   + [000]  2455:120:S sshd
>  ...
> 
> This patch is prototype because we have following things to do:
> 1: function to get walltime is not near rb_time_stamp()
>    So walltime is not absolutely synchronize with trace time.
>    But if we place code to get walltime near rb_time_stamp(), it will be called
>    on every reserve-event and will make system slow.
> 2: We can't call getnstimeofday() freely in ring-buffer's code, because
>    it is possibility already hold xtime_lock. Prototype code used a ugly way
>    to get gid of this problem, maybe we need to ftrace_disable_cpu() before
>    every write_seqlock() instead.
> 3: People maybe change walltime when tracing, but current ring-buffer page's
>    walltime can not reflect this change. We need to add tracepoints to watch
>    this change, and add a walltime-change event(already reserved in
>    ring-buffer.c) to record this change.
> 
> I'll continue to finish remain steps, what is your opinion on it?

Hi Zhaolei,

Could you not just save the wall time in some 64bit format within the 
current timestamp?  Right now there's three clocks that can be used by 
ftrace. Two that you can really get two (I need to add access to the 
third).  The default clock is sched_clock. But there's an option to get to 
global_clock:

 debug/tracing/options/global-clock

# echo 1 > /debug/tracing/options/global-clock

Will enable the global clock. Perhaps we can add a "wall clock" version 
that will map the timestamps to the wall clock?

I really hate adding any more fields to the ring buffer headers. That 
takes away from the amount that you can record per page.

-- Steve


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-07-24 13:05 ` [PATCH 0/3] Add walltime support for ring-buffer Steven Rostedt
@ 2009-07-28  1:43   ` KOSAKI Motohiro
  2009-07-28  1:53     ` Frederic Weisbecker
  0 siblings, 1 reply; 46+ messages in thread
From: KOSAKI Motohiro @ 2009-07-28  1:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: kosaki.motohiro, Zhaolei, Frederic Weisbecker, Ingo Molnar, LKML

Hi Zhaolei,

> Hi Zhaolei,
> 
> Could you not just save the wall time in some 64bit format within the 
> current timestamp?  Right now there's three clocks that can be used by 
> ftrace. Two that you can really get two (I need to add access to the 
> third).  The default clock is sched_clock. But there's an option to get to 
> global_clock:
> 
>  debug/tracing/options/global-clock
> 
> # echo 1 > /debug/tracing/options/global-clock
> 
> Will enable the global clock. Perhaps we can add a "wall clock" version 
> that will map the timestamps to the wall clock?
> 
> I really hate adding any more fields to the ring buffer headers. That 
> takes away from the amount that you can record per page.

I think wall-time recording on every event is too slow. slower tracer
isn't useful.
Instead, Can we make pseudo wall-time recording event periodically?

In nowadays, many workload run on cluster machine environment. Then
We often need compare different machines tracing log. it mean TSC isn't
sufficient. but nobody want performance reduce.
Fortunatelly, ntp time adjustment is not happend so frequently. we
don't need wall-time on _every_ event.

Thanks.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-07-28  1:43   ` KOSAKI Motohiro
@ 2009-07-28  1:53     ` Frederic Weisbecker
  2009-07-28  2:19       ` Steven Rostedt
  2009-07-28  2:23       ` [PATCH 0/3] Add walltime support for ring-buffer KOSAKI Motohiro
  0 siblings, 2 replies; 46+ messages in thread
From: Frederic Weisbecker @ 2009-07-28  1:53 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Steven Rostedt, Zhaolei, Ingo Molnar, LKML

On Tue, Jul 28, 2009 at 10:43:11AM +0900, KOSAKI Motohiro wrote:
> Hi Zhaolei,
> 
> > Hi Zhaolei,
> > 
> > Could you not just save the wall time in some 64bit format within the 
> > current timestamp?  Right now there's three clocks that can be used by 
> > ftrace. Two that you can really get two (I need to add access to the 
> > third).  The default clock is sched_clock. But there's an option to get to 
> > global_clock:
> > 
> >  debug/tracing/options/global-clock
> > 
> > # echo 1 > /debug/tracing/options/global-clock
> > 
> > Will enable the global clock. Perhaps we can add a "wall clock" version 
> > that will map the timestamps to the wall clock?
> > 
> > I really hate adding any more fields to the ring buffer headers. That 
> > takes away from the amount that you can record per page.
> 
> I think wall-time recording on every event is too slow. slower tracer
> isn't useful.
> Instead, Can we make pseudo wall-time recording event periodically?
> 
> In nowadays, many workload run on cluster machine environment. Then
> We often need compare different machines tracing log. it mean TSC isn't
> sufficient. but nobody want performance reduce.
> Fortunatelly, ntp time adjustment is not happend so frequently. we
> don't need wall-time on _every_ event.
> 
> Thanks.


Yeah, that would really slow the tracing.

Can't we get a snapshot of the couple (walltime, cpu clock) once
at startup.
Then we can retrieve the walltime when cpu_clock was 0 and compute
the walltime for each traces at output time using the trace timestamp as
a delta?

Thanks,
Frederic.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-07-28  1:53     ` Frederic Weisbecker
@ 2009-07-28  2:19       ` Steven Rostedt
  2009-08-17  9:22         ` [RFC PATCH] Add timer-source of walltime for ftrace Zhaolei
  2009-07-28  2:23       ` [PATCH 0/3] Add walltime support for ring-buffer KOSAKI Motohiro
  1 sibling, 1 reply; 46+ messages in thread
From: Steven Rostedt @ 2009-07-28  2:19 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: KOSAKI Motohiro, Zhaolei, Ingo Molnar, LKML


On Tue, 28 Jul 2009, Frederic Weisbecker wrote:
> > 
> > In nowadays, many workload run on cluster machine environment. Then
> > We often need compare different machines tracing log. it mean TSC isn't
> > sufficient. but nobody want performance reduce.
> > Fortunatelly, ntp time adjustment is not happend so frequently. we
> > don't need wall-time on _every_ event.
> > 
> > Thanks.
> 
> 
> Yeah, that would really slow the tracing.
> 
> Can't we get a snapshot of the couple (walltime, cpu clock) once
> at startup.
> Then we can retrieve the walltime when cpu_clock was 0 and compute
> the walltime for each traces at output time using the trace timestamp as
> a delta?

This should be something in one of the trace clock sources. It can perhaps 
have a helper function called by gettimeofday or something, but it should 
be something that gets plugged into the ring buffer and not something that 
needs to modify any of its code.

-- Steve


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-07-28  1:53     ` Frederic Weisbecker
  2009-07-28  2:19       ` Steven Rostedt
@ 2009-07-28  2:23       ` KOSAKI Motohiro
  2009-08-03  7:22         ` Ingo Molnar
  1 sibling, 1 reply; 46+ messages in thread
From: KOSAKI Motohiro @ 2009-07-28  2:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: kosaki.motohiro, Steven Rostedt, Zhaolei, Ingo Molnar, LKML

> On Tue, Jul 28, 2009 at 10:43:11AM +0900, KOSAKI Motohiro wrote:
> > Hi Zhaolei,
> > 
> > > Hi Zhaolei,
> > > 
> > > Could you not just save the wall time in some 64bit format within the 
> > > current timestamp?  Right now there's three clocks that can be used by 
> > > ftrace. Two that you can really get two (I need to add access to the 
> > > third).  The default clock is sched_clock. But there's an option to get to 
> > > global_clock:
> > > 
> > >  debug/tracing/options/global-clock
> > > 
> > > # echo 1 > /debug/tracing/options/global-clock
> > > 
> > > Will enable the global clock. Perhaps we can add a "wall clock" version 
> > > that will map the timestamps to the wall clock?
> > > 
> > > I really hate adding any more fields to the ring buffer headers. That 
> > > takes away from the amount that you can record per page.
> > 
> > I think wall-time recording on every event is too slow. slower tracer
> > isn't useful.
> > Instead, Can we make pseudo wall-time recording event periodically?
> > 
> > In nowadays, many workload run on cluster machine environment. Then
> > We often need compare different machines tracing log. it mean TSC isn't
> > sufficient. but nobody want performance reduce.
> > Fortunatelly, ntp time adjustment is not happend so frequently. we
> > don't need wall-time on _every_ event.
> > 
> > Thanks.
> 
> 
> Yeah, that would really slow the tracing.
> 
> Can't we get a snapshot of the couple (walltime, cpu clock) once
> at startup.
> Then we can retrieve the walltime when cpu_clock was 0 and compute
> the walltime for each traces at output time using the trace timestamp as
> a delta?


It can. but I don't think it is sufficient.
On flight recorder use-case, The tracing run more than one year. And,
Administrator only see last N sec log. (previous logs were overwritten
by ring-buffer go-around)

Thanks.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-07-28  2:23       ` [PATCH 0/3] Add walltime support for ring-buffer KOSAKI Motohiro
@ 2009-08-03  7:22         ` Ingo Molnar
  2009-08-03  9:32           ` KOSAKI Motohiro
  0 siblings, 1 reply; 46+ messages in thread
From: Ingo Molnar @ 2009-08-03  7:22 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Frederic Weisbecker, Steven Rostedt, Zhaolei, LKML


* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> It can. but I don't think it is sufficient. On flight recorder 
> use-case, The tracing run more than one year. And, Administrator 
> only see last N sec log. (previous logs were overwritten by 
> ring-buffer go-around)

That's definitely something we want to enable ASAP.

I'm wondering why a single u64 timestamp is not enough to express 
the full, absolute range of time elapsed since bootup, in 
nanoseconds.

That would make walltime a matter of pretty-printing only - we'd 
have to convert the u64 nsec timestamp into a walltime format, 
right?

In fact we could change all the timestamps to be standardized along 
'nanoseconds elapsed since 1970' or so - not nanoseconds since the 
last bootup. That still fits just fine within 64 bits - u64 
nanoseconds has a scope of 500+ years.

	Ingo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-08-03  7:22         ` Ingo Molnar
@ 2009-08-03  9:32           ` KOSAKI Motohiro
  2009-08-04 14:38             ` Ingo Molnar
  0 siblings, 1 reply; 46+ messages in thread
From: KOSAKI Motohiro @ 2009-08-03  9:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: kosaki.motohiro, Frederic Weisbecker, Steven Rostedt, Zhaolei, LKML

> 
> * KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> 
> > It can. but I don't think it is sufficient. On flight recorder 
> > use-case, The tracing run more than one year. And, Administrator 
> > only see last N sec log. (previous logs were overwritten by 
> > ring-buffer go-around)
> 
> That's definitely something we want to enable ASAP.
> 
> I'm wondering why a single u64 timestamp is not enough to express 
> the full, absolute range of time elapsed since bootup, in 
> nanoseconds.
> 
> That would make walltime a matter of pretty-printing only - we'd 
> have to convert the u64 nsec timestamp into a walltime format, 
> right?
> 
> In fact we could change all the timestamps to be standardized along 
> 'nanoseconds elapsed since 1970' or so - not nanoseconds since the 
> last bootup. That still fits just fine within 64 bits - u64 
> nanoseconds has a scope of 500+ years.

Ah, you are right. thanks correct me.
but I think ntp updating time storing is necessary too.




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/3] Add walltime support for ring-buffer
  2009-08-03  9:32           ` KOSAKI Motohiro
@ 2009-08-04 14:38             ` Ingo Molnar
  0 siblings, 0 replies; 46+ messages in thread
From: Ingo Molnar @ 2009-08-04 14:38 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Frederic Weisbecker, Steven Rostedt, Zhaolei, LKML


* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> > 
> > * KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > 
> > > It can. but I don't think it is sufficient. On flight recorder 
> > > use-case, The tracing run more than one year. And, Administrator 
> > > only see last N sec log. (previous logs were overwritten by 
> > > ring-buffer go-around)
> > 
> > That's definitely something we want to enable ASAP.
> > 
> > I'm wondering why a single u64 timestamp is not enough to express 
> > the full, absolute range of time elapsed since bootup, in 
> > nanoseconds.
> > 
> > That would make walltime a matter of pretty-printing only - we'd 
> > have to convert the u64 nsec timestamp into a walltime format, 
> > right?
> > 
> > In fact we could change all the timestamps to be standardized along 
> > 'nanoseconds elapsed since 1970' or so - not nanoseconds since the 
> > last bootup. That still fits just fine within 64 bits - u64 
> > nanoseconds has a scope of 500+ years.
> 
> Ah, you are right. thanks correct me. but I think ntp updating 
> time storing is necessary too.

Yeah. As long as we only save a single u64 into the trace it should 
all be fine.

There could be periodic samples of <walltime, sched_clock()> saved 
on a per cpu basis, and thus converting sched_clock() values into 
walltime unit?

	Ingo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [RFC PATCH] Add timer-source of walltime for ftrace
  2009-07-28  2:19       ` Steven Rostedt
@ 2009-08-17  9:22         ` Zhaolei
  2009-08-17 16:49           ` Frederic Weisbecker
  2009-08-18 15:57           ` KOSAKI Motohiro
  0 siblings, 2 replies; 46+ messages in thread
From: Zhaolei @ 2009-08-17  9:22 UTC (permalink / raw)
  To: Steven Rostedt, Frederic Weisbecker, KOSAKI Motohiro, Ingo Molnar; +Cc: LKML

We can add a timer-source for ftrace to support wall-time display.
It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
http://lkml.org/lkml/2009/7/24/103

It needs to be applied on my patch of:
[PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
and
[PATCH v2 2/2] ftrace: Unify effect of writing to trace_options and option/*

We can get following output:
 # cd /mnt/debugfs/tracing/
 # ls options/*clock
 options/global-clock  options/local-clock  options/walltime-clock
 # echo sched_switch > current_tracer
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             bash-3560  [000]   694.242386:   3560:120:S   + [000]  3560:120:S bash
             bash-3560  [000]   694.242476:   3560:120:S   + [000]  3560:120:S bash
             bash-3560  [000]   694.242865:   3560:120:R   + [000]     5:115:S events/0
             bash-3560  [000]   694.242912:   3560:120:R ==> [000]     5:115:R events/0
         events/0-5     [000]   694.243015:      5:115:R   + [000]  3558:120:S sshd
         events/0-5     [000]   694.243040:      5:115:S ==> [000]  3560:120:R bash
             bash-3560  [000]   694.243312:   3560:120:S   + [000]  3560:120:S bash
             bash-3560  [000]   694.243374:   3560:120:S ==> [000]  3558:120:R sshd
 ...
 # echo 1 > options/walltime-clock
 # echo > trace
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
 #              | |       |                 |         |
             bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
             bash-3560  [000] 2009-08-17 23:58:22   3560:120:R   + [000]     5:115:S events/0
             bash-3560  [000] 2009-08-17 23:58:22   3560:120:R ==> [000]     5:115:R events/0
         events/0-5     [000] 2009-08-17 23:58:22      5:115:R   + [000]  3558:120:S sshd
         events/0-5     [000] 2009-08-17 23:58:22      5:115:S ==> [000]  3558:120:R sshd
             sshd-3558  [000] 2009-08-17 23:58:22   3558:120:S ==> [000]  3560:120:R bash
             bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
             bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
 ...

Note:
It is only a prototype patch to for demostrate what is result looks like.
There still have many works to do(bug to fix) before apply.
For example, we need to add a hook(tracepoint) of xtime change event,
so we can update walltime's clock base then.
And maybe we need clear all events when changing clock-source, because old
clock value can't displayed correctly in new clock's format.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/linux/trace_clock.h |    1 +
 kernel/trace/trace.c        |   55 ++++++++++++++++++++++++++++++------------
 kernel/trace/trace.h        |    6 +++-
 kernel/trace/trace_clock.c  |   47 ++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_output.c |   20 ++++++++++++---
 5 files changed, 107 insertions(+), 22 deletions(-)

diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
index 7a81303..322707e 100644
--- a/include/linux/trace_clock.h
+++ b/include/linux/trace_clock.h
@@ -15,5 +15,6 @@
 extern u64 notrace trace_clock_local(void);
 extern u64 notrace trace_clock(void);
 extern u64 notrace trace_clock_global(void);
+extern u64 notrace trace_clock_walltime(void);
 
 #endif /* _LINUX_TRACE_CLOCK_H */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b7d873b..4474f7d 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -263,8 +263,8 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
 
 /* trace_flags holds trace_options default values */
 unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
-	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
-	TRACE_ITER_GRAPH_TIME;
+	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_LOCAL_CLK |
+	TRACE_ITER_SLEEP_TIME | TRACE_ITER_GRAPH_TIME;
 
 /**
  * trace_wake_up - wake up tasks waiting for trace input
@@ -324,6 +324,8 @@ static const char *trace_options[] = {
 	"context-info",
 	"latency-format",
 	"global-clock",
+	"local-clock",
+	"walltime-clock",
 	"sleep-time",
 	"graph-time",
 	NULL
@@ -1423,8 +1425,17 @@ static void print_lat_help_header(struct seq_file *m)
 
 static void print_func_help_header(struct seq_file *m)
 {
-	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
-	seq_puts(m, "#              | |       |          |         |\n");
+	if (trace_flags & TRACE_ITER_WALLTIME_CLK) {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"          TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"              |         |\n");
+	} else {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"   TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"       |         |\n");
+	}
 }
 
 
@@ -2151,30 +2162,42 @@ static int set_tracer_option(struct tracer *trace, char *cmp, int neg)
 
 static void set_tracer_flags(unsigned int mask, int enabled)
 {
+	u64 (*clkfunc)(void) = NULL;
+
 	/* do nothing if flag is already set */
 	if (!!(trace_flags & mask) == !!enabled)
 		return;
 
-	if (enabled)
-		trace_flags |= mask;
-	else
-		trace_flags &= ~mask;
+	if (mask == TRACE_ITER_GLOBAL_CLK)
+		clkfunc = trace_clock_global;
+	if (mask == TRACE_ITER_LOCAL_CLK)
+		clkfunc = trace_clock_local;
+	if (mask == TRACE_ITER_WALLTIME_CLK)
+		clkfunc = trace_clock_walltime;
 
-	if (mask == TRACE_ITER_GLOBAL_CLK) {
-		u64 (*func)(void);
+	if (clkfunc) {
+		if (!enabled)
+			return;
 
-		if (enabled)
-			func = trace_clock_global;
-		else
-			func = trace_clock_local;
+		trace_flags &= ~TRACE_ITER_GLOBAL_CLK;
+		trace_flags &= ~TRACE_ITER_LOCAL_CLK;
+		trace_flags &= ~TRACE_ITER_WALLTIME_CLK;
+		trace_flags |= mask;
 
 		mutex_lock(&trace_types_lock);
-		ring_buffer_set_clock(global_trace.buffer, func);
+		ring_buffer_set_clock(global_trace.buffer, clkfunc);
 
 		if (max_tr.buffer)
-			ring_buffer_set_clock(max_tr.buffer, func);
+			ring_buffer_set_clock(max_tr.buffer, clkfunc);
 		mutex_unlock(&trace_types_lock);
+
+		return;
 	}
+
+	if (enabled)
+		trace_flags |= mask;
+	else
+		trace_flags &= ~mask;
 }
 
 static ssize_t
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 606073c..02ffa4c 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -675,8 +675,10 @@ enum trace_iterator_flags {
 	TRACE_ITER_CONTEXT_INFO		= 0x20000, /* Print pid/cpu/time */
 	TRACE_ITER_LATENCY_FMT		= 0x40000,
 	TRACE_ITER_GLOBAL_CLK		= 0x80000,
-	TRACE_ITER_SLEEP_TIME		= 0x100000,
-	TRACE_ITER_GRAPH_TIME		= 0x200000,
+	TRACE_ITER_LOCAL_CLK		= 0x100000,
+	TRACE_ITER_WALLTIME_CLK		= 0x200000,
+	TRACE_ITER_SLEEP_TIME		= 0x400000,
+	TRACE_ITER_GRAPH_TIME		= 0x800000,
 };
 
 /*
diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index b588fd8..53ced3e 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -107,3 +107,50 @@ u64 notrace trace_clock_global(void)
 
 	return now;
 }
+
+/* value of time_t when traceclock is 0 */
+static struct timespec base_walltime;
+
+u64 notrace trace_clock_walltime(void)
+{
+	u32 remainder;
+	return base_walltime.tv_sec + div_u64_rem(
+			trace_clock() + base_walltime.tv_nsec,
+			NSEC_PER_SEC, &remainder);
+}
+
+static int __init walltime_init(void)
+{
+	unsigned long flags;
+	struct timespec ts;
+	u64 traceclock;
+	u32 remainder;
+
+	local_irq_save(flags);
+
+	getnstimeofday(&ts);
+	traceclock = trace_clock();
+
+	if (traceclock > ts.tv_nsec) {
+		base_walltime.tv_sec = ts.tv_sec - div_u64_rem(
+				traceclock - ts.tv_nsec + NSEC_PER_SEC - 1,
+				NSEC_PER_SEC, &remainder);
+
+		/*
+		 * base_walltime.tv_nsec =
+		 * 	(NSEC_PER_SEC - (traceclock - ts.tv_nsec) %
+		 * 	NSEC_PER_SEC) % NSEC_PER_SEC;
+		 */
+		div_u64_rem(traceclock - ts.tv_nsec, NSEC_PER_SEC, &remainder);
+		base_walltime.tv_nsec = remainder ?
+					NSEC_PER_SEC - remainder : 0;
+	} else {
+		base_walltime.tv_sec = ts.tv_sec;
+		base_walltime.tv_nsec = ts.tv_nsec - traceclock;
+	}
+
+	local_irq_restore(flags);
+	return 0;
+}
+
+device_initcall_sync(walltime_init);
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index e0c2545..0419b37 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -501,15 +501,27 @@ int trace_print_context(struct trace_iterator *iter)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent;
-	unsigned long long t = ns2usecs(iter->ts);
-	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
-	unsigned long secs = (unsigned long)t;
 	char comm[TASK_COMM_LEN];
 
 	trace_find_cmdline(entry->pid, comm);
 
-	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
+	if (trace_flags & TRACE_ITER_WALLTIME_CLK) {
+		struct tm tm;
+		time_to_tm(iter->ts, 0, &tm);
+		return trace_seq_printf(s, "%16s-%-5d [%03d] "
+					"%04ld-%02d-%02d "
+					"%02d:%02d:%02d ",
+					comm, entry->pid, iter->cpu,
+					tm.tm_year + 1900, tm.tm_mon + 1,
+					tm.tm_mday, tm.tm_hour, tm.tm_min,
+					tm.tm_sec);
+	} else {
+		unsigned long long t = ns2usecs(iter->ts);
+		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
+		unsigned long secs = (unsigned long)t;
+		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
 				comm, entry->pid, iter->cpu, secs, usec_rem);
+	}
 }
 
 int trace_print_lat_context(struct trace_iterator *iter)
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [RFC PATCH] Add timer-source of walltime for ftrace
  2009-08-17  9:22         ` [RFC PATCH] Add timer-source of walltime for ftrace Zhaolei
@ 2009-08-17 16:49           ` Frederic Weisbecker
  2009-08-18  2:09             ` Zhaolei
  2009-08-18 15:57           ` KOSAKI Motohiro
  1 sibling, 1 reply; 46+ messages in thread
From: Frederic Weisbecker @ 2009-08-17 16:49 UTC (permalink / raw)
  To: Zhaolei; +Cc: Steven Rostedt, KOSAKI Motohiro, Ingo Molnar, LKML

On Mon, Aug 17, 2009 at 05:22:04PM +0800, Zhaolei wrote:
> We can add a timer-source for ftrace to support wall-time display.
> It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
> http://lkml.org/lkml/2009/7/24/103
> 
> It needs to be applied on my patch of:
> [PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
> and
> [PATCH v2 2/2] ftrace: Unify effect of writing to trace_options and option/*
> 
> We can get following output:
>  # cd /mnt/debugfs/tracing/
>  # ls options/*clock
>  options/global-clock  options/local-clock  options/walltime-clock
>  # echo sched_switch > current_tracer
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
>  #              | |       |          |         |
>              bash-3560  [000]   694.242386:   3560:120:S   + [000]  3560:120:S bash
>              bash-3560  [000]   694.242476:   3560:120:S   + [000]  3560:120:S bash
>              bash-3560  [000]   694.242865:   3560:120:R   + [000]     5:115:S events/0
>              bash-3560  [000]   694.242912:   3560:120:R ==> [000]     5:115:R events/0
>          events/0-5     [000]   694.243015:      5:115:R   + [000]  3558:120:S sshd
>          events/0-5     [000]   694.243040:      5:115:S ==> [000]  3560:120:R bash
>              bash-3560  [000]   694.243312:   3560:120:S   + [000]  3560:120:S bash
>              bash-3560  [000]   694.243374:   3560:120:S ==> [000]  3558:120:R sshd
>  ...
>  # echo 1 > options/walltime-clock
>  # echo > trace
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
>  #              | |       |                 |         |
>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:R   + [000]     5:115:S events/0
>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:R ==> [000]     5:115:R events/0
>          events/0-5     [000] 2009-08-17 23:58:22      5:115:R   + [000]  3558:120:S sshd
>          events/0-5     [000] 2009-08-17 23:58:22      5:115:S ==> [000]  3558:120:R sshd
>              sshd-3558  [000] 2009-08-17 23:58:22   3558:120:S ==> [000]  3560:120:R bash
>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
>  ...
> 
> Note:
> It is only a prototype patch to for demostrate what is result looks like.
> There still have many works to do(bug to fix) before apply.
> For example, we need to add a hook(tracepoint) of xtime change event,
> so we can update walltime's clock base then.
> And maybe we need clear all events when changing clock-source, because old
> clock value can't displayed correctly in new clock's format.
> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> ---
>  include/linux/trace_clock.h |    1 +
>  kernel/trace/trace.c        |   55 ++++++++++++++++++++++++++++++------------
>  kernel/trace/trace.h        |    6 +++-
>  kernel/trace/trace_clock.c  |   47 ++++++++++++++++++++++++++++++++++++
>  kernel/trace/trace_output.c |   20 ++++++++++++---
>  5 files changed, 107 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
> index 7a81303..322707e 100644
> --- a/include/linux/trace_clock.h
> +++ b/include/linux/trace_clock.h
> @@ -15,5 +15,6 @@
>  extern u64 notrace trace_clock_local(void);
>  extern u64 notrace trace_clock(void);
>  extern u64 notrace trace_clock_global(void);
> +extern u64 notrace trace_clock_walltime(void);
>  
>  #endif /* _LINUX_TRACE_CLOCK_H */
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index b7d873b..4474f7d 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -263,8 +263,8 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
>  
>  /* trace_flags holds trace_options default values */
>  unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
> -	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
> -	TRACE_ITER_GRAPH_TIME;
> +	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_LOCAL_CLK |
> +	TRACE_ITER_SLEEP_TIME | TRACE_ITER_GRAPH_TIME;
>  
>  /**
>   * trace_wake_up - wake up tasks waiting for trace input
> @@ -324,6 +324,8 @@ static const char *trace_options[] = {
>  	"context-info",
>  	"latency-format",
>  	"global-clock",
> +	"local-clock",
> +	"walltime-clock",
>  	"sleep-time",
>  	"graph-time",
>  	NULL
> @@ -1423,8 +1425,17 @@ static void print_lat_help_header(struct seq_file *m)
>  
>  static void print_func_help_header(struct seq_file *m)
>  {
> -	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
> -	seq_puts(m, "#              | |       |          |         |\n");
> +	if (trace_flags & TRACE_ITER_WALLTIME_CLK) {
> +		seq_puts(m, "#           TASK-PID    CPU# "
> +			"          TIMESTAMP  FUNCTION\n");
> +		seq_puts(m, "#              | |       |   "
> +			"              |         |\n");
> +	} else {
> +		seq_puts(m, "#           TASK-PID    CPU# "
> +			"   TIMESTAMP  FUNCTION\n");
> +		seq_puts(m, "#              | |       |   "
> +			"       |         |\n");
> +	}
>  }
>  
>  
> @@ -2151,30 +2162,42 @@ static int set_tracer_option(struct tracer *trace, char *cmp, int neg)
>  
>  static void set_tracer_flags(unsigned int mask, int enabled)
>  {
> +	u64 (*clkfunc)(void) = NULL;
> +
>  	/* do nothing if flag is already set */
>  	if (!!(trace_flags & mask) == !!enabled)
>  		return;
>  
> -	if (enabled)
> -		trace_flags |= mask;
> -	else
> -		trace_flags &= ~mask;
> +	if (mask == TRACE_ITER_GLOBAL_CLK)
> +		clkfunc = trace_clock_global;
> +	if (mask == TRACE_ITER_LOCAL_CLK)
> +		clkfunc = trace_clock_local;
> +	if (mask == TRACE_ITER_WALLTIME_CLK)
> +		clkfunc = trace_clock_walltime;
>  
> -	if (mask == TRACE_ITER_GLOBAL_CLK) {
> -		u64 (*func)(void);
> +	if (clkfunc) {
> +		if (!enabled)
> +			return;



If the user set a clock option file to 0, then it means he
was expecting something, probably reset the default clock
which is clock_local.

But what if the user set 0 to the local_clock...hmm..



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [RFC PATCH] Add timer-source of walltime for ftrace
  2009-08-17 16:49           ` Frederic Weisbecker
@ 2009-08-18  2:09             ` Zhaolei
  2009-08-18 18:52               ` Steven Rostedt
  0 siblings, 1 reply; 46+ messages in thread
From: Zhaolei @ 2009-08-18  2:09 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Steven Rostedt, KOSAKI Motohiro, Ingo Molnar, LKML

Frederic Weisbecker wrote:
> On Mon, Aug 17, 2009 at 05:22:04PM +0800, Zhaolei wrote:
>> We can add a timer-source for ftrace to support wall-time display.
>> It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
>> http://lkml.org/lkml/2009/7/24/103
>>
>> It needs to be applied on my patch of:
>> [PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
>> and
>> [PATCH v2 2/2] ftrace: Unify effect of writing to trace_options and option/*
>>
>> We can get following output:
>>  # cd /mnt/debugfs/tracing/
>>  # ls options/*clock
>>  options/global-clock  options/local-clock  options/walltime-clock
>>  # echo sched_switch > current_tracer
>>  # cat trace
>>  # tracer: sched_switch
>>  #
>>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
>>  #              | |       |          |         |
>>              bash-3560  [000]   694.242386:   3560:120:S   + [000]  3560:120:S bash
>>              bash-3560  [000]   694.242476:   3560:120:S   + [000]  3560:120:S bash
>>              bash-3560  [000]   694.242865:   3560:120:R   + [000]     5:115:S events/0
>>              bash-3560  [000]   694.242912:   3560:120:R ==> [000]     5:115:R events/0
>>          events/0-5     [000]   694.243015:      5:115:R   + [000]  3558:120:S sshd
>>          events/0-5     [000]   694.243040:      5:115:S ==> [000]  3560:120:R bash
>>              bash-3560  [000]   694.243312:   3560:120:S   + [000]  3560:120:S bash
>>              bash-3560  [000]   694.243374:   3560:120:S ==> [000]  3558:120:R sshd
>>  ...
>>  # echo 1 > options/walltime-clock
>>  # echo > trace
>>  # cat trace
>>  # tracer: sched_switch
>>  #
>>  #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
>>  #              | |       |                 |         |
>>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
>>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:R   + [000]     5:115:S events/0
>>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:R ==> [000]     5:115:R events/0
>>          events/0-5     [000] 2009-08-17 23:58:22      5:115:R   + [000]  3558:120:S sshd
>>          events/0-5     [000] 2009-08-17 23:58:22      5:115:S ==> [000]  3558:120:R sshd
>>              sshd-3558  [000] 2009-08-17 23:58:22   3558:120:S ==> [000]  3560:120:R bash
>>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
>>              bash-3560  [000] 2009-08-17 23:58:22   3560:120:S   + [000]  3560:120:S bash
>>  ...
>>
>> Note:
>> It is only a prototype patch to for demostrate what is result looks like.
>> There still have many works to do(bug to fix) before apply.
>> For example, we need to add a hook(tracepoint) of xtime change event,
>> so we can update walltime's clock base then.
>> And maybe we need clear all events when changing clock-source, because old
>> clock value can't displayed correctly in new clock's format.
>>
>> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
>> ---
>>  include/linux/trace_clock.h |    1 +
>>  kernel/trace/trace.c        |   55 ++++++++++++++++++++++++++++++------------
>>  kernel/trace/trace.h        |    6 +++-
>>  kernel/trace/trace_clock.c  |   47 ++++++++++++++++++++++++++++++++++++
>>  kernel/trace/trace_output.c |   20 ++++++++++++---
>>  5 files changed, 107 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
>> index 7a81303..322707e 100644
>> --- a/include/linux/trace_clock.h
>> +++ b/include/linux/trace_clock.h
>> @@ -15,5 +15,6 @@
>>  extern u64 notrace trace_clock_local(void);
>>  extern u64 notrace trace_clock(void);
>>  extern u64 notrace trace_clock_global(void);
>> +extern u64 notrace trace_clock_walltime(void);
>>  
>>  #endif /* _LINUX_TRACE_CLOCK_H */
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index b7d873b..4474f7d 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -263,8 +263,8 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
>>  
>>  /* trace_flags holds trace_options default values */
>>  unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
>> -	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
>> -	TRACE_ITER_GRAPH_TIME;
>> +	TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_LOCAL_CLK |
>> +	TRACE_ITER_SLEEP_TIME | TRACE_ITER_GRAPH_TIME;
>>  
>>  /**
>>   * trace_wake_up - wake up tasks waiting for trace input
>> @@ -324,6 +324,8 @@ static const char *trace_options[] = {
>>  	"context-info",
>>  	"latency-format",
>>  	"global-clock",
>> +	"local-clock",
>> +	"walltime-clock",
>>  	"sleep-time",
>>  	"graph-time",
>>  	NULL
>> @@ -1423,8 +1425,17 @@ static void print_lat_help_header(struct seq_file *m)
>>  
>>  static void print_func_help_header(struct seq_file *m)
>>  {
>> -	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
>> -	seq_puts(m, "#              | |       |          |         |\n");
>> +	if (trace_flags & TRACE_ITER_WALLTIME_CLK) {
>> +		seq_puts(m, "#           TASK-PID    CPU# "
>> +			"          TIMESTAMP  FUNCTION\n");
>> +		seq_puts(m, "#              | |       |   "
>> +			"              |         |\n");
>> +	} else {
>> +		seq_puts(m, "#           TASK-PID    CPU# "
>> +			"   TIMESTAMP  FUNCTION\n");
>> +		seq_puts(m, "#              | |       |   "
>> +			"       |         |\n");
>> +	}
>>  }
>>  
>>  
>> @@ -2151,30 +2162,42 @@ static int set_tracer_option(struct tracer *trace, char *cmp, int neg)
>>  
>>  static void set_tracer_flags(unsigned int mask, int enabled)
>>  {
>> +	u64 (*clkfunc)(void) = NULL;
>> +
>>  	/* do nothing if flag is already set */
>>  	if (!!(trace_flags & mask) == !!enabled)
>>  		return;
>>  
>> -	if (enabled)
>> -		trace_flags |= mask;
>> -	else
>> -		trace_flags &= ~mask;
>> +	if (mask == TRACE_ITER_GLOBAL_CLK)
>> +		clkfunc = trace_clock_global;
>> +	if (mask == TRACE_ITER_LOCAL_CLK)
>> +		clkfunc = trace_clock_local;
>> +	if (mask == TRACE_ITER_WALLTIME_CLK)
>> +		clkfunc = trace_clock_walltime;
>>  
>> -	if (mask == TRACE_ITER_GLOBAL_CLK) {
>> -		u64 (*func)(void);
>> +	if (clkfunc) {
>> +		if (!enabled)
>> +			return;
> 
> 
> 
> If the user set a clock option file to 0, then it means he
> was expecting something, probably reset the default clock
> which is clock_local.

Agree.

> But what if the user set 0 to the local_clock...hmm..

or make option like:
# echo wallclock > option/clock
# echo localclock > option/clock
is best way.
But it needs rewriting whole option source.

Maybe now, we can use a neutrality way as:
# echo 0 > option/wallclock  # reset to default clock(local clock)
# echo 0 > option/locakclock # no effect

Thanks
Zhaolei





^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC PATCH] Add timer-source of walltime for ftrace
  2009-08-17  9:22         ` [RFC PATCH] Add timer-source of walltime for ftrace Zhaolei
  2009-08-17 16:49           ` Frederic Weisbecker
@ 2009-08-18 15:57           ` KOSAKI Motohiro
  2009-08-18 18:58             ` Steven Rostedt
  1 sibling, 1 reply; 46+ messages in thread
From: KOSAKI Motohiro @ 2009-08-18 15:57 UTC (permalink / raw)
  To: Zhaolei
  Cc: kosaki.motohiro, Steven Rostedt, Frederic Weisbecker, Ingo Molnar, LKML

> +u64 notrace trace_clock_walltime(void)
> +{
> +	u32 remainder;
> +	return base_walltime.tv_sec + div_u64_rem(
> +			trace_clock() + base_walltime.tv_nsec,
> +			NSEC_PER_SEC, &remainder);
> +}

Why do we need calculate walltime at ring-buffer recordng phase?
IOW, Why can't we calculate this at displaing time?





^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [RFC PATCH] Add timer-source of walltime for ftrace
  2009-08-18  2:09             ` Zhaolei
@ 2009-08-18 18:52               ` Steven Rostedt
  0 siblings, 0 replies; 46+ messages in thread
From: Steven Rostedt @ 2009-08-18 18:52 UTC (permalink / raw)
  To: Zhaolei; +Cc: Frederic Weisbecker, KOSAKI Motohiro, Ingo Molnar, LKML


On Tue, 18 Aug 2009, Zhaolei wrote:

> 
> or make option like:
> # echo wallclock > option/clock
> # echo localclock > option/clock
> is best way.
> But it needs rewriting whole option source.
> 
> Maybe now, we can use a neutrality way as:
> # echo 0 > option/wallclock  # reset to default clock(local clock)
> # echo 0 > option/locakclock # no effect

Or we move it out of options and make it add a "clock_source" file. or 
something?

-- Steve


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC PATCH] Add timer-source of walltime for ftrace
  2009-08-18 15:57           ` KOSAKI Motohiro
@ 2009-08-18 18:58             ` Steven Rostedt
  2009-08-19  9:16               ` Zhaolei
  2009-08-25  8:12               ` [PATCH 0/3] ftrace: " Zhaolei
  0 siblings, 2 replies; 46+ messages in thread
From: Steven Rostedt @ 2009-08-18 18:58 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Zhaolei, Frederic Weisbecker, Ingo Molnar, LKML


On Wed, 19 Aug 2009, KOSAKI Motohiro wrote:

> > +u64 notrace trace_clock_walltime(void)
> > +{
> > +	u32 remainder;
> > +	return base_walltime.tv_sec + div_u64_rem(
> > +			trace_clock() + base_walltime.tv_nsec,
> > +			NSEC_PER_SEC, &remainder);
> > +}
> 
> Why do we need calculate walltime at ring-buffer recordng phase?
> IOW, Why can't we calculate this at displaing time?

Right! We should have something like this:

	return (u64)base_walltime.tv_sec * NSEC_PER_SEC +
		trace_clock() + base_walltime.tv_nsec;

And then we could divide it out later.

And then we could even show more fields in the output and not limit 
ourselves to seconds.

-- Steve


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [RFC PATCH] Add timer-source of walltime for ftrace
  2009-08-18 18:58             ` Steven Rostedt
@ 2009-08-19  9:16               ` Zhaolei
  2009-08-25  8:12               ` [PATCH 0/3] ftrace: " Zhaolei
  1 sibling, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-08-19  9:16 UTC (permalink / raw)
  To: Steven Rostedt, KOSAKI Motohiro; +Cc: Frederic Weisbecker, Ingo Molnar, LKML

Steven Rostedt wrote:
> On Wed, 19 Aug 2009, KOSAKI Motohiro wrote:
> 
>>> +u64 notrace trace_clock_walltime(void)
>>> +{
>>> +	u32 remainder;
>>> +	return base_walltime.tv_sec + div_u64_rem(
>>> +			trace_clock() + base_walltime.tv_nsec,
>>> +			NSEC_PER_SEC, &remainder);
>>> +}
>> Why do we need calculate walltime at ring-buffer recordng phase?
>> IOW, Why can't we calculate this at displaing time?
> 
> Right! We should have something like this:
> 
> 	return (u64)base_walltime.tv_sec * NSEC_PER_SEC +
> 		trace_clock() + base_walltime.tv_nsec;
> 
> And then we could divide it out later.
> 
> And then we could even show more fields in the output and not limit 
> ourselves to seconds.
> 
> -- Steve

Hello, Steven, Kosaki-san:

Thanks for your good suggestion,
I'll fix it in next version.

Thanks
Zhaolei


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 0/3] ftrace: Add timer-source of walltime for ftrace
  2009-08-18 18:58             ` Steven Rostedt
  2009-08-19  9:16               ` Zhaolei
@ 2009-08-25  8:12               ` Zhaolei
  2009-08-25  8:12                 ` [PATCH 1/3] ftrace: Move setting of clock-source out of options Zhaolei
                                   ` (3 more replies)
  1 sibling, 4 replies; 46+ messages in thread
From: Zhaolei @ 2009-08-25  8:12 UTC (permalink / raw)
  To: Steven Rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar; +Cc: LKML

Hi,

I finished this patchset to enable ftrace display walltime.

We can get following output by using it:
 # cd /mnt/debugfs/tracing/
 # cat trace_clock
 [local] global walltime
 # echo sched_switch > current_tracer
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
             bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
         events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
 ...
 # echo walltime > trace_clock
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
 #              | |       |                        |         |
             bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
         events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
 ...

This patchset merged suggestion of Steven, Frederic and Kosaki-san on last RFC.

Thanks
Zhaolei


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/3] ftrace: Move setting of clock-source out of options
  2009-08-25  8:12               ` [PATCH 0/3] ftrace: " Zhaolei
@ 2009-08-25  8:12                 ` Zhaolei
  2009-08-26  2:35                   ` Steven Rostedt
  2009-08-26  7:23                   ` [tip:tracing/core] " tip-bot for Zhaolei
  2009-08-25  8:14                 ` [PATCH 2/3] ftrace: add tracepoint for xtime Zhaolei
                                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 46+ messages in thread
From: Zhaolei @ 2009-08-25  8:12 UTC (permalink / raw)
  To: Steven Rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar; +Cc: LKML

There are many clock source for tracing system but we can only
enable/disable one bit by trace/options file.
We can move setting of clock-source out of options and add a separate
file for it:
 # cat trace_clock
 [local] global
 # echo global > trace_clock
 # cat trace_clock
 local [global]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 kernel/trace/trace.c |   92 ++++++++++++++++++++++++++++++++++++++++---------
 kernel/trace/trace.h |    7 ++--
 2 files changed, 79 insertions(+), 20 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b7d873b..5af6f83 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -323,12 +323,21 @@ static const char *trace_options[] = {
 	"printk-msg-only",
 	"context-info",
 	"latency-format",
-	"global-clock",
 	"sleep-time",
 	"graph-time",
 	NULL
 };
 
+static struct {
+	u64 (*func)(void);
+	const char *name;
+} trace_clocks[] = {
+	{ trace_clock_local,	"local" },
+	{ trace_clock_global,	"global" },
+};
+
+int trace_clock_id;
+
 /*
  * ftrace_max_lock is used to protect the swapping of buffers
  * when taking a max snapshot. The buffers themselves are
@@ -2159,22 +2168,6 @@ static void set_tracer_flags(unsigned int mask, int enabled)
 		trace_flags |= mask;
 	else
 		trace_flags &= ~mask;
-
-	if (mask == TRACE_ITER_GLOBAL_CLK) {
-		u64 (*func)(void);
-
-		if (enabled)
-			func = trace_clock_global;
-		else
-			func = trace_clock_local;
-
-		mutex_lock(&trace_types_lock);
-		ring_buffer_set_clock(global_trace.buffer, func);
-
-		if (max_tr.buffer)
-			ring_buffer_set_clock(max_tr.buffer, func);
-		mutex_unlock(&trace_types_lock);
-	}
 }
 
 static ssize_t
@@ -3142,6 +3135,62 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
 	return cnt;
 }
 
+static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
+				  size_t cnt, loff_t *ppos)
+{
+	char buf[64];
+	int bufiter = 0;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
+		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
+			"%s%s%s%s", i ? " " : "",
+			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
+			i == trace_clock_id ? "]" : "");
+	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
+
+	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
+}
+
+static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
+				   size_t cnt, loff_t *fpos)
+{
+	char buf[64];
+	const char *clockstr;
+	int i;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	buf[cnt] = 0;
+
+	clockstr = strstrip(buf);
+
+	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++) {
+		if (strcmp(trace_clocks[i].name, clockstr) == 0)
+			break;
+	}
+	if (i == ARRAY_SIZE(trace_clocks))
+		return -EINVAL;
+
+	trace_clock_id = i;
+
+	mutex_lock(&trace_types_lock);
+
+	ring_buffer_set_clock(global_trace.buffer, trace_clocks[i].func);
+	if (max_tr.buffer)
+		ring_buffer_set_clock(max_tr.buffer, trace_clocks[i].func);
+
+	mutex_unlock(&trace_types_lock);
+
+	*fpos += cnt;
+
+	return cnt;
+}
+
 static const struct file_operations tracing_max_lat_fops = {
 	.open		= tracing_open_generic,
 	.read		= tracing_max_lat_read,
@@ -3179,6 +3228,12 @@ static const struct file_operations tracing_mark_fops = {
 	.write		= tracing_mark_write,
 };
 
+static const struct file_operations trace_clock_fops = {
+	.open		= tracing_open_generic,
+	.read		= tracing_clock_read,
+	.write		= tracing_clock_write,
+};
+
 struct ftrace_buffer_info {
 	struct trace_array	*tr;
 	void			*spare;
@@ -3910,6 +3965,9 @@ static __init int tracer_init_debugfs(void)
 	trace_create_file("saved_cmdlines", 0444, d_tracer,
 			NULL, &tracing_saved_cmdlines_fops);
 
+	trace_create_file("trace_clock", 0644, d_tracer, NULL,
+			  &trace_clock_fops);
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 	trace_create_file("dyn_ftrace_total_info", 0444, d_tracer,
 			&ftrace_update_tot_cnt, &tracing_dyn_info_fops);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 606073c..0545c2b 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -586,6 +586,8 @@ trace_vprintk(unsigned long ip, const char *fmt, va_list args);
 
 extern unsigned long trace_flags;
 
+extern int trace_clock_id;
+
 /* Standard output formatting function used for function return traces */
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 extern enum print_line_t print_graph_function(struct trace_iterator *iter);
@@ -674,9 +676,8 @@ enum trace_iterator_flags {
 	TRACE_ITER_PRINTK_MSGONLY	= 0x10000,
 	TRACE_ITER_CONTEXT_INFO		= 0x20000, /* Print pid/cpu/time */
 	TRACE_ITER_LATENCY_FMT		= 0x40000,
-	TRACE_ITER_GLOBAL_CLK		= 0x80000,
-	TRACE_ITER_SLEEP_TIME		= 0x100000,
-	TRACE_ITER_GRAPH_TIME		= 0x200000,
+	TRACE_ITER_SLEEP_TIME		= 0x80000,
+	TRACE_ITER_GRAPH_TIME		= 0x100000,
 };
 
 /*
-- 
1.5.5.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-08-25  8:12               ` [PATCH 0/3] ftrace: " Zhaolei
  2009-08-25  8:12                 ` [PATCH 1/3] ftrace: Move setting of clock-source out of options Zhaolei
@ 2009-08-25  8:14                 ` Zhaolei
  2009-08-26  2:39                   ` Steven Rostedt
  2009-09-16 19:56                   ` john stultz
  2009-08-25  8:15                 ` [PATCH 3/3] ftrace: Add timer-source of walltime for ftrace Zhaolei
  2009-09-16  5:25                 ` [PATCH v2 0/2] " Zhaolei
  3 siblings, 2 replies; 46+ messages in thread
From: Zhaolei @ 2009-08-25  8:14 UTC (permalink / raw)
  To: Steven Rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar; +Cc: LKML

From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>

This patch can trace current xtime and wall_to_monotonic. Then user can
use ctime() to convert them to wall time which is easier to be understood,
especially for flight-recorder which need to get trace event from a kernel
dump file.

Example of ftrace output:
          <idle>-0     [000] 20118.489849: gtod_update_xtime: xtime=1243265589.999999824 wall_to_monotonic=3051713268.744158739
           <...>-4020  [001] 20118.489855: sys_open(filename: bf9e66e0, flags: 98800, mode: bf9e66e0)
           <...>-4020  [001] 20118.489873: sys_open -> 0xffffffec

ctime(1243265590) = date:Mon May 25 11:33:10 2009
So we can realize the task with pid 4020 open a file at
Mon May 25 11:33:10 2009

Changelog:
v1->v2: Rebased by Zhao Lei <zhaolei@cn.fujitsu.com>

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 include/trace/events/xtime.h |   38 ++++++++++++++++++++++++++++++++++++++
 kernel/time/ntp.c            |    4 ++++
 kernel/time/timekeeping.c    |    6 ++++++
 3 files changed, 48 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/xtime.h

diff --git a/include/trace/events/xtime.h b/include/trace/events/xtime.h
new file mode 100644
index 0000000..398e679
--- /dev/null
+++ b/include/trace/events/xtime.h
@@ -0,0 +1,38 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM xtime
+
+#if !defined(_TRACE_XTIME_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_XTIME_H
+
+#include <linux/time.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(gtod_update_xtime,
+
+	TP_PROTO(struct timespec *xtime, struct timespec *wall_to_monotonic),
+
+	TP_ARGS(xtime, wall_to_monotonic),
+
+	TP_STRUCT__entry(
+		__field(	long,	xtime_sec		)
+		__field(	long,	xtime_nsec		)
+		__field(	long,	wall_to_monotonic_sec	)
+		__field(	long,	wall_to_monotonic_nsec	)
+	),
+
+	TP_fast_assign(
+		__entry->xtime_sec		= xtime->tv_sec;
+		__entry->xtime_nsec		= xtime->tv_nsec;
+		__entry->wall_to_monotonic_sec	= wall_to_monotonic->tv_sec;
+		__entry->wall_to_monotonic_nsec	= wall_to_monotonic->tv_nsec;
+	),
+
+	TP_printk("xtime=%ld.%09ld wall_to_monotonic=%ld.%09ld",
+		  __entry->xtime_sec, __entry->xtime_nsec,
+		  __entry->wall_to_monotonic_sec, __entry->wall_to_monotonic_nsec)
+);
+
+#endif /* _TRACE_XTIME_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 4800f93..fc2f13a 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -15,6 +15,8 @@
 #include <linux/time.h>
 #include <linux/mm.h>
 
+#include <trace/events/xtime.h>
+
 /*
  * NTP timekeeping variables:
  */
@@ -218,6 +220,8 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
 		break;
 	}
 
+	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
+
 	write_sequnlock(&xtime_lock);
 
 	return res;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 03cbeb3..2e57a87 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -20,6 +20,9 @@
 #include <linux/tick.h>
 #include <linux/stop_machine.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/xtime.h>
+
 /* Structure holding internal timekeeping values. */
 struct timekeeper {
 	/* Current clocksource used for timekeeping. */
@@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
 
 	update_vsyscall(&xtime, timekeeper.clock);
 
+	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
+
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
 	/* signal hrtimers about time change */
@@ -811,6 +816,7 @@ void update_wall_time(void)
 
 	/* check to see if there is a new clocksource to use */
 	update_vsyscall(&xtime, timekeeper.clock);
+	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
 }
 
 /**
-- 
1.5.5.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 3/3] ftrace: Add timer-source of walltime for ftrace
  2009-08-25  8:12               ` [PATCH 0/3] ftrace: " Zhaolei
  2009-08-25  8:12                 ` [PATCH 1/3] ftrace: Move setting of clock-source out of options Zhaolei
  2009-08-25  8:14                 ` [PATCH 2/3] ftrace: add tracepoint for xtime Zhaolei
@ 2009-08-25  8:15                 ` Zhaolei
  2009-08-26  2:52                   ` Steven Rostedt
  2009-09-16  5:25                 ` [PATCH v2 0/2] " Zhaolei
  3 siblings, 1 reply; 46+ messages in thread
From: Zhaolei @ 2009-08-25  8:15 UTC (permalink / raw)
  To: Steven Rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar; +Cc: LKML

We can add a timer-source for ftrace to support wall-time display.
It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
http://lkml.org/lkml/2009/7/24/103

It need to applys on my patch of:
[PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use

We can get following output:
 # cd /mnt/debugfs/tracing/
 # cat trace_clock
 [local] global walltime
 # echo sched_switch > current_tracer
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
             bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
         events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
 ...
 # echo walltime > trace_clock
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
 #              | |       |                        |         |
             bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
         events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
 ...

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/linux/trace_clock.h |    1 +
 kernel/trace/trace.c        |   24 ++++++++++++++++-----
 kernel/trace/trace.h        |    8 ++++++-
 kernel/trace/trace_clock.c  |   48 +++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_output.c |   21 +++++++++++++++---
 5 files changed, 91 insertions(+), 11 deletions(-)

diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
index 7a81303..322707e 100644
--- a/include/linux/trace_clock.h
+++ b/include/linux/trace_clock.h
@@ -15,5 +15,6 @@
 extern u64 notrace trace_clock_local(void);
 extern u64 notrace trace_clock(void);
 extern u64 notrace trace_clock_global(void);
+extern u64 notrace trace_clock_walltime(void);
 
 #endif /* _LINUX_TRACE_CLOCK_H */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 5af6f83..afb2c5a 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -334,9 +334,10 @@ static struct {
 } trace_clocks[] = {
 	{ trace_clock_local,	"local" },
 	{ trace_clock_global,	"global" },
+	{ trace_clock_walltime,	"walltime" },
 };
 
-int trace_clock_id;
+enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
 
 /*
  * ftrace_max_lock is used to protect the swapping of buffers
@@ -1432,8 +1433,17 @@ static void print_lat_help_header(struct seq_file *m)
 
 static void print_func_help_header(struct seq_file *m)
 {
-	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
-	seq_puts(m, "#              | |       |          |         |\n");
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"          TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"                     |         |\n");
+	} else {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"   TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"       |         |\n");
+	}
 }
 
 
@@ -3145,8 +3155,8 @@ static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
 	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
 		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
 			"%s%s%s%s", i ? " " : "",
-			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
-			i == trace_clock_id ? "]" : "");
+			i == trace_clock_type ? "[" : "", trace_clocks[i].name,
+			i == trace_clock_type ? "]" : "");
 	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
 
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
@@ -3176,7 +3186,7 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 	if (i == ARRAY_SIZE(trace_clocks))
 		return -EINVAL;
 
-	trace_clock_id = i;
+	trace_clock_type = i;
 
 	mutex_lock(&trace_types_lock);
 
@@ -3186,6 +3196,8 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 
 	mutex_unlock(&trace_types_lock);
 
+	tracing_reset_online_cpus(&global_trace);
+
 	*fpos += cnt;
 
 	return cnt;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 0545c2b..2ce01b9 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -586,7 +586,13 @@ trace_vprintk(unsigned long ip, const char *fmt, va_list args);
 
 extern unsigned long trace_flags;
 
-extern int trace_clock_id;
+enum TRACE_CLOCK_TYPE {
+	TRACE_CLOCK_LOCAL,
+	TRACE_CLOCK_GLOBAL,
+	TRACE_CLOCK_WALLTIME,
+};
+
+extern enum TRACE_CLOCK_TYPE trace_clock_type;
 
 /* Standard output formatting function used for function return traces */
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index b588fd8..2b23023 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -19,6 +19,7 @@
 #include <linux/sched.h>
 #include <linux/ktime.h>
 #include <linux/trace_clock.h>
+#include <trace/events/xtime.h>
 
 /*
  * trace_clock_local(): the simplest and least coherent tracing clock.
@@ -107,3 +108,50 @@ u64 notrace trace_clock_global(void)
 
 	return now;
 }
+
+/*
+ * the number of nanoseconds from 1970-1-1 00:00:00 to
+ * machine boots (trace_clock()==0)
+ */
+static u64 base_walltime;
+
+static void sync_base_walltime(struct timespec *ts, u64 traceclock)
+{
+	base_walltime = (u64)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec -
+			traceclock;
+}
+
+static void probe_gtod_update_xtime(struct timespec *xtime,
+				    struct timespec *wall_to_monotonic)
+{
+	unsigned long flags;
+	u64 traceclock;
+
+	local_irq_save(flags);
+	traceclock = trace_clock();
+	sync_base_walltime(xtime, traceclock);
+	local_irq_restore(flags);
+}
+
+static int __init walltime_init(void)
+{
+	unsigned long flags;
+	struct timespec ts;
+	u64 traceclock;
+
+	local_irq_save(flags);
+	getnstimeofday(&ts);
+	traceclock = trace_clock();
+	sync_base_walltime(&ts, traceclock);
+	local_irq_restore(flags);
+
+	register_trace_gtod_update_xtime(probe_gtod_update_xtime);
+
+	return 0;
+}
+device_initcall_sync(walltime_init);
+
+u64 notrace trace_clock_walltime(void)
+{
+	return base_walltime + trace_clock();
+}
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index e0c2545..405e37d 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -501,15 +501,28 @@ int trace_print_context(struct trace_iterator *iter)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent;
-	unsigned long long t = ns2usecs(iter->ts);
-	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
-	unsigned long secs = (unsigned long)t;
 	char comm[TASK_COMM_LEN];
 
 	trace_find_cmdline(entry->pid, comm);
 
-	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		struct tm tm;
+		u32 remainder;
+		time_to_tm(div_u64_rem(iter->ts, NSEC_PER_SEC, &remainder),
+				-sys_tz.tz_minuteswest * 60, &tm);
+		return trace_seq_printf(s, "%16s-%-5d [%03d] "
+				"%04ld-%02d-%02d %02d:%02d:%02d.%06llu: ",
+				comm, entry->pid, iter->cpu,
+				tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+				tm.tm_hour, tm.tm_min, tm.tm_sec,
+				ns2usecs(remainder));
+	} else {
+		unsigned long long t = ns2usecs(iter->ts);
+		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
+		unsigned long secs = (unsigned long)t;
+		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
 				comm, entry->pid, iter->cpu, secs, usec_rem);
+	}
 }
 
 int trace_print_lat_context(struct trace_iterator *iter)
-- 
1.5.5.3


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] ftrace: Move setting of clock-source out of options
  2009-08-25  8:12                 ` [PATCH 1/3] ftrace: Move setting of clock-source out of options Zhaolei
@ 2009-08-26  2:35                   ` Steven Rostedt
  2009-08-26  7:23                   ` [tip:tracing/core] " tip-bot for Zhaolei
  1 sibling, 0 replies; 46+ messages in thread
From: Steven Rostedt @ 2009-08-26  2:35 UTC (permalink / raw)
  To: Zhaolei; +Cc: KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML


Hi,

I like the series and I'll pull it, but there's one thing that bothers me. 
We should fix it too...


On Tue, 25 Aug 2009, Zhaolei wrote:

> There are many clock source for tracing system but we can only
> enable/disable one bit by trace/options file.
> We can move setting of clock-source out of options and add a separate
> file for it:
>  # cat trace_clock
>  [local] global
>  # echo global > trace_clock
>  # cat trace_clock
>  local [global]
> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> ---
>  kernel/trace/trace.c |   92 ++++++++++++++++++++++++++++++++++++++++---------
>  kernel/trace/trace.h |    7 ++--
>  2 files changed, 79 insertions(+), 20 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index b7d873b..5af6f83 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -323,12 +323,21 @@ static const char *trace_options[] = {
>  	"printk-msg-only",
>  	"context-info",
>  	"latency-format",
> -	"global-clock",
>  	"sleep-time",
>  	"graph-time",
>  	NULL
>  };
>  
> +static struct {
> +	u64 (*func)(void);
> +	const char *name;
> +} trace_clocks[] = {
> +	{ trace_clock_local,	"local" },
> +	{ trace_clock_global,	"global" },
> +};
> +
> +int trace_clock_id;
> +
>  /*
>   * ftrace_max_lock is used to protect the swapping of buffers
>   * when taking a max snapshot. The buffers themselves are
> @@ -2159,22 +2168,6 @@ static void set_tracer_flags(unsigned int mask, int enabled)
>  		trace_flags |= mask;
>  	else
>  		trace_flags &= ~mask;
> -
> -	if (mask == TRACE_ITER_GLOBAL_CLK) {
> -		u64 (*func)(void);
> -
> -		if (enabled)
> -			func = trace_clock_global;
> -		else
> -			func = trace_clock_local;
> -
> -		mutex_lock(&trace_types_lock);
> -		ring_buffer_set_clock(global_trace.buffer, func);
> -
> -		if (max_tr.buffer)
> -			ring_buffer_set_clock(max_tr.buffer, func);
> -		mutex_unlock(&trace_types_lock);
> -	}
>  }
>  
>  static ssize_t
> @@ -3142,6 +3135,62 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
>  	return cnt;
>  }
>  
> +static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
> +				  size_t cnt, loff_t *ppos)
> +{
> +	char buf[64];

What happens if we add more clocks. Our list can only be 64 bytes long. It 
would be a good idea to change this to do something like the trace_options 
uses. That is, it dynamically allocates buf, by figuring out how much it 
needs. This should do that too.

Thanks,

-- Steve

> +	int bufiter = 0;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
> +		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
> +			"%s%s%s%s", i ? " " : "",
> +			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
> +			i == trace_clock_id ? "]" : "");
> +	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
> +
> +	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
> +}
> +
> +static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
> +				   size_t cnt, loff_t *fpos)
> +{
> +	char buf[64];
> +	const char *clockstr;
> +	int i;
> +
> +	if (cnt >= sizeof(buf))
> +		return -EINVAL;
> +
> +	if (copy_from_user(&buf, ubuf, cnt))
> +		return -EFAULT;
> +
> +	buf[cnt] = 0;
> +
> +	clockstr = strstrip(buf);
> +
> +	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++) {
> +		if (strcmp(trace_clocks[i].name, clockstr) == 0)
> +			break;
> +	}
> +	if (i == ARRAY_SIZE(trace_clocks))
> +		return -EINVAL;
> +
> +	trace_clock_id = i;
> +
> +	mutex_lock(&trace_types_lock);
> +
> +	ring_buffer_set_clock(global_trace.buffer, trace_clocks[i].func);
> +	if (max_tr.buffer)
> +		ring_buffer_set_clock(max_tr.buffer, trace_clocks[i].func);
> +
> +	mutex_unlock(&trace_types_lock);
> +
> +	*fpos += cnt;
> +
> +	return cnt;
> +}
> +
>  static const struct file_operations tracing_max_lat_fops = {
>  	.open		= tracing_open_generic,
>  	.read		= tracing_max_lat_read,
> @@ -3179,6 +3228,12 @@ static const struct file_operations tracing_mark_fops = {
>  	.write		= tracing_mark_write,
>  };
>  
> +static const struct file_operations trace_clock_fops = {
> +	.open		= tracing_open_generic,
> +	.read		= tracing_clock_read,
> +	.write		= tracing_clock_write,
> +};
> +
>  struct ftrace_buffer_info {
>  	struct trace_array	*tr;
>  	void			*spare;
> @@ -3910,6 +3965,9 @@ static __init int tracer_init_debugfs(void)
>  	trace_create_file("saved_cmdlines", 0444, d_tracer,
>  			NULL, &tracing_saved_cmdlines_fops);
>  
> +	trace_create_file("trace_clock", 0644, d_tracer, NULL,
> +			  &trace_clock_fops);
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  	trace_create_file("dyn_ftrace_total_info", 0444, d_tracer,
>  			&ftrace_update_tot_cnt, &tracing_dyn_info_fops);
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 606073c..0545c2b 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -586,6 +586,8 @@ trace_vprintk(unsigned long ip, const char *fmt, va_list args);
>  
>  extern unsigned long trace_flags;
>  
> +extern int trace_clock_id;
> +
>  /* Standard output formatting function used for function return traces */
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  extern enum print_line_t print_graph_function(struct trace_iterator *iter);
> @@ -674,9 +676,8 @@ enum trace_iterator_flags {
>  	TRACE_ITER_PRINTK_MSGONLY	= 0x10000,
>  	TRACE_ITER_CONTEXT_INFO		= 0x20000, /* Print pid/cpu/time */
>  	TRACE_ITER_LATENCY_FMT		= 0x40000,
> -	TRACE_ITER_GLOBAL_CLK		= 0x80000,
> -	TRACE_ITER_SLEEP_TIME		= 0x100000,
> -	TRACE_ITER_GRAPH_TIME		= 0x200000,
> +	TRACE_ITER_SLEEP_TIME		= 0x80000,
> +	TRACE_ITER_GRAPH_TIME		= 0x100000,
>  };
>  
>  /*
> -- 
> 1.5.5.3
> 
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-08-25  8:14                 ` [PATCH 2/3] ftrace: add tracepoint for xtime Zhaolei
@ 2009-08-26  2:39                   ` Steven Rostedt
  2009-09-01  8:03                     ` Zhaolei
  2009-09-16 19:56                   ` john stultz
  1 sibling, 1 reply; 46+ messages in thread
From: Steven Rostedt @ 2009-08-26  2:39 UTC (permalink / raw)
  To: Zhaolei
  Cc: KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML, Thomas Gleixner


Thomas,

This patch touches time keeping work. Can I get your ACK on it, or if it 
belongs to someone else, could you tell me who I need.

Thanks,

-- Steve


On Tue, 25 Aug 2009, Zhaolei wrote:

> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> 
> This patch can trace current xtime and wall_to_monotonic. Then user can
> use ctime() to convert them to wall time which is easier to be understood,
> especially for flight-recorder which need to get trace event from a kernel
> dump file.
> 
> Example of ftrace output:
>           <idle>-0     [000] 20118.489849: gtod_update_xtime: xtime=1243265589.999999824 wall_to_monotonic=3051713268.744158739
>            <...>-4020  [001] 20118.489855: sys_open(filename: bf9e66e0, flags: 98800, mode: bf9e66e0)
>            <...>-4020  [001] 20118.489873: sys_open -> 0xffffffec
> 
> ctime(1243265590) = date:Mon May 25 11:33:10 2009
> So we can realize the task with pid 4020 open a file at
> Mon May 25 11:33:10 2009
> 
> Changelog:
> v1->v2: Rebased by Zhao Lei <zhaolei@cn.fujitsu.com>
> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> ---
>  include/trace/events/xtime.h |   38 ++++++++++++++++++++++++++++++++++++++
>  kernel/time/ntp.c            |    4 ++++
>  kernel/time/timekeeping.c    |    6 ++++++
>  3 files changed, 48 insertions(+), 0 deletions(-)
>  create mode 100644 include/trace/events/xtime.h
> 
> diff --git a/include/trace/events/xtime.h b/include/trace/events/xtime.h
> new file mode 100644
> index 0000000..398e679
> --- /dev/null
> +++ b/include/trace/events/xtime.h
> @@ -0,0 +1,38 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM xtime
> +
> +#if !defined(_TRACE_XTIME_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_XTIME_H
> +
> +#include <linux/time.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(gtod_update_xtime,
> +
> +	TP_PROTO(struct timespec *xtime, struct timespec *wall_to_monotonic),
> +
> +	TP_ARGS(xtime, wall_to_monotonic),
> +
> +	TP_STRUCT__entry(
> +		__field(	long,	xtime_sec		)
> +		__field(	long,	xtime_nsec		)
> +		__field(	long,	wall_to_monotonic_sec	)
> +		__field(	long,	wall_to_monotonic_nsec	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->xtime_sec		= xtime->tv_sec;
> +		__entry->xtime_nsec		= xtime->tv_nsec;
> +		__entry->wall_to_monotonic_sec	= wall_to_monotonic->tv_sec;
> +		__entry->wall_to_monotonic_nsec	= wall_to_monotonic->tv_nsec;
> +	),
> +
> +	TP_printk("xtime=%ld.%09ld wall_to_monotonic=%ld.%09ld",
> +		  __entry->xtime_sec, __entry->xtime_nsec,
> +		  __entry->wall_to_monotonic_sec, __entry->wall_to_monotonic_nsec)
> +);
> +
> +#endif /* _TRACE_XTIME_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
> index 4800f93..fc2f13a 100644
> --- a/kernel/time/ntp.c
> +++ b/kernel/time/ntp.c
> @@ -15,6 +15,8 @@
>  #include <linux/time.h>
>  #include <linux/mm.h>
>  
> +#include <trace/events/xtime.h>
> +
>  /*
>   * NTP timekeeping variables:
>   */
> @@ -218,6 +220,8 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
>  		break;
>  	}
>  
> +	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
> +
>  	write_sequnlock(&xtime_lock);
>  
>  	return res;
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 03cbeb3..2e57a87 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -20,6 +20,9 @@
>  #include <linux/tick.h>
>  #include <linux/stop_machine.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/xtime.h>
> +
>  /* Structure holding internal timekeeping values. */
>  struct timekeeper {
>  	/* Current clocksource used for timekeeping. */
> @@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
>  
>  	update_vsyscall(&xtime, timekeeper.clock);
>  
> +	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
> +
>  	write_sequnlock_irqrestore(&xtime_lock, flags);
>  
>  	/* signal hrtimers about time change */
> @@ -811,6 +816,7 @@ void update_wall_time(void)
>  
>  	/* check to see if there is a new clocksource to use */
>  	update_vsyscall(&xtime, timekeeper.clock);
> +	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
>  }
>  
>  /**
> -- 
> 1.5.5.3
> 
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 3/3] ftrace: Add timer-source of walltime for ftrace
  2009-08-25  8:15                 ` [PATCH 3/3] ftrace: Add timer-source of walltime for ftrace Zhaolei
@ 2009-08-26  2:52                   ` Steven Rostedt
  0 siblings, 0 replies; 46+ messages in thread
From: Steven Rostedt @ 2009-08-26  2:52 UTC (permalink / raw)
  To: Zhaolei
  Cc: KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML, Thomas Gleixner




On Tue, 25 Aug 2009, Zhaolei wrote:

> We can add a timer-source for ftrace to support wall-time display.
> It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
> http://lkml.org/lkml/2009/7/24/103
> 
> It need to applys on my patch of:
> [PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
> 
> We can get following output:
>  # cd /mnt/debugfs/tracing/
>  # cat trace_clock
>  [local] global walltime
>  # echo sched_switch > current_tracer
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
>  #              | |       |          |         |
>              bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
>              bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
>          events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
>  ...
>  # echo walltime > trace_clock
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
>  #              | |       |                        |         |
>              bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
>          events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
>  ...
> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>


> +
> +static int __init walltime_init(void)
> +{
> +	unsigned long flags;
> +	struct timespec ts;
> +	u64 traceclock;
> +
> +	local_irq_save(flags);
> +	getnstimeofday(&ts);
> +	traceclock = trace_clock();
> +	sync_base_walltime(&ts, traceclock);
> +	local_irq_restore(flags);
> +
> +	register_trace_gtod_update_xtime(probe_gtod_update_xtime);

Because this patch depends on patch 2 that creates the "gtod_update_xtime" 
trace point, I have to hold off on this one too. But the first patch is a 
nice clean up regardless. I'll pull that one in.

Thanks,

-- Steve


> +
> +	return 0;
> +}
> +device_initcall_sync(walltime_init);
> +
> +u64 notrace trace_clock_walltime(void)
> +{
> +	return base_walltime + trace_clock();
> +}

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [tip:tracing/core] ftrace: Move setting of clock-source out of options
  2009-08-25  8:12                 ` [PATCH 1/3] ftrace: Move setting of clock-source out of options Zhaolei
  2009-08-26  2:35                   ` Steven Rostedt
@ 2009-08-26  7:23                   ` tip-bot for Zhaolei
  1 sibling, 0 replies; 46+ messages in thread
From: tip-bot for Zhaolei @ 2009-08-26  7:23 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, zhaolei, tglx

Commit-ID:  5079f3261ffd7fe4a537679af695f2328943a245
Gitweb:     http://git.kernel.org/tip/5079f3261ffd7fe4a537679af695f2328943a245
Author:     Zhaolei <zhaolei@cn.fujitsu.com>
AuthorDate: Tue, 25 Aug 2009 16:12:56 +0800
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Wed, 26 Aug 2009 00:32:08 -0400

ftrace: Move setting of clock-source out of options

There are many clock sources for the tracing system but we can only
enable/disable one at a time with the trace/options file.
We can move the setting of clock-source out of options and add a separate
file for it:
 # cat trace_clock
 [local] global
 # echo global > trace_clock
 # cat trace_clock
 local [global]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
LKML-Reference: <4A939D08.6050604@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


---
 kernel/trace/trace.c |   92 ++++++++++++++++++++++++++++++++++++++++---------
 kernel/trace/trace.h |    7 ++--
 2 files changed, 79 insertions(+), 20 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8ac2043..63dbc7f 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -323,12 +323,21 @@ static const char *trace_options[] = {
 	"printk-msg-only",
 	"context-info",
 	"latency-format",
-	"global-clock",
 	"sleep-time",
 	"graph-time",
 	NULL
 };
 
+static struct {
+	u64 (*func)(void);
+	const char *name;
+} trace_clocks[] = {
+	{ trace_clock_local,	"local" },
+	{ trace_clock_global,	"global" },
+};
+
+int trace_clock_id;
+
 /*
  * ftrace_max_lock is used to protect the swapping of buffers
  * when taking a max snapshot. The buffers themselves are
@@ -2159,22 +2168,6 @@ static void set_tracer_flags(unsigned int mask, int enabled)
 		trace_flags |= mask;
 	else
 		trace_flags &= ~mask;
-
-	if (mask == TRACE_ITER_GLOBAL_CLK) {
-		u64 (*func)(void);
-
-		if (enabled)
-			func = trace_clock_global;
-		else
-			func = trace_clock_local;
-
-		mutex_lock(&trace_types_lock);
-		ring_buffer_set_clock(global_trace.buffer, func);
-
-		if (max_tr.buffer)
-			ring_buffer_set_clock(max_tr.buffer, func);
-		mutex_unlock(&trace_types_lock);
-	}
 }
 
 static ssize_t
@@ -3142,6 +3135,62 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
 	return cnt;
 }
 
+static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
+				  size_t cnt, loff_t *ppos)
+{
+	char buf[64];
+	int bufiter = 0;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
+		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
+			"%s%s%s%s", i ? " " : "",
+			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
+			i == trace_clock_id ? "]" : "");
+	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
+
+	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
+}
+
+static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
+				   size_t cnt, loff_t *fpos)
+{
+	char buf[64];
+	const char *clockstr;
+	int i;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	buf[cnt] = 0;
+
+	clockstr = strstrip(buf);
+
+	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++) {
+		if (strcmp(trace_clocks[i].name, clockstr) == 0)
+			break;
+	}
+	if (i == ARRAY_SIZE(trace_clocks))
+		return -EINVAL;
+
+	trace_clock_id = i;
+
+	mutex_lock(&trace_types_lock);
+
+	ring_buffer_set_clock(global_trace.buffer, trace_clocks[i].func);
+	if (max_tr.buffer)
+		ring_buffer_set_clock(max_tr.buffer, trace_clocks[i].func);
+
+	mutex_unlock(&trace_types_lock);
+
+	*fpos += cnt;
+
+	return cnt;
+}
+
 static const struct file_operations tracing_max_lat_fops = {
 	.open		= tracing_open_generic,
 	.read		= tracing_max_lat_read,
@@ -3179,6 +3228,12 @@ static const struct file_operations tracing_mark_fops = {
 	.write		= tracing_mark_write,
 };
 
+static const struct file_operations trace_clock_fops = {
+	.open		= tracing_open_generic,
+	.read		= tracing_clock_read,
+	.write		= tracing_clock_write,
+};
+
 struct ftrace_buffer_info {
 	struct trace_array	*tr;
 	void			*spare;
@@ -3918,6 +3973,9 @@ static __init int tracer_init_debugfs(void)
 	trace_create_file("saved_cmdlines", 0444, d_tracer,
 			NULL, &tracing_saved_cmdlines_fops);
 
+	trace_create_file("trace_clock", 0644, d_tracer, NULL,
+			  &trace_clock_fops);
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 	trace_create_file("dyn_ftrace_total_info", 0444, d_tracer,
 			&ftrace_update_tot_cnt, &tracing_dyn_info_fops);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 64dda57..654fd65 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -568,6 +568,8 @@ trace_vprintk(unsigned long ip, const char *fmt, va_list args);
 
 extern unsigned long trace_flags;
 
+extern int trace_clock_id;
+
 /* Standard output formatting function used for function return traces */
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 extern enum print_line_t print_graph_function(struct trace_iterator *iter);
@@ -656,9 +658,8 @@ enum trace_iterator_flags {
 	TRACE_ITER_PRINTK_MSGONLY	= 0x10000,
 	TRACE_ITER_CONTEXT_INFO		= 0x20000, /* Print pid/cpu/time */
 	TRACE_ITER_LATENCY_FMT		= 0x40000,
-	TRACE_ITER_GLOBAL_CLK		= 0x80000,
-	TRACE_ITER_SLEEP_TIME		= 0x100000,
-	TRACE_ITER_GRAPH_TIME		= 0x200000,
+	TRACE_ITER_SLEEP_TIME		= 0x80000,
+	TRACE_ITER_GRAPH_TIME		= 0x100000,
 };
 
 /*

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: Re: [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-08-26  2:39                   ` Steven Rostedt
@ 2009-09-01  8:03                     ` Zhaolei
  0 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-09-01  8:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML

Hello, Thomas

I done this patchset to add walltime display for ftrace.
I think it will make ftrace more convenient to use.

To make it work, I rebased a patch from Xiao Guangrong to add tracepoint
into xtime, so I can use this feture for my code.

Could you have a look at this patch?

Thanks
Zhaolei


Steven Rostedt wrote:
> Thomas,
> 
> This patch touches time keeping work. Can I get your ACK on it, or if it 
> belongs to someone else, could you tell me who I need.
> 
> Thanks,
> 
> -- Steve
> 
> 
> On Tue, 25 Aug 2009, Zhaolei wrote:
> 
>> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
>>
>> This patch can trace current xtime and wall_to_monotonic. Then user can
>> use ctime() to convert them to wall time which is easier to be understood,
>> especially for flight-recorder which need to get trace event from a kernel
>> dump file.
>>
>> Example of ftrace output:
>>           <idle>-0     [000] 20118.489849: gtod_update_xtime: xtime=1243265589.999999824 wall_to_monotonic=3051713268.744158739
>>            <...>-4020  [001] 20118.489855: sys_open(filename: bf9e66e0, flags: 98800, mode: bf9e66e0)
>>            <...>-4020  [001] 20118.489873: sys_open -> 0xffffffec
>>
>> ctime(1243265590) = date:Mon May 25 11:33:10 2009
>> So we can realize the task with pid 4020 open a file at
>> Mon May 25 11:33:10 2009
>>
>> Changelog:
>> v1->v2: Rebased by Zhao Lei <zhaolei@cn.fujitsu.com>
>>
>> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
>> ---
>>  include/trace/events/xtime.h |   38 ++++++++++++++++++++++++++++++++++++++
>>  kernel/time/ntp.c            |    4 ++++
>>  kernel/time/timekeeping.c    |    6 ++++++
>>  3 files changed, 48 insertions(+), 0 deletions(-)
>>  create mode 100644 include/trace/events/xtime.h
>>
>> diff --git a/include/trace/events/xtime.h b/include/trace/events/xtime.h
>> new file mode 100644
>> index 0000000..398e679
>> --- /dev/null
>> +++ b/include/trace/events/xtime.h
>> @@ -0,0 +1,38 @@
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM xtime
>> +
>> +#if !defined(_TRACE_XTIME_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_XTIME_H
>> +
>> +#include <linux/time.h>
>> +#include <linux/tracepoint.h>
>> +
>> +TRACE_EVENT(gtod_update_xtime,
>> +
>> +	TP_PROTO(struct timespec *xtime, struct timespec *wall_to_monotonic),
>> +
>> +	TP_ARGS(xtime, wall_to_monotonic),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	long,	xtime_sec		)
>> +		__field(	long,	xtime_nsec		)
>> +		__field(	long,	wall_to_monotonic_sec	)
>> +		__field(	long,	wall_to_monotonic_nsec	)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->xtime_sec		= xtime->tv_sec;
>> +		__entry->xtime_nsec		= xtime->tv_nsec;
>> +		__entry->wall_to_monotonic_sec	= wall_to_monotonic->tv_sec;
>> +		__entry->wall_to_monotonic_nsec	= wall_to_monotonic->tv_nsec;
>> +	),
>> +
>> +	TP_printk("xtime=%ld.%09ld wall_to_monotonic=%ld.%09ld",
>> +		  __entry->xtime_sec, __entry->xtime_nsec,
>> +		  __entry->wall_to_monotonic_sec, __entry->wall_to_monotonic_nsec)
>> +);
>> +
>> +#endif /* _TRACE_XTIME_H */
>> +
>> +/* This part must be outside protection */
>> +#include <trace/define_trace.h>
>> diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
>> index 4800f93..fc2f13a 100644
>> --- a/kernel/time/ntp.c
>> +++ b/kernel/time/ntp.c
>> @@ -15,6 +15,8 @@
>>  #include <linux/time.h>
>>  #include <linux/mm.h>
>>  
>> +#include <trace/events/xtime.h>
>> +
>>  /*
>>   * NTP timekeeping variables:
>>   */
>> @@ -218,6 +220,8 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
>>  		break;
>>  	}
>>  
>> +	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
>> +
>>  	write_sequnlock(&xtime_lock);
>>  
>>  	return res;
>> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> index 03cbeb3..2e57a87 100644
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -20,6 +20,9 @@
>>  #include <linux/tick.h>
>>  #include <linux/stop_machine.h>
>>  
>> +#define CREATE_TRACE_POINTS
>> +#include <trace/events/xtime.h>
>> +
>>  /* Structure holding internal timekeeping values. */
>>  struct timekeeper {
>>  	/* Current clocksource used for timekeeping. */
>> @@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
>>  
>>  	update_vsyscall(&xtime, timekeeper.clock);
>>  
>> +	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
>> +
>>  	write_sequnlock_irqrestore(&xtime_lock, flags);
>>  
>>  	/* signal hrtimers about time change */
>> @@ -811,6 +816,7 @@ void update_wall_time(void)
>>  
>>  	/* check to see if there is a new clocksource to use */
>>  	update_vsyscall(&xtime, timekeeper.clock);
>> +	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
>>  }
>>  
>>  /**
>> -- 
>> 1.5.5.3
>>
>>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 0/2] ftrace: Add timer-source of walltime for ftrace
  2009-08-25  8:12               ` [PATCH 0/3] ftrace: " Zhaolei
                                   ` (2 preceding siblings ...)
  2009-08-25  8:15                 ` [PATCH 3/3] ftrace: Add timer-source of walltime for ftrace Zhaolei
@ 2009-09-16  5:25                 ` Zhaolei
  2009-09-16  5:27                   ` [PATCH v2 1/2] ftrace: add tracepoint for xtime Zhaolei
  2009-09-16  5:29                   ` [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace Zhaolei
  3 siblings, 2 replies; 46+ messages in thread
From: Zhaolei @ 2009-09-16  5:25 UTC (permalink / raw)
  To: Steven Rostedt, Thomas Gleixner, Frederic Weisbecker,
	KOSAKI Motohiro, Ingo Molnar
  Cc: LKML

Hi,

I rebased patchset of "enable ftrace display walltime".

It can make ftrace display intuitionistic time information to user
with walltime clock.

We can get following output by using it:
 # cd /mnt/debugfs/tracing/
 # cat trace_clock
 [local] global walltime
 # echo sched_switch > current_tracer
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
             bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
         events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
 ...
 # echo walltime > trace_clock
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
 #              | |       |                        |         |
             bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
         events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
 ...

It need to applys on my patch of:
[PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use

Hi, Thomas,

To make ftrace's walltime function work, I need to add a tracepoint into xtime's code,
so we can set ftrace's walltime clockbase when time changed.

Could you have a look at this patch, and give me your suggestion?

Changelog:
v1->v2: Rebase to top of tip tree.

Thanks
Zhaolei



^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 1/2] ftrace: add tracepoint for xtime
  2009-09-16  5:25                 ` [PATCH v2 0/2] " Zhaolei
@ 2009-09-16  5:27                   ` Zhaolei
  2009-09-16 19:33                     ` Steven Rostedt
  2009-09-16  5:29                   ` [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace Zhaolei
  1 sibling, 1 reply; 46+ messages in thread
From: Zhaolei @ 2009-09-16  5:27 UTC (permalink / raw)
  To: Steven Rostedt, Thomas Gleixner, Frederic Weisbecker,
	KOSAKI Motohiro, Ingo Molnar
  Cc: LKML

From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>

This patch can trace current xtime and wall_to_monotonic. Then user can
use ctime() to convert them to wall time which is easier to be understood,
especially for flight-recorder which need to get trace event from a kernel
dump file.

Example of ftrace output:
          <idle>-0     [000] 20118.489849: gtod_update_xtime: xtime=1243265589.999999824 wall_to_monotonic=3051713268.744158739
           <...>-4020  [001] 20118.489855: sys_open(filename: bf9e66e0, flags: 98800, mode: bf9e66e0)
           <...>-4020  [001] 20118.489873: sys_open -> 0xffffffec

ctime(1243265590) = date:Mon May 25 11:33:10 2009
So we can realize the task with pid 4020 open a file at
Mon May 25 11:33:10 2009

Changelog:
v1->v2: Rebased by Zhao Lei <zhaolei@cn.fujitsu.com>

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/trace/events/xtime.h |   38 ++++++++++++++++++++++++++++++++++++++
 kernel/time/ntp.c            |    4 ++++
 kernel/time/timekeeping.c    |    6 ++++++
 3 files changed, 48 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/xtime.h

diff --git a/include/trace/events/xtime.h b/include/trace/events/xtime.h
new file mode 100644
index 0000000..398e679
--- /dev/null
+++ b/include/trace/events/xtime.h
@@ -0,0 +1,38 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM xtime
+
+#if !defined(_TRACE_XTIME_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_XTIME_H
+
+#include <linux/time.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(gtod_update_xtime,
+
+	TP_PROTO(struct timespec *xtime, struct timespec *wall_to_monotonic),
+
+	TP_ARGS(xtime, wall_to_monotonic),
+
+	TP_STRUCT__entry(
+		__field(	long,	xtime_sec		)
+		__field(	long,	xtime_nsec		)
+		__field(	long,	wall_to_monotonic_sec	)
+		__field(	long,	wall_to_monotonic_nsec	)
+	),
+
+	TP_fast_assign(
+		__entry->xtime_sec		= xtime->tv_sec;
+		__entry->xtime_nsec		= xtime->tv_nsec;
+		__entry->wall_to_monotonic_sec	= wall_to_monotonic->tv_sec;
+		__entry->wall_to_monotonic_nsec	= wall_to_monotonic->tv_nsec;
+	),
+
+	TP_printk("xtime=%ld.%09ld wall_to_monotonic=%ld.%09ld",
+		  __entry->xtime_sec, __entry->xtime_nsec,
+		  __entry->wall_to_monotonic_sec, __entry->wall_to_monotonic_nsec)
+);
+
+#endif /* _TRACE_XTIME_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 4800f93..fc2f13a 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -15,6 +15,8 @@
 #include <linux/time.h>
 #include <linux/mm.h>
 
+#include <trace/events/xtime.h>
+
 /*
  * NTP timekeeping variables:
  */
@@ -218,6 +220,8 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
 		break;
 	}
 
+	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
+
 	write_sequnlock(&xtime_lock);
 
 	return res;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index fb0f46f..42b220f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -20,6 +20,9 @@
 #include <linux/tick.h>
 #include <linux/stop_machine.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/xtime.h>
+
 /* Structure holding internal timekeeping values. */
 struct timekeeper {
 	/* Current clocksource used for timekeeping. */
@@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
 
 	update_vsyscall(&xtime, timekeeper.clock);
 
+	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
+
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
 	/* signal hrtimers about time change */
@@ -811,6 +816,7 @@ void update_wall_time(void)
 
 	/* check to see if there is a new clocksource to use */
 	update_vsyscall(&xtime, timekeeper.clock);
+	trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
 }
 
 /**
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace
  2009-09-16  5:25                 ` [PATCH v2 0/2] " Zhaolei
  2009-09-16  5:27                   ` [PATCH v2 1/2] ftrace: add tracepoint for xtime Zhaolei
@ 2009-09-16  5:29                   ` Zhaolei
  2009-09-16  5:59                     ` Frederic Weisbecker
  1 sibling, 1 reply; 46+ messages in thread
From: Zhaolei @ 2009-09-16  5:29 UTC (permalink / raw)
  To: Steven Rostedt, Thomas Gleixner, Frederic Weisbecker,
	KOSAKI Motohiro, Ingo Molnar
  Cc: LKML

We can add a timer-source for ftrace to support wall-time display.
It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
http://lkml.org/lkml/2009/7/24/103

It need to applys on my patch of:
[PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use

We can get following output:
 # cd /mnt/debugfs/tracing/
 # cat trace_clock
 [local] global walltime
 # echo sched_switch > current_tracer
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
             bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
         events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
 ...
 # echo walltime > trace_clock
 # cat trace
 # tracer: sched_switch
 #
 #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
 #              | |       |                        |         |
             bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
             bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
         events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
 ...

Changelog:
v1->v2: Rebase to top of tip tree.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/linux/trace_clock.h |    1 +
 kernel/trace/trace.c        |   24 ++++++++++++++++-----
 kernel/trace/trace.h        |    8 ++++++-
 kernel/trace/trace_clock.c  |   48 +++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_output.c |   21 +++++++++++++++---
 5 files changed, 91 insertions(+), 11 deletions(-)

diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
index 7a81303..322707e 100644
--- a/include/linux/trace_clock.h
+++ b/include/linux/trace_clock.h
@@ -15,5 +15,6 @@
 extern u64 notrace trace_clock_local(void);
 extern u64 notrace trace_clock(void);
 extern u64 notrace trace_clock_global(void);
+extern u64 notrace trace_clock_walltime(void);
 
 #endif /* _LINUX_TRACE_CLOCK_H */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index fd52a19..160b6a8 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -335,9 +335,10 @@ static struct {
 } trace_clocks[] = {
 	{ trace_clock_local,	"local" },
 	{ trace_clock_global,	"global" },
+	{ trace_clock_walltime,	"walltime" },
 };
 
-int trace_clock_id;
+enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
 
 /*
  * trace_parser_get_init - gets the buffer for trace parser
@@ -1653,8 +1654,17 @@ static void print_lat_help_header(struct seq_file *m)
 
 static void print_func_help_header(struct seq_file *m)
 {
-	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
-	seq_puts(m, "#              | |       |          |         |\n");
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"          TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"                     |         |\n");
+	} else {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"   TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"       |         |\n");
+	}
 }
 
 
@@ -3387,8 +3397,8 @@ static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
 	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
 		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
 			"%s%s%s%s", i ? " " : "",
-			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
-			i == trace_clock_id ? "]" : "");
+			i == trace_clock_type ? "[" : "", trace_clocks[i].name,
+			i == trace_clock_type ? "]" : "");
 	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
 
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
@@ -3418,7 +3428,7 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 	if (i == ARRAY_SIZE(trace_clocks))
 		return -EINVAL;
 
-	trace_clock_id = i;
+	trace_clock_type = i;
 
 	mutex_lock(&trace_types_lock);
 
@@ -3428,6 +3438,8 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 
 	mutex_unlock(&trace_types_lock);
 
+	tracing_reset_online_cpus(&global_trace);
+
 	*fpos += cnt;
 
 	return cnt;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index e747162..afee4de 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -475,7 +475,13 @@ int trace_array_printk(struct trace_array *tr,
 
 extern unsigned long trace_flags;
 
-extern int trace_clock_id;
+enum TRACE_CLOCK_TYPE {
+	TRACE_CLOCK_LOCAL,
+	TRACE_CLOCK_GLOBAL,
+	TRACE_CLOCK_WALLTIME,
+};
+
+extern enum TRACE_CLOCK_TYPE trace_clock_type;
 
 /* Standard output formatting function used for function return traces */
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index b588fd8..2b23023 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -19,6 +19,7 @@
 #include <linux/sched.h>
 #include <linux/ktime.h>
 #include <linux/trace_clock.h>
+#include <trace/events/xtime.h>
 
 /*
  * trace_clock_local(): the simplest and least coherent tracing clock.
@@ -107,3 +108,50 @@ u64 notrace trace_clock_global(void)
 
 	return now;
 }
+
+/*
+ * the number of nanoseconds from 1970-1-1 00:00:00 to
+ * machine boots (trace_clock()==0)
+ */
+static u64 base_walltime;
+
+static void sync_base_walltime(struct timespec *ts, u64 traceclock)
+{
+	base_walltime = (u64)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec -
+			traceclock;
+}
+
+static void probe_gtod_update_xtime(struct timespec *xtime,
+				    struct timespec *wall_to_monotonic)
+{
+	unsigned long flags;
+	u64 traceclock;
+
+	local_irq_save(flags);
+	traceclock = trace_clock();
+	sync_base_walltime(xtime, traceclock);
+	local_irq_restore(flags);
+}
+
+static int __init walltime_init(void)
+{
+	unsigned long flags;
+	struct timespec ts;
+	u64 traceclock;
+
+	local_irq_save(flags);
+	getnstimeofday(&ts);
+	traceclock = trace_clock();
+	sync_base_walltime(&ts, traceclock);
+	local_irq_restore(flags);
+
+	register_trace_gtod_update_xtime(probe_gtod_update_xtime);
+
+	return 0;
+}
+device_initcall_sync(walltime_init);
+
+u64 notrace trace_clock_walltime(void)
+{
+	return base_walltime + trace_clock();
+}
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index f572f44..2116c5f 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -527,15 +527,28 @@ int trace_print_context(struct trace_iterator *iter)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent;
-	unsigned long long t = ns2usecs(iter->ts);
-	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
-	unsigned long secs = (unsigned long)t;
 	char comm[TASK_COMM_LEN];
 
 	trace_find_cmdline(entry->pid, comm);
 
-	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		struct tm tm;
+		u32 remainder;
+		time_to_tm(div_u64_rem(iter->ts, NSEC_PER_SEC, &remainder),
+				-sys_tz.tz_minuteswest * 60, &tm);
+		return trace_seq_printf(s, "%16s-%-5d [%03d] "
+				"%04ld-%02d-%02d %02d:%02d:%02d.%06llu: ",
+				comm, entry->pid, iter->cpu,
+				tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+				tm.tm_hour, tm.tm_min, tm.tm_sec,
+				ns2usecs(remainder));
+	} else {
+		unsigned long long t = ns2usecs(iter->ts);
+		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
+		unsigned long secs = (unsigned long)t;
+		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
 				comm, entry->pid, iter->cpu, secs, usec_rem);
+	}
 }
 
 int trace_print_lat_context(struct trace_iterator *iter)
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace
  2009-09-16  5:29                   ` [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace Zhaolei
@ 2009-09-16  5:59                     ` Frederic Weisbecker
  2009-09-16  6:40                       ` Zhaolei
  0 siblings, 1 reply; 46+ messages in thread
From: Frederic Weisbecker @ 2009-09-16  5:59 UTC (permalink / raw)
  To: Zhaolei
  Cc: Steven Rostedt, Thomas Gleixner, KOSAKI Motohiro, Ingo Molnar, LKML

On Wed, Sep 16, 2009 at 01:29:19PM +0800, Zhaolei wrote:
> We can add a timer-source for ftrace to support wall-time display.
> It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
> http://lkml.org/lkml/2009/7/24/103
> 
> It need to applys on my patch of:
> [PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
> 
> We can get following output:
>  # cd /mnt/debugfs/tracing/
>  # cat trace_clock
>  [local] global walltime
>  # echo sched_switch > current_tracer
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
>  #              | |       |          |         |
>              bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
>              bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
>          events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
>  ...
>  # echo walltime > trace_clock
>  # cat trace
>  # tracer: sched_switch
>  #
>  #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
>  #              | |       |                        |         |




The headers seem a bit broken. Is it a copy paste effect or?



>              bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
>          events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
>  ...


But nice result :)


> 
> Changelog:
> v1->v2: Rebase to top of tip tree.
> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> ---
>  include/linux/trace_clock.h |    1 +
>  kernel/trace/trace.c        |   24 ++++++++++++++++-----
>  kernel/trace/trace.h        |    8 ++++++-
>  kernel/trace/trace_clock.c  |   48 +++++++++++++++++++++++++++++++++++++++++++
>  kernel/trace/trace_output.c |   21 +++++++++++++++---
>  5 files changed, 91 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
> index 7a81303..322707e 100644
> --- a/include/linux/trace_clock.h
> +++ b/include/linux/trace_clock.h
> @@ -15,5 +15,6 @@
>  extern u64 notrace trace_clock_local(void);
>  extern u64 notrace trace_clock(void);
>  extern u64 notrace trace_clock_global(void);
> +extern u64 notrace trace_clock_walltime(void);
>  
>  #endif /* _LINUX_TRACE_CLOCK_H */
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index fd52a19..160b6a8 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -335,9 +335,10 @@ static struct {
>  } trace_clocks[] = {
>  	{ trace_clock_local,	"local" },
>  	{ trace_clock_global,	"global" },
> +	{ trace_clock_walltime,	"walltime" },
>  };
>  
> -int trace_clock_id;
> +enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;



Uppercase names are usually reserved for CPP macros.



>  
>  /*
>   * trace_parser_get_init - gets the buffer for trace parser
> @@ -1653,8 +1654,17 @@ static void print_lat_help_header(struct seq_file *m)
>  
>  static void print_func_help_header(struct seq_file *m)
>  {
> -	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
> -	seq_puts(m, "#              | |       |          |         |\n");
> +	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
> +		seq_puts(m, "#           TASK-PID    CPU# "
> +			"          TIMESTAMP  FUNCTION\n");
> +		seq_puts(m, "#              | |       |   "
> +			"                     |         |\n");
> +	} else {
> +		seq_puts(m, "#           TASK-PID    CPU# "
> +			"   TIMESTAMP  FUNCTION\n");
> +		seq_puts(m, "#              | |       |   "
> +			"       |         |\n");
> +	}
>  }
>  
>  
> @@ -3387,8 +3397,8 @@ static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
>  	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
>  		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
>  			"%s%s%s%s", i ? " " : "",
> -			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
> -			i == trace_clock_id ? "]" : "");
> +			i == trace_clock_type ? "[" : "", trace_clocks[i].name,
> +			i == trace_clock_type ? "]" : "");
>  	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
>  
>  	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
> @@ -3418,7 +3428,7 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
>  	if (i == ARRAY_SIZE(trace_clocks))
>  		return -EINVAL;
>  
> -	trace_clock_id = i;
> +	trace_clock_type = i;
>  
>  	mutex_lock(&trace_types_lock);
>  
> @@ -3428,6 +3438,8 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
>  
>  	mutex_unlock(&trace_types_lock);
>  
> +	tracing_reset_online_cpus(&global_trace);
> +
>  	*fpos += cnt;
>  
>  	return cnt;
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index e747162..afee4de 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -475,7 +475,13 @@ int trace_array_printk(struct trace_array *tr,
>  
>  extern unsigned long trace_flags;
>  
> -extern int trace_clock_id;
> +enum TRACE_CLOCK_TYPE {
> +	TRACE_CLOCK_LOCAL,
> +	TRACE_CLOCK_GLOBAL,
> +	TRACE_CLOCK_WALLTIME,
> +};
> +
> +extern enum TRACE_CLOCK_TYPE trace_clock_type;
>  
>  /* Standard output formatting function used for function return traces */
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
> index b588fd8..2b23023 100644
> --- a/kernel/trace/trace_clock.c
> +++ b/kernel/trace/trace_clock.c
> @@ -19,6 +19,7 @@
>  #include <linux/sched.h>
>  #include <linux/ktime.h>
>  #include <linux/trace_clock.h>
> +#include <trace/events/xtime.h>
>  
>  /*
>   * trace_clock_local(): the simplest and least coherent tracing clock.
> @@ -107,3 +108,50 @@ u64 notrace trace_clock_global(void)
>  
>  	return now;
>  }
> +
> +/*
> + * the number of nanoseconds from 1970-1-1 00:00:00 to
> + * machine boots (trace_clock()==0)
> + */
> +static u64 base_walltime;
> +
> +static void sync_base_walltime(struct timespec *ts, u64 traceclock)
> +{
> +	base_walltime = (u64)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec -
> +			traceclock;
> +}
> +
> +static void probe_gtod_update_xtime(struct timespec *xtime,
> +				    struct timespec *wall_to_monotonic)
> +{
> +	unsigned long flags;
> +	u64 traceclock;
> +
> +	local_irq_save(flags);
> +	traceclock = trace_clock();
> +	sync_base_walltime(xtime, traceclock);
> +	local_irq_restore(flags);
> +}



I don't know how much we want a high level of precision but
if we are in the local clock, traceclock may have a small
imprecision basis across cpus.

Well, I guess that's not a big deal though.



> +
> +static int __init walltime_init(void)
> +{
> +	unsigned long flags;
> +	struct timespec ts;
> +	u64 traceclock;
> +
> +	local_irq_save(flags);
> +	getnstimeofday(&ts);
> +	traceclock = trace_clock();
> +	sync_base_walltime(&ts, traceclock);
> +	local_irq_restore(flags);
> +
> +	register_trace_gtod_update_xtime(probe_gtod_update_xtime);
> +
> +	return 0;
> +}
> +device_initcall_sync(walltime_init);
> +
> +u64 notrace trace_clock_walltime(void)
> +{
> +	return base_walltime + trace_clock();
> +}
> diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
> index f572f44..2116c5f 100644
> --- a/kernel/trace/trace_output.c
> +++ b/kernel/trace/trace_output.c
> @@ -527,15 +527,28 @@ int trace_print_context(struct trace_iterator *iter)
>  {
>  	struct trace_seq *s = &iter->seq;
>  	struct trace_entry *entry = iter->ent;
> -	unsigned long long t = ns2usecs(iter->ts);
> -	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
> -	unsigned long secs = (unsigned long)t;
>  	char comm[TASK_COMM_LEN];
>  
>  	trace_find_cmdline(entry->pid, comm);
>  
> -	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
> +	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
> +		struct tm tm;
> +		u32 remainder;
> +		time_to_tm(div_u64_rem(iter->ts, NSEC_PER_SEC, &remainder),
> +				-sys_tz.tz_minuteswest * 60, &tm);
> +		return trace_seq_printf(s, "%16s-%-5d [%03d] "
> +				"%04ld-%02d-%02d %02d:%02d:%02d.%06llu: ",
> +				comm, entry->pid, iter->cpu,
> +				tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
> +				tm.tm_hour, tm.tm_min, tm.tm_sec,
> +				ns2usecs(remainder));
> +	} else {
> +		unsigned long long t = ns2usecs(iter->ts);
> +		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
> +		unsigned long secs = (unsigned long)t;
> +		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
>  				comm, entry->pid, iter->cpu, secs, usec_rem);
> +	}
>  }
>  
>  int trace_print_lat_context(struct trace_iterator *iter)
> -- 
> 1.5.5.3
> 
> 


Other than that, that looks good!

Thanks.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace
  2009-09-16  5:59                     ` Frederic Weisbecker
@ 2009-09-16  6:40                       ` Zhaolei
  2009-09-16 19:37                         ` Steven Rostedt
  0 siblings, 1 reply; 46+ messages in thread
From: Zhaolei @ 2009-09-16  6:40 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Steven Rostedt, Thomas Gleixner, KOSAKI Motohiro, Ingo Molnar, LKML

Frederic Weisbecker wrote:
> On Wed, Sep 16, 2009 at 01:29:19PM +0800, Zhaolei wrote:
>> We can add a timer-source for ftrace to support wall-time display.
>> It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
>> http://lkml.org/lkml/2009/7/24/103
>>
>> It need to applys on my patch of:
>> [PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
>>
Hello, Frederic

Thanks for your review.

>> We can get following output:
>>  # cd /mnt/debugfs/tracing/
>>  # cat trace_clock
>>  [local] global walltime
>>  # echo sched_switch > current_tracer
>>  # cat trace
>>  # tracer: sched_switch
>>  #
>>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
>>  #              | |       |          |         |
>>              bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
>>              bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
>>              bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
>>              bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
>>          events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
>>  ...
>>  # echo walltime > trace_clock
>>  # cat trace
>>  # tracer: sched_switch
>>  #
>>  #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
>>  #              | |       |                        |         |
> 
> 
> 
> 
> The headers seem a bit broken. Is it a copy paste effect or?

No, it is a format bug of patch.
I'll fix it.

> 
> 
> 
>>              bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
>>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
>>          events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
>>  ...
> 
> 
> But nice result :)
> 
> 
>> Changelog:
>> v1->v2: Rebase to top of tip tree.
>>
>> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
>> ---
>>  include/linux/trace_clock.h |    1 +
>>  kernel/trace/trace.c        |   24 ++++++++++++++++-----
>>  kernel/trace/trace.h        |    8 ++++++-
>>  kernel/trace/trace_clock.c  |   48 +++++++++++++++++++++++++++++++++++++++++++
>>  kernel/trace/trace_output.c |   21 +++++++++++++++---
>>  5 files changed, 91 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
>> index 7a81303..322707e 100644
>> --- a/include/linux/trace_clock.h
>> +++ b/include/linux/trace_clock.h
>> @@ -15,5 +15,6 @@
>>  extern u64 notrace trace_clock_local(void);
>>  extern u64 notrace trace_clock(void);
>>  extern u64 notrace trace_clock_global(void);
>> +extern u64 notrace trace_clock_walltime(void);
>>  
>>  #endif /* _LINUX_TRACE_CLOCK_H */
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index fd52a19..160b6a8 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -335,9 +335,10 @@ static struct {
>>  } trace_clocks[] = {
>>  	{ trace_clock_local,	"local" },
>>  	{ trace_clock_global,	"global" },
>> +	{ trace_clock_walltime,	"walltime" },
>>  };
>>  
>> -int trace_clock_id;
>> +enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
> 
> 
> 
> Uppercase names are usually reserved for CPP macros.

I'll rename it.

Thanks
Zhaolei


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 1/2] ftrace: add tracepoint for xtime
  2009-09-16  5:27                   ` [PATCH v2 1/2] ftrace: add tracepoint for xtime Zhaolei
@ 2009-09-16 19:33                     ` Steven Rostedt
  0 siblings, 0 replies; 46+ messages in thread
From: Steven Rostedt @ 2009-09-16 19:33 UTC (permalink / raw)
  To: Zhaolei
  Cc: Thomas Gleixner, Frederic Weisbecker, KOSAKI Motohiro,
	Ingo Molnar, LKML, john stultz

On Wed, 2009-09-16 at 13:27 +0800, Zhaolei wrote:
> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> 
> This patch can trace current xtime and wall_to_monotonic. Then user can
> use ctime() to convert them to wall time which is easier to be understood,
> especially for flight-recorder which need to get trace event from a kernel
> dump file.
> 
> Example of ftrace output:
>           <idle>-0     [000] 20118.489849: gtod_update_xtime: xtime=1243265589.999999824 wall_to_monotonic=3051713268.744158739
>            <...>-4020  [001] 20118.489855: sys_open(filename: bf9e66e0, flags: 98800, mode: bf9e66e0)
>            <...>-4020  [001] 20118.489873: sys_open -> 0xffffffec
> 
> ctime(1243265590) = date:Mon May 25 11:33:10 2009
> So we can realize the task with pid 4020 open a file at
> Mon May 25 11:33:10 2009
> 
> Changelog:
> v1->v2: Rebased by Zhao Lei <zhaolei@cn.fujitsu.com>
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> ---
>  include/trace/events/xtime.h |   38 ++++++++++++++++++++++++++++++++++++++
>  kernel/time/ntp.c            |    4 ++++
>  kernel/time/timekeeping.c    |    6 ++++++

This patch will need an ACK from Thomas Gleixner and/or John Stultz.

-- Steve



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace
  2009-09-16  6:40                       ` Zhaolei
@ 2009-09-16 19:37                         ` Steven Rostedt
  2009-09-17  7:10                           ` Zhaolei
  0 siblings, 1 reply; 46+ messages in thread
From: Steven Rostedt @ 2009-09-16 19:37 UTC (permalink / raw)
  To: Zhaolei
  Cc: Frederic Weisbecker, Thomas Gleixner, KOSAKI Motohiro, Ingo Molnar, LKML

On Wed, 2009-09-16 at 14:40 +0800, Zhaolei wrote:
> Frederic Weisbecker wrote:
> > On Wed, Sep 16, 2009 at 01:29:19PM +0800, Zhaolei wrote:
> >> We can add a timer-source for ftrace to support wall-time display.
> >> It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
> >> http://lkml.org/lkml/2009/7/24/103
> >>
> >> It need to applys on my patch of:
> >> [PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
> >>
> Hello, Frederic
> 
> Thanks for your review.
> 
> >> We can get following output:
> >>  # cd /mnt/debugfs/tracing/
> >>  # cat trace_clock
> >>  [local] global walltime
> >>  # echo sched_switch > current_tracer
> >>  # cat trace
> >>  # tracer: sched_switch
> >>  #
> >>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> >>  #              | |       |          |         |
> >>              bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
> >>              bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
> >>              bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
> >>              bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
> >>          events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
> >>  ...
> >>  # echo walltime > trace_clock
> >>  # cat trace
> >>  # tracer: sched_switch
> >>  #
> >>  #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
> >>  #              | |       |                        |         |
> > 
> > 
> > 
> > 
> > The headers seem a bit broken. Is it a copy paste effect or?
> 
> No, it is a format bug of patch.
> I'll fix it.
> 
> > 
> > 
> > 
> >>              bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
> >>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
> >>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
> >>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
> >>          events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
> >>  ...
> > 
> > 
> > But nice result :)
> > 
> > 
> >> Changelog:
> >> v1->v2: Rebase to top of tip tree.
> >>
> >> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> >> ---
> >>  include/linux/trace_clock.h |    1 +
> >>  kernel/trace/trace.c        |   24 ++++++++++++++++-----
> >>  kernel/trace/trace.h        |    8 ++++++-
> >>  kernel/trace/trace_clock.c  |   48 +++++++++++++++++++++++++++++++++++++++++++
> >>  kernel/trace/trace_output.c |   21 +++++++++++++++---
> >>  5 files changed, 91 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
> >> index 7a81303..322707e 100644
> >> --- a/include/linux/trace_clock.h
> >> +++ b/include/linux/trace_clock.h
> >> @@ -15,5 +15,6 @@
> >>  extern u64 notrace trace_clock_local(void);
> >>  extern u64 notrace trace_clock(void);
> >>  extern u64 notrace trace_clock_global(void);
> >> +extern u64 notrace trace_clock_walltime(void);
> >>  
> >>  #endif /* _LINUX_TRACE_CLOCK_H */
> >> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> >> index fd52a19..160b6a8 100644
> >> --- a/kernel/trace/trace.c
> >> +++ b/kernel/trace/trace.c
> >> @@ -335,9 +335,10 @@ static struct {
> >>  } trace_clocks[] = {
> >>  	{ trace_clock_local,	"local" },
> >>  	{ trace_clock_global,	"global" },
> >> +	{ trace_clock_walltime,	"walltime" },
> >>  };
> >>  
> >> -int trace_clock_id;
> >> +enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
> > 
> > 
> > 
> > Uppercase names are usually reserved for CPP macros.
> 
> I'll rename it.

After the fixes that Frederic suggested, and the required timekeeping
ACKS, I'll queue up your next version for v2.6.33. Looking forward to
your new patches.


Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-08-25  8:14                 ` [PATCH 2/3] ftrace: add tracepoint for xtime Zhaolei
  2009-08-26  2:39                   ` Steven Rostedt
@ 2009-09-16 19:56                   ` john stultz
  2009-09-16 19:58                     ` john stultz
  1 sibling, 1 reply; 46+ messages in thread
From: john stultz @ 2009-09-16 19:56 UTC (permalink / raw)
  To: Zhaolei
  Cc: Steven Rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML

On Tue, Aug 25, 2009 at 1:14 AM, Zhaolei <zhaolei@cn.fujitsu.com> wrote:
> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
>
> This patch can trace current xtime and wall_to_monotonic. Then user can
> use ctime() to convert them to wall time which is easier to be understood,
> especially for flight-recorder which need to get trace event from a kernel
> dump file.
>
> Example of ftrace output:
>          <idle>-0     [000] 20118.489849: gtod_update_xtime: xtime=1243265589.999999824 wall_to_monotonic=3051713268.744158739
>           <...>-4020  [001] 20118.489855: sys_open(filename: bf9e66e0, flags: 98800, mode: bf9e66e0)
>           <...>-4020  [001] 20118.489873: sys_open -> 0xffffffec
>
> ctime(1243265590) = date:Mon May 25 11:33:10 2009
> So we can realize the task with pid 4020 open a file at
> Mon May 25 11:33:10 2009
>
> Changelog:
> v1->v2: Rebased by Zhao Lei <zhaolei@cn.fujitsu.com>
>
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> ---
>  include/trace/events/xtime.h |   38 ++++++++++++++++++++++++++++++++++++++
>  kernel/time/ntp.c            |    4 ++++
>  kernel/time/timekeeping.c    |    6 ++++++
>  3 files changed, 48 insertions(+), 0 deletions(-)
>  create mode 100644 include/trace/events/xtime.h
>
> diff --git a/include/trace/events/xtime.h b/include/trace/events/xtime.h
> new file mode 100644
> index 0000000..398e679
> --- /dev/null
> +++ b/include/trace/events/xtime.h
> @@ -0,0 +1,38 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM xtime
> +
> +#if !defined(_TRACE_XTIME_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_XTIME_H
> +
> +#include <linux/time.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(gtod_update_xtime,
> +
> +       TP_PROTO(struct timespec *xtime, struct timespec *wall_to_monotonic),
> +
> +       TP_ARGS(xtime, wall_to_monotonic),
> +
> +       TP_STRUCT__entry(
> +               __field(        long,   xtime_sec               )
> +               __field(        long,   xtime_nsec              )
> +               __field(        long,   wall_to_monotonic_sec   )
> +               __field(        long,   wall_to_monotonic_nsec  )
> +       ),
> +
> +       TP_fast_assign(
> +               __entry->xtime_sec              = xtime->tv_sec;
> +               __entry->xtime_nsec             = xtime->tv_nsec;
> +               __entry->wall_to_monotonic_sec  = wall_to_monotonic->tv_sec;
> +               __entry->wall_to_monotonic_nsec = wall_to_monotonic->tv_nsec;
> +       ),
> +
> +       TP_printk("xtime=%ld.%09ld wall_to_monotonic=%ld.%09ld",
> +                 __entry->xtime_sec, __entry->xtime_nsec,
> +                 __entry->wall_to_monotonic_sec, __entry->wall_to_monotonic_nsec)
> +);
> +
> +#endif /* _TRACE_XTIME_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
> index 4800f93..fc2f13a 100644
> --- a/kernel/time/ntp.c
> +++ b/kernel/time/ntp.c
> @@ -15,6 +15,8 @@
>  #include <linux/time.h>
>  #include <linux/mm.h>
>
> +#include <trace/events/xtime.h>
> +
>  /*
>  * NTP timekeeping variables:
>  */
> @@ -218,6 +220,8 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
>                break;
>        }
>
> +       trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
> +
>        write_sequnlock(&xtime_lock);
>
>        return res;
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 03cbeb3..2e57a87 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -20,6 +20,9 @@
>  #include <linux/tick.h>
>  #include <linux/stop_machine.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/xtime.h>
> +
>  /* Structure holding internal timekeeping values. */
>  struct timekeeper {
>        /* Current clocksource used for timekeeping. */
> @@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
>
>        update_vsyscall(&xtime, timekeeper.clock);
>
> +       trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
> +

So the only thing to watch out on here is that xtime doesn't hold the
current time, but the accumulated time. There is some unaccumulated
time still kept in the clocksource structure.

You probably want (assuming you only need tick granularity time) to
use current_kernel_time().

As an aside, is there a reason you have to have update callbacks and
duplicate the time storage instead of using the existing interfaces?
(ie: Is this due to locking or something else?)

thanks
-john

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-09-16 19:56                   ` john stultz
@ 2009-09-16 19:58                     ` john stultz
  2009-09-16 20:32                       ` Steven Rostedt
  0 siblings, 1 reply; 46+ messages in thread
From: john stultz @ 2009-09-16 19:58 UTC (permalink / raw)
  To: Zhaolei
  Cc: Steven Rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML

On Wed, Sep 16, 2009 at 12:56 PM, john stultz <johnstul@us.ibm.com> wrote:
> On Tue, Aug 25, 2009 at 1:14 AM, Zhaolei <zhaolei@cn.fujitsu.com> wrote:
>> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
>>  /* Structure holding internal timekeeping values. */
>>  struct timekeeper {
>>        /* Current clocksource used for timekeeping. */
>> @@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
>>
>>        update_vsyscall(&xtime, timekeeper.clock);
>>
>> +       trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
>> +
>
> So the only thing to watch out on here is that xtime doesn't hold the
> current time, but the accumulated time. There is some unaccumulated
> time still kept in the clocksource structure.
>
> You probably want (assuming you only need tick granularity time) to
> use current_kernel_time().
>
> As an aside, is there a reason you have to have update callbacks and
> duplicate the time storage instead of using the existing interfaces?
> (ie: Is this due to locking or something else?)

Doh. Sorry, you're actually tracing the timekeeping code. Not using
this to assist tracing. Got this confused.

So yea, I think this should be ok, plus or minus determining if you
really want xtime or xtime_cache.

-john

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-09-16 19:58                     ` john stultz
@ 2009-09-16 20:32                       ` Steven Rostedt
  2009-09-16 20:49                         ` john stultz
  0 siblings, 1 reply; 46+ messages in thread
From: Steven Rostedt @ 2009-09-16 20:32 UTC (permalink / raw)
  To: john stultz
  Cc: Zhaolei, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML

On Wed, 2009-09-16 at 12:58 -0700, john stultz wrote:
> On Wed, Sep 16, 2009 at 12:56 PM, john stultz <johnstul@us.ibm.com> wrote:
> > On Tue, Aug 25, 2009 at 1:14 AM, Zhaolei <zhaolei@cn.fujitsu.com> wrote:
> >> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> >>  /* Structure holding internal timekeeping values. */
> >>  struct timekeeper {
> >>        /* Current clocksource used for timekeeping. */
> >> @@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
> >>
> >>        update_vsyscall(&xtime, timekeeper.clock);
> >>
> >> +       trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
> >> +
> >
> > So the only thing to watch out on here is that xtime doesn't hold the
> > current time, but the accumulated time. There is some unaccumulated
> > time still kept in the clocksource structure.
> >
> > You probably want (assuming you only need tick granularity time) to
> > use current_kernel_time().
> >
> > As an aside, is there a reason you have to have update callbacks and
> > duplicate the time storage instead of using the existing interfaces?
> > (ie: Is this due to locking or something else?)
> 
> Doh. Sorry, you're actually tracing the timekeeping code. Not using
> this to assist tracing. Got this confused.
> 
> So yea, I think this should be ok, plus or minus determining if you
> really want xtime or xtime_cache.

Well this may be a real concern. It's not about tracing timekeeping
(although it adds that functionality). His second patch (I didn't Cc you
on that one) hooks to these tracepoints to update time accordingly.

What is being done is a way to have a "wall time" value being added to
the ring buffer. But this needs to be very carefully done, because the
all tracers use this, including the function tracer in NMI code. So the
clock source can not take locks or do anything fancy.

What the idea is, is to have a semi clock that is read with some kind of
fast increment, and then at clock ticks, this clock is synced up.

I think that's the idea, anyone correct me if I'm wrong ;-)

-- Steve



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-09-16 20:32                       ` Steven Rostedt
@ 2009-09-16 20:49                         ` john stultz
  2009-09-17  6:34                           ` Zhaolei
  0 siblings, 1 reply; 46+ messages in thread
From: john stultz @ 2009-09-16 20:49 UTC (permalink / raw)
  To: rostedt; +Cc: Zhaolei, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML

On Wed, 2009-09-16 at 16:32 -0400, Steven Rostedt wrote:
> On Wed, 2009-09-16 at 12:58 -0700, john stultz wrote:
> > On Wed, Sep 16, 2009 at 12:56 PM, john stultz <johnstul@us.ibm.com> wrote:
> > > On Tue, Aug 25, 2009 at 1:14 AM, Zhaolei <zhaolei@cn.fujitsu.com> wrote:
> > >> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> > >>  /* Structure holding internal timekeeping values. */
> > >>  struct timekeeper {
> > >>        /* Current clocksource used for timekeeping. */
> > >> @@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
> > >>
> > >>        update_vsyscall(&xtime, timekeeper.clock);
> > >>
> > >> +       trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
> > >> +
> > >
> > > So the only thing to watch out on here is that xtime doesn't hold the
> > > current time, but the accumulated time. There is some unaccumulated
> > > time still kept in the clocksource structure.
> > >
> > > You probably want (assuming you only need tick granularity time) to
> > > use current_kernel_time().
> > >
> > > As an aside, is there a reason you have to have update callbacks and
> > > duplicate the time storage instead of using the existing interfaces?
> > > (ie: Is this due to locking or something else?)
> > 
> > Doh. Sorry, you're actually tracing the timekeeping code. Not using
> > this to assist tracing. Got this confused.
> > 
> > So yea, I think this should be ok, plus or minus determining if you
> > really want xtime or xtime_cache.
> 
> Well this may be a real concern. It's not about tracing timekeeping
> (although it adds that functionality). His second patch (I didn't Cc you
> on that one) hooks to these tracepoints to update time accordingly.
> 
> What is being done is a way to have a "wall time" value being added to
> the ring buffer. But this needs to be very carefully done, because the
> all tracers use this, including the function tracer in NMI code. So the
> clock source can not take locks or do anything fancy.
> 
> What the idea is, is to have a semi clock that is read with some kind of
> fast increment, and then at clock ticks, this clock is synced up.

Hmm.. Yea, if that's the case, then I'm not a big fan of this approach.

It sounds like what's really needed is a lock-free variant of
current_kernel_time() or something close to the CLOCK_MONOTONIC_COARSE
functionality currently queued.

Doing it without locks might have some downsides, and I guess that's the
point of the callback method (updates happen at prescribed times and
likely under locks that the trace code understands so it avoids races
and deadlocks).

-john



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [PATCH 2/3] ftrace: add tracepoint for xtime
  2009-09-16 20:49                         ` john stultz
@ 2009-09-17  6:34                           ` Zhaolei
  0 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-09-17  6:34 UTC (permalink / raw)
  To: john stultz
  Cc: rostedt, KOSAKI Motohiro, Frederic Weisbecker, Ingo Molnar, LKML

john stultz wrote:
> On Wed, 2009-09-16 at 16:32 -0400, Steven Rostedt wrote:
>> On Wed, 2009-09-16 at 12:58 -0700, john stultz wrote:
>>> On Wed, Sep 16, 2009 at 12:56 PM, john stultz <johnstul@us.ibm.com> wrote:
>>>> On Tue, Aug 25, 2009 at 1:14 AM, Zhaolei <zhaolei@cn.fujitsu.com> wrote:
>>>>> From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
>>>>>  /* Structure holding internal timekeeping values. */
>>>>>  struct timekeeper {
>>>>>        /* Current clocksource used for timekeeping. */
>>>>> @@ -338,6 +341,8 @@ int do_settimeofday(struct timespec *tv)
>>>>>
>>>>>        update_vsyscall(&xtime, timekeeper.clock);
>>>>>
>>>>> +       trace_gtod_update_xtime(&xtime, &wall_to_monotonic);
>>>>> +
>>>> So the only thing to watch out on here is that xtime doesn't hold the
>>>> current time, but the accumulated time. There is some unaccumulated
>>>> time still kept in the clocksource structure.
>>>>
>>>> You probably want (assuming you only need tick granularity time) to
>>>> use current_kernel_time().
>>>>
>>>> As an aside, is there a reason you have to have update callbacks and
>>>> duplicate the time storage instead of using the existing interfaces?
>>>> (ie: Is this due to locking or something else?)
>>> Doh. Sorry, you're actually tracing the timekeeping code. Not using
>>> this to assist tracing. Got this confused.
>>>
>>> So yea, I think this should be ok, plus or minus determining if you
>>> really want xtime or xtime_cache.
>> Well this may be a real concern. It's not about tracing timekeeping
>> (although it adds that functionality). His second patch (I didn't Cc you
>> on that one) hooks to these tracepoints to update time accordingly.
>>
>> What is being done is a way to have a "wall time" value being added to
>> the ring buffer. But this needs to be very carefully done, because the
>> all tracers use this, including the function tracer in NMI code. So the
>> clock source can not take locks or do anything fancy.
>>
>> What the idea is, is to have a semi clock that is read with some kind of
>> fast increment, and then at clock ticks, this clock is synced up.
> 
> Hmm.. Yea, if that's the case, then I'm not a big fan of this approach.
> 
> It sounds like what's really needed is a lock-free variant of
> current_kernel_time() or something close to the CLOCK_MONOTONIC_COARSE
> functionality currently queued.

Yes.

And it will be better if I can get current walltime fast and
with high resolution.

I see your patch of __current_kernel_time() in Aug 19,
and I think it can be used to get walltime in ftrace.
But IMHO, use tracepoint to do it maybe better for its high resolution.

Thanks
Zhaolei

> Doing it without locks might have some downsides, and I guess that's the
> point of the callback method (updates happen at prescribed times and
> likely under locks that the trace code understands so it avoids races
> and deadlocks).
> 
> -john
> 



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: Re: [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace
  2009-09-16 19:37                         ` Steven Rostedt
@ 2009-09-17  7:10                           ` Zhaolei
  2009-11-04  9:39                             ` [PATCH v3] ftrace: Add timer-source of walltime Zhaolei
                                               ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Zhaolei @ 2009-09-17  7:10 UTC (permalink / raw)
  To: rostedt
  Cc: Frederic Weisbecker, Thomas Gleixner, KOSAKI Motohiro, Ingo Molnar, LKML

Steven Rostedt wrote:
> On Wed, 2009-09-16 at 14:40 +0800, Zhaolei wrote:
>> Frederic Weisbecker wrote:
>>> On Wed, Sep 16, 2009 at 01:29:19PM +0800, Zhaolei wrote:
>>>> We can add a timer-source for ftrace to support wall-time display.
>>>> It is based on Steven Rostedt <rostedt@goodmis.org>'s suggestion in:
>>>> http://lkml.org/lkml/2009/7/24/103
>>>>
>>>> It need to applys on my patch of:
>>>> [PATCH v4 1/2] Add function to convert between calendar time and broken-down time for universal use
>>>>
>> Hello, Frederic
>>
>> Thanks for your review.
>>
>>>> We can get following output:
>>>>  # cd /mnt/debugfs/tracing/
>>>>  # cat trace_clock
>>>>  [local] global walltime
>>>>  # echo sched_switch > current_tracer
>>>>  # cat trace
>>>>  # tracer: sched_switch
>>>>  #
>>>>  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
>>>>  #              | |       |          |         |
>>>>              bash-3580  [000]  2916.188093:   3580:120:S   + [000]  3580:120:S bash
>>>>              bash-3580  [000]  2916.188205:   3580:120:S   + [000]  3580:120:S bash
>>>>              bash-3580  [000]  2916.188538:   3580:120:R   + [000]     5:115:S events/0
>>>>              bash-3580  [000]  2916.188599:   3580:120:R ==> [000]     5:115:R events/0
>>>>          events/0-5     [000]  2916.188669:      5:115:S ==> [000]  3580:120:R bash
>>>>  ...
>>>>  # echo walltime > trace_clock
>>>>  # cat trace
>>>>  # tracer: sched_switch
>>>>  #
>>>>  #           TASK-PID    CPU#           TIMESTAMP  FUNCTION
>>>>  #              | |       |                        |         |
>>>
>>>
>>>
>>> The headers seem a bit broken. Is it a copy paste effect or?
>> No, it is a format bug of patch.
>> I'll fix it.
>>
>>>
>>>
>>>>              bash-3580  [000] 2009-08-24 18:03:16.500705:   3580:120:S   + [000]  3580:120:S bash
>>>>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>>>>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S   + [000]  3580:120:S bash
>>>>              bash-3580  [000] 2009-08-24 18:03:16.500815:   3580:120:S ==> [000]     5:115:R events/0
>>>>          events/0-5     [000] 2009-08-24 18:03:16.500815:      5:115:R   + [000]  3578:120:S sshd
>>>>  ...
>>>
>>> But nice result :)
>>>
>>>
>>>> Changelog:
>>>> v1->v2: Rebase to top of tip tree.
>>>>
>>>> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
>>>> ---
>>>>  include/linux/trace_clock.h |    1 +
>>>>  kernel/trace/trace.c        |   24 ++++++++++++++++-----
>>>>  kernel/trace/trace.h        |    8 ++++++-
>>>>  kernel/trace/trace_clock.c  |   48 +++++++++++++++++++++++++++++++++++++++++++
>>>>  kernel/trace/trace_output.c |   21 +++++++++++++++---
>>>>  5 files changed, 91 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
>>>> index 7a81303..322707e 100644
>>>> --- a/include/linux/trace_clock.h
>>>> +++ b/include/linux/trace_clock.h
>>>> @@ -15,5 +15,6 @@
>>>>  extern u64 notrace trace_clock_local(void);
>>>>  extern u64 notrace trace_clock(void);
>>>>  extern u64 notrace trace_clock_global(void);
>>>> +extern u64 notrace trace_clock_walltime(void);
>>>>  
>>>>  #endif /* _LINUX_TRACE_CLOCK_H */
>>>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>>>> index fd52a19..160b6a8 100644
>>>> --- a/kernel/trace/trace.c
>>>> +++ b/kernel/trace/trace.c
>>>> @@ -335,9 +335,10 @@ static struct {
>>>>  } trace_clocks[] = {
>>>>  	{ trace_clock_local,	"local" },
>>>>  	{ trace_clock_global,	"global" },
>>>> +	{ trace_clock_walltime,	"walltime" },
>>>>  };
>>>>  
>>>> -int trace_clock_id;
>>>> +enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
>>>
>>>
>>> Uppercase names are usually reserved for CPP macros.
>> I'll rename it.
> 
> After the fixes that Frederic suggested, and the required timekeeping
> ACKS, I'll queue up your next version for v2.6.33. Looking forward to
> your new patches.

I'll send new patch after we finish xtime tracepoint's discussion.

Thanks
Zhaolei


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v3] ftrace: Add timer-source of walltime
  2009-09-17  7:10                           ` Zhaolei
@ 2009-11-04  9:39                             ` Zhaolei
  2009-11-04  9:39                             ` Zhaolei
  2009-11-04  9:41                             ` Zhaolei
  2 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-11-04  9:39 UTC (permalink / raw)
  To: rostedt, Frederic Weisbecker, Thomas Gleixner, KOSAKI Motohiro,
	Ingo Molnar
  Cc: LKML

We can add a timer-source for ftrace to support wall-time display.

We can get following output:
 # cd /mnt/debugfs/tracing/
 # cat trace_clock
 [local] global walltime
 # echo function > current_tracer
 # head trace
 # tracer: function
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             sshd-2644  [000] 15567.508188: fsnotify <-fsnotify_access
             sshd-2644  [000] 15567.508191: fput_light <-sys_read
             sshd-2644  [000] 15567.508299: sys_select <-sysenter_do_call
             sshd-2644  [000] 15567.508304: core_sys_select <-sys_select
             sshd-2644  [000] 15567.508306: get_fd_set <-core_sys_select
             sshd-2644  [000] 15567.508309: copy_from_user <-get_fd_set
 # echo walltime > trace_clock
 # head trace
 # tracer: function
 #
 #           TASK-PID    CPU#                  TIMESTAMP  FUNCTION
 #              | |       |                        |         |
             sshd-2644  [000] 2009-11-04 15:53:55.454003: __spin_unlock_irq <-run_timer_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: rcu_bh_qs <-__do_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: _local_bh_enable <-__do_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: rcu_irq_exit <-irq_exit
             sshd-2644  [000] 2009-11-04 15:53:55.454003: idle_cpu <-irq_exit
             sshd-2644  [000] 2009-11-04 15:53:55.454003: fput_light <-do_select

Changelog:
v1->v2: Rebase to top of tip tree.
v2->v3: Use lockless __current_kernel_time() instead of xtime tracepoint

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/linux/trace_clock.h |    1 +
 kernel/trace/trace.c        |   24 ++++++++++++++++++------
 kernel/trace/trace.h        |    8 +++++++-
 kernel/trace/trace_clock.c  |    7 +++++++
 kernel/trace/trace_output.c |   23 ++++++++++++++++++-----
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
index 7a81303..322707e 100644
--- a/include/linux/trace_clock.h
+++ b/include/linux/trace_clock.h
@@ -15,5 +15,6 @@
 extern u64 notrace trace_clock_local(void);
 extern u64 notrace trace_clock(void);
 extern u64 notrace trace_clock_global(void);
+extern u64 notrace trace_clock_walltime(void);
 
 #endif /* _LINUX_TRACE_CLOCK_H */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9d3067a..7885005 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -334,9 +334,10 @@ static struct {
 } trace_clocks[] = {
 	{ trace_clock_local,	"local" },
 	{ trace_clock_global,	"global" },
+	{ trace_clock_walltime,	"walltime" },
 };
 
-int trace_clock_id;
+enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
 
 /*
  * trace_parser_get_init - gets the buffer for trace parser
@@ -1642,8 +1643,17 @@ static void print_lat_help_header(struct seq_file *m)
 
 static void print_func_help_header(struct seq_file *m)
 {
-	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
-	seq_puts(m, "#              | |       |          |         |\n");
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"                 TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"                     |         |\n");
+	} else {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"   TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"       |         |\n");
+	}
 }
 
 
@@ -3374,8 +3384,8 @@ static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
 	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
 		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
 			"%s%s%s%s", i ? " " : "",
-			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
-			i == trace_clock_id ? "]" : "");
+			i == trace_clock_type ? "[" : "", trace_clocks[i].name,
+			i == trace_clock_type ? "]" : "");
 	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
 
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
@@ -3405,7 +3415,7 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 	if (i == ARRAY_SIZE(trace_clocks))
 		return -EINVAL;
 
-	trace_clock_id = i;
+	trace_clock_type = i;
 
 	mutex_lock(&trace_types_lock);
 
@@ -3415,6 +3425,8 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 
 	mutex_unlock(&trace_types_lock);
 
+	tracing_reset_online_cpus(&global_trace);
+
 	*fpos += cnt;
 
 	return cnt;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index c30556a..52be403 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -495,7 +495,13 @@ int trace_array_printk(struct trace_array *tr,
 
 extern unsigned long trace_flags;
 
-extern int trace_clock_id;
+enum TRACE_CLOCK_TYPE {
+	TRACE_CLOCK_LOCAL,
+	TRACE_CLOCK_GLOBAL,
+	TRACE_CLOCK_WALLTIME,
+};
+
+extern enum TRACE_CLOCK_TYPE trace_clock_type;
 
 /* Standard output formatting function used for function return traces */
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index 20c5f92..a93a6ca 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -111,3 +111,10 @@ u64 notrace trace_clock_global(void)
 
 	return now;
 }
+
+u64 notrace trace_clock_walltime(void)
+{
+	struct timespec timespec;
+	timespec = __current_kernel_time();
+	return (u64)timespec.tv_sec * NSEC_PER_SEC + timespec.tv_nsec;
+}
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b6c12c6..08249e3 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -532,15 +532,28 @@ int trace_print_context(struct trace_iterator *iter)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent;
-	unsigned long long t = ns2usecs(iter->ts);
-	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
-	unsigned long secs = (unsigned long)t;
 	char comm[TASK_COMM_LEN];
 
 	trace_find_cmdline(entry->pid, comm);
 
-	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
-				comm, entry->pid, iter->cpu, secs, usec_rem);
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		struct tm tm;
+		u32 remainder;
+		time_to_tm(div_u64_rem(iter->ts, NSEC_PER_SEC, &remainder),
+			   -sys_tz.tz_minuteswest * 60, &tm);
+		return trace_seq_printf(s, "%16s-%-5d [%03d] "
+			"%04ld-%02d-%02d %02d:%02d:%02d.%06llu: ",
+			comm, entry->pid, iter->cpu,
+			tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+			tm.tm_hour, tm.tm_min, tm.tm_sec,
+			ns2usecs(remainder));
+	} else {
+		unsigned long long t = ns2usecs(iter->ts);
+		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
+		unsigned long secs = (unsigned long)t;
+		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
+			comm, entry->pid, iter->cpu, secs, usec_rem);
+	}
 }
 
 int trace_print_lat_context(struct trace_iterator *iter)
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3] ftrace: Add timer-source of walltime
  2009-09-17  7:10                           ` Zhaolei
  2009-11-04  9:39                             ` [PATCH v3] ftrace: Add timer-source of walltime Zhaolei
@ 2009-11-04  9:39                             ` Zhaolei
  2009-11-04  9:41                             ` Zhaolei
  2 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-11-04  9:39 UTC (permalink / raw)
  To: rostedt, Frederic Weisbecker, Thomas Gleixner, KOSAKI Motohiro,
	Ingo Molnar
  Cc: LKML

We can add a timer-source for ftrace to support wall-time display.

We can get following output:
 # cd /mnt/debugfs/tracing/
 # cat trace_clock
 [local] global walltime
 # echo function > current_tracer
 # head trace
 # tracer: function
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             sshd-2644  [000] 15567.508188: fsnotify <-fsnotify_access
             sshd-2644  [000] 15567.508191: fput_light <-sys_read
             sshd-2644  [000] 15567.508299: sys_select <-sysenter_do_call
             sshd-2644  [000] 15567.508304: core_sys_select <-sys_select
             sshd-2644  [000] 15567.508306: get_fd_set <-core_sys_select
             sshd-2644  [000] 15567.508309: copy_from_user <-get_fd_set
 # echo walltime > trace_clock
 # head trace
 # tracer: function
 #
 #           TASK-PID    CPU#                  TIMESTAMP  FUNCTION
 #              | |       |                        |         |
             sshd-2644  [000] 2009-11-04 15:53:55.454003: __spin_unlock_irq <-run_timer_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: rcu_bh_qs <-__do_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: _local_bh_enable <-__do_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: rcu_irq_exit <-irq_exit
             sshd-2644  [000] 2009-11-04 15:53:55.454003: idle_cpu <-irq_exit
             sshd-2644  [000] 2009-11-04 15:53:55.454003: fput_light <-do_select

Changelog:
v1->v2: Rebase to top of tip tree.
v2->v3: Use lockless __current_kernel_time() instead of xtime tracepoint

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/linux/trace_clock.h |    1 +
 kernel/trace/trace.c        |   24 ++++++++++++++++++------
 kernel/trace/trace.h        |    8 +++++++-
 kernel/trace/trace_clock.c  |    7 +++++++
 kernel/trace/trace_output.c |   23 ++++++++++++++++++-----
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
index 7a81303..322707e 100644
--- a/include/linux/trace_clock.h
+++ b/include/linux/trace_clock.h
@@ -15,5 +15,6 @@
 extern u64 notrace trace_clock_local(void);
 extern u64 notrace trace_clock(void);
 extern u64 notrace trace_clock_global(void);
+extern u64 notrace trace_clock_walltime(void);
 
 #endif /* _LINUX_TRACE_CLOCK_H */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9d3067a..7885005 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -334,9 +334,10 @@ static struct {
 } trace_clocks[] = {
 	{ trace_clock_local,	"local" },
 	{ trace_clock_global,	"global" },
+	{ trace_clock_walltime,	"walltime" },
 };
 
-int trace_clock_id;
+enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
 
 /*
  * trace_parser_get_init - gets the buffer for trace parser
@@ -1642,8 +1643,17 @@ static void print_lat_help_header(struct seq_file *m)
 
 static void print_func_help_header(struct seq_file *m)
 {
-	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
-	seq_puts(m, "#              | |       |          |         |\n");
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"                 TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"                     |         |\n");
+	} else {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"   TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"       |         |\n");
+	}
 }
 
 
@@ -3374,8 +3384,8 @@ static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
 	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
 		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
 			"%s%s%s%s", i ? " " : "",
-			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
-			i == trace_clock_id ? "]" : "");
+			i == trace_clock_type ? "[" : "", trace_clocks[i].name,
+			i == trace_clock_type ? "]" : "");
 	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
 
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
@@ -3405,7 +3415,7 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 	if (i == ARRAY_SIZE(trace_clocks))
 		return -EINVAL;
 
-	trace_clock_id = i;
+	trace_clock_type = i;
 
 	mutex_lock(&trace_types_lock);
 
@@ -3415,6 +3425,8 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 
 	mutex_unlock(&trace_types_lock);
 
+	tracing_reset_online_cpus(&global_trace);
+
 	*fpos += cnt;
 
 	return cnt;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index c30556a..52be403 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -495,7 +495,13 @@ int trace_array_printk(struct trace_array *tr,
 
 extern unsigned long trace_flags;
 
-extern int trace_clock_id;
+enum TRACE_CLOCK_TYPE {
+	TRACE_CLOCK_LOCAL,
+	TRACE_CLOCK_GLOBAL,
+	TRACE_CLOCK_WALLTIME,
+};
+
+extern enum TRACE_CLOCK_TYPE trace_clock_type;
 
 /* Standard output formatting function used for function return traces */
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index 20c5f92..a93a6ca 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -111,3 +111,10 @@ u64 notrace trace_clock_global(void)
 
 	return now;
 }
+
+u64 notrace trace_clock_walltime(void)
+{
+	struct timespec timespec;
+	timespec = __current_kernel_time();
+	return (u64)timespec.tv_sec * NSEC_PER_SEC + timespec.tv_nsec;
+}
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b6c12c6..08249e3 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -532,15 +532,28 @@ int trace_print_context(struct trace_iterator *iter)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent;
-	unsigned long long t = ns2usecs(iter->ts);
-	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
-	unsigned long secs = (unsigned long)t;
 	char comm[TASK_COMM_LEN];
 
 	trace_find_cmdline(entry->pid, comm);
 
-	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
-				comm, entry->pid, iter->cpu, secs, usec_rem);
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		struct tm tm;
+		u32 remainder;
+		time_to_tm(div_u64_rem(iter->ts, NSEC_PER_SEC, &remainder),
+			   -sys_tz.tz_minuteswest * 60, &tm);
+		return trace_seq_printf(s, "%16s-%-5d [%03d] "
+			"%04ld-%02d-%02d %02d:%02d:%02d.%06llu: ",
+			comm, entry->pid, iter->cpu,
+			tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+			tm.tm_hour, tm.tm_min, tm.tm_sec,
+			ns2usecs(remainder));
+	} else {
+		unsigned long long t = ns2usecs(iter->ts);
+		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
+		unsigned long secs = (unsigned long)t;
+		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
+			comm, entry->pid, iter->cpu, secs, usec_rem);
+	}
 }
 
 int trace_print_lat_context(struct trace_iterator *iter)
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3] ftrace: Add timer-source of walltime
  2009-09-17  7:10                           ` Zhaolei
  2009-11-04  9:39                             ` [PATCH v3] ftrace: Add timer-source of walltime Zhaolei
  2009-11-04  9:39                             ` Zhaolei
@ 2009-11-04  9:41                             ` Zhaolei
  2 siblings, 0 replies; 46+ messages in thread
From: Zhaolei @ 2009-11-04  9:41 UTC (permalink / raw)
  To: rostedt, Frederic Weisbecker, Thomas Gleixner, KOSAKI Motohiro,
	Ingo Molnar
  Cc: LKML

We can add a timer-source for ftrace to support wall-time display.

We can get following output:
 # cd /mnt/debugfs/tracing/
 # cat trace_clock
 [local] global walltime
 # echo function > current_tracer
 # head trace
 # tracer: function
 #
 #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
 #              | |       |          |         |
             sshd-2644  [000] 15567.508188: fsnotify <-fsnotify_access
             sshd-2644  [000] 15567.508191: fput_light <-sys_read
             sshd-2644  [000] 15567.508299: sys_select <-sysenter_do_call
             sshd-2644  [000] 15567.508304: core_sys_select <-sys_select
             sshd-2644  [000] 15567.508306: get_fd_set <-core_sys_select
             sshd-2644  [000] 15567.508309: copy_from_user <-get_fd_set
 # echo walltime > trace_clock
 # head trace
 # tracer: function
 #
 #           TASK-PID    CPU#                  TIMESTAMP  FUNCTION
 #              | |       |                        |         |
             sshd-2644  [000] 2009-11-04 15:53:55.454003: __spin_unlock_irq <-run_timer_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: rcu_bh_qs <-__do_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: _local_bh_enable <-__do_softirq
             sshd-2644  [000] 2009-11-04 15:53:55.454003: rcu_irq_exit <-irq_exit
             sshd-2644  [000] 2009-11-04 15:53:55.454003: idle_cpu <-irq_exit
             sshd-2644  [000] 2009-11-04 15:53:55.454003: fput_light <-do_select

Changelog:
v1->v2: Rebase to top of tip tree.
v2->v3: Use lockless __current_kernel_time() instead of xtime tracepoint

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 include/linux/trace_clock.h |    1 +
 kernel/trace/trace.c        |   24 ++++++++++++++++++------
 kernel/trace/trace.h        |    8 +++++++-
 kernel/trace/trace_clock.c  |    7 +++++++
 kernel/trace/trace_output.c |   23 ++++++++++++++++++-----
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/include/linux/trace_clock.h b/include/linux/trace_clock.h
index 7a81303..322707e 100644
--- a/include/linux/trace_clock.h
+++ b/include/linux/trace_clock.h
@@ -15,5 +15,6 @@
 extern u64 notrace trace_clock_local(void);
 extern u64 notrace trace_clock(void);
 extern u64 notrace trace_clock_global(void);
+extern u64 notrace trace_clock_walltime(void);
 
 #endif /* _LINUX_TRACE_CLOCK_H */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9d3067a..7885005 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -334,9 +334,10 @@ static struct {
 } trace_clocks[] = {
 	{ trace_clock_local,	"local" },
 	{ trace_clock_global,	"global" },
+	{ trace_clock_walltime,	"walltime" },
 };
 
-int trace_clock_id;
+enum TRACE_CLOCK_TYPE trace_clock_type = TRACE_CLOCK_LOCAL;
 
 /*
  * trace_parser_get_init - gets the buffer for trace parser
@@ -1642,8 +1643,17 @@ static void print_lat_help_header(struct seq_file *m)
 
 static void print_func_help_header(struct seq_file *m)
 {
-	seq_puts(m, "#           TASK-PID    CPU#    TIMESTAMP  FUNCTION\n");
-	seq_puts(m, "#              | |       |          |         |\n");
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"                 TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"                     |         |\n");
+	} else {
+		seq_puts(m, "#           TASK-PID    CPU# "
+			"   TIMESTAMP  FUNCTION\n");
+		seq_puts(m, "#              | |       |   "
+			"       |         |\n");
+	}
 }
 
 
@@ -3374,8 +3384,8 @@ static ssize_t tracing_clock_read(struct file *filp, char __user *ubuf,
 	for (i = 0; i < ARRAY_SIZE(trace_clocks); i++)
 		bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter,
 			"%s%s%s%s", i ? " " : "",
-			i == trace_clock_id ? "[" : "", trace_clocks[i].name,
-			i == trace_clock_id ? "]" : "");
+			i == trace_clock_type ? "[" : "", trace_clocks[i].name,
+			i == trace_clock_type ? "]" : "");
 	bufiter += snprintf(buf + bufiter, sizeof(buf) - bufiter, "\n");
 
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, bufiter);
@@ -3405,7 +3415,7 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 	if (i == ARRAY_SIZE(trace_clocks))
 		return -EINVAL;
 
-	trace_clock_id = i;
+	trace_clock_type = i;
 
 	mutex_lock(&trace_types_lock);
 
@@ -3415,6 +3425,8 @@ static ssize_t tracing_clock_write(struct file *filp, const char __user *ubuf,
 
 	mutex_unlock(&trace_types_lock);
 
+	tracing_reset_online_cpus(&global_trace);
+
 	*fpos += cnt;
 
 	return cnt;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index c30556a..52be403 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -495,7 +495,13 @@ int trace_array_printk(struct trace_array *tr,
 
 extern unsigned long trace_flags;
 
-extern int trace_clock_id;
+enum TRACE_CLOCK_TYPE {
+	TRACE_CLOCK_LOCAL,
+	TRACE_CLOCK_GLOBAL,
+	TRACE_CLOCK_WALLTIME,
+};
+
+extern enum TRACE_CLOCK_TYPE trace_clock_type;
 
 /* Standard output formatting function used for function return traces */
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index 20c5f92..a93a6ca 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -111,3 +111,10 @@ u64 notrace trace_clock_global(void)
 
 	return now;
 }
+
+u64 notrace trace_clock_walltime(void)
+{
+	struct timespec timespec;
+	timespec = __current_kernel_time();
+	return (u64)timespec.tv_sec * NSEC_PER_SEC + timespec.tv_nsec;
+}
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index b6c12c6..08249e3 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -532,15 +532,28 @@ int trace_print_context(struct trace_iterator *iter)
 {
 	struct trace_seq *s = &iter->seq;
 	struct trace_entry *entry = iter->ent;
-	unsigned long long t = ns2usecs(iter->ts);
-	unsigned long usec_rem = do_div(t, USEC_PER_SEC);
-	unsigned long secs = (unsigned long)t;
 	char comm[TASK_COMM_LEN];
 
 	trace_find_cmdline(entry->pid, comm);
 
-	return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
-				comm, entry->pid, iter->cpu, secs, usec_rem);
+	if (trace_clock_type == TRACE_CLOCK_WALLTIME) {
+		struct tm tm;
+		u32 remainder;
+		time_to_tm(div_u64_rem(iter->ts, NSEC_PER_SEC, &remainder),
+			   -sys_tz.tz_minuteswest * 60, &tm);
+		return trace_seq_printf(s, "%16s-%-5d [%03d] "
+			"%04ld-%02d-%02d %02d:%02d:%02d.%06llu: ",
+			comm, entry->pid, iter->cpu,
+			tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+			tm.tm_hour, tm.tm_min, tm.tm_sec,
+			ns2usecs(remainder));
+	} else {
+		unsigned long long t = ns2usecs(iter->ts);
+		unsigned long usec_rem = do_div(t, USEC_PER_SEC);
+		unsigned long secs = (unsigned long)t;
+		return trace_seq_printf(s, "%16s-%-5d [%03d] %5lu.%06lu: ",
+			comm, entry->pid, iter->cpu, secs, usec_rem);
+	}
 }
 
 int trace_print_lat_context(struct trace_iterator *iter)
-- 
1.5.5.3



^ permalink raw reply related	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2009-11-04  9:38 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-07-24 10:40 [PATCH 0/3] Add walltime support for ring-buffer Zhaolei
2009-07-24 10:42 ` [PATCH] " Zhaolei
2009-07-24 10:47   ` Zhaolei
2009-07-24 10:43 ` [RFC PATCH 1/3] " Zhaolei
2009-07-24 10:43 ` [RFC PATCH 2/3] Apply walltime-supporting functions to trace system Zhaolei
2009-07-24 10:44 ` [RFC PATCH 3/3] Make ftrace display walltime in output Zhaolei
2009-07-24 13:05 ` [PATCH 0/3] Add walltime support for ring-buffer Steven Rostedt
2009-07-28  1:43   ` KOSAKI Motohiro
2009-07-28  1:53     ` Frederic Weisbecker
2009-07-28  2:19       ` Steven Rostedt
2009-08-17  9:22         ` [RFC PATCH] Add timer-source of walltime for ftrace Zhaolei
2009-08-17 16:49           ` Frederic Weisbecker
2009-08-18  2:09             ` Zhaolei
2009-08-18 18:52               ` Steven Rostedt
2009-08-18 15:57           ` KOSAKI Motohiro
2009-08-18 18:58             ` Steven Rostedt
2009-08-19  9:16               ` Zhaolei
2009-08-25  8:12               ` [PATCH 0/3] ftrace: " Zhaolei
2009-08-25  8:12                 ` [PATCH 1/3] ftrace: Move setting of clock-source out of options Zhaolei
2009-08-26  2:35                   ` Steven Rostedt
2009-08-26  7:23                   ` [tip:tracing/core] " tip-bot for Zhaolei
2009-08-25  8:14                 ` [PATCH 2/3] ftrace: add tracepoint for xtime Zhaolei
2009-08-26  2:39                   ` Steven Rostedt
2009-09-01  8:03                     ` Zhaolei
2009-09-16 19:56                   ` john stultz
2009-09-16 19:58                     ` john stultz
2009-09-16 20:32                       ` Steven Rostedt
2009-09-16 20:49                         ` john stultz
2009-09-17  6:34                           ` Zhaolei
2009-08-25  8:15                 ` [PATCH 3/3] ftrace: Add timer-source of walltime for ftrace Zhaolei
2009-08-26  2:52                   ` Steven Rostedt
2009-09-16  5:25                 ` [PATCH v2 0/2] " Zhaolei
2009-09-16  5:27                   ` [PATCH v2 1/2] ftrace: add tracepoint for xtime Zhaolei
2009-09-16 19:33                     ` Steven Rostedt
2009-09-16  5:29                   ` [PATCH v2 2/2] ftrace: Add timer-source of walltime for ftrace Zhaolei
2009-09-16  5:59                     ` Frederic Weisbecker
2009-09-16  6:40                       ` Zhaolei
2009-09-16 19:37                         ` Steven Rostedt
2009-09-17  7:10                           ` Zhaolei
2009-11-04  9:39                             ` [PATCH v3] ftrace: Add timer-source of walltime Zhaolei
2009-11-04  9:39                             ` Zhaolei
2009-11-04  9:41                             ` Zhaolei
2009-07-28  2:23       ` [PATCH 0/3] Add walltime support for ring-buffer KOSAKI Motohiro
2009-08-03  7:22         ` Ingo Molnar
2009-08-03  9:32           ` KOSAKI Motohiro
2009-08-04 14:38             ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.