intel-gfx.lists.freedesktop.org archive mirror
* [PATCH 0/6] additional intel_gpu_top profiling features
@ 2011-09-05 20:19 Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 1/6] intel_gpu_top: account for time spent in syscalls Eugeni Dodonov
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 20:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This patchset adds a number of small but useful features to the
intel_gpu_top tool, as requested by Chris Wilson:

 - support for getopt to customize internal parameters at runtime (such as
   the number of samples per second).
 - non-interactive execution, to collect GPU usage into a log file for
   later analysis.
 - the ability to profile a specific command, exiting when the profiled
   command finishes.
 - collection of initial statistics before profiling starts.
 - accounting for the time spent within syscalls to provide more accurate
   logging and more precise screen updates (previously, instead of
   refreshing once a second, we could take considerably longer than that
   due to system call overhead).

Eugeni Dodonov (6):
  intel_gpu_top: account for time spent in syscalls
  intel_gpu_top: suport command line parameters and variable samples
    per second
  intel_gpu_tool: initial support for non-screen output
  intel_gpu_top: initialize monitoring statistics at startup
  intel_gpu_top: support non-interactive mode
  intel_gpu_top: support profiling user-specified commands

 man/intel_gpu_top.1   |   22 ++++
 tools/intel_gpu_top.c |  303 +++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 280 insertions(+), 45 deletions(-)

-- 
1.7.6.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/6] intel_gpu_top: account for time spent in syscalls
  2011-09-05 20:19 [PATCH 0/6] additional intel_gpu_top profiling features Eugeni Dodonov
@ 2011-09-05 20:19 ` Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 2/6] intel_gpu_top: suport command line parameters and variable samples per second Eugeni Dodonov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 20:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This allows intel_gpu_top to properly account for time spent inside system
calls. With the previous implementation, intel_gpu_top could spend longer
than 1s between consecutive measurements. This patch attempts to minimize
the extra time spent while polling for data.
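
As a rough illustration of the idea (a standalone sketch, not the code in
the diff below): each sample is timed, and the sleep that follows is
shortened by however long the sample took. The names now_usec(),
remaining_sleep() and sample_for_one_second() are made up for this
example, and the sample() callback stands in for the register reads.

```c
#include <stdint.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Current wall-clock time in microseconds, held in 64 bits so that the
 * multiplication cannot overflow on 32-bit platforms.
 */
static uint64_t now_usec(void)
{
	struct timeval t;

	gettimeofday(&t, NULL);
	return (uint64_t)t.tv_sec * 1000000 + (uint64_t)t.tv_usec;
}

/*
 * How long to sleep after one sample: the nominal period minus the time
 * the sample itself took. Returns 0 when the sample overran its slot.
 */
static uint64_t remaining_sleep(uint64_t period_usec, uint64_t start_usec,
				uint64_t end_usec)
{
	uint64_t spent = end_usec - start_usec;

	return spent >= period_usec ? 0 : period_usec - spent;
}

/*
 * Take up to samples_per_sec samples (must be non-zero), giving up once a
 * full second has elapsed. Returns the number of samples actually taken,
 * which is the value the percentages should be scaled by.
 */
unsigned int sample_for_one_second(unsigned int samples_per_sec,
				   void (*sample)(void))
{
	uint64_t period = 1000000 / samples_per_sec;
	uint64_t begin = now_usec();
	unsigned int i;

	for (i = 0; i < samples_per_sec; i++) {
		uint64_t ti = now_usec(), tf, pause;

		sample();
		tf = now_usec();
		if (tf - begin >= 1000000)
			return i + 1;	/* out of time for this round */
		pause = remaining_sleep(period, ti, tf);
		if (pause > 0)
			usleep(pause);
	}
	return samples_per_sec;
}
```

Returning the number of samples actually taken matters because a round
that is cut short would otherwise be scaled as if it were complete.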

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 tools/intel_gpu_top.c |   53 +++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index e9fbf43..64ce828 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -1,5 +1,6 @@
 /*
  * Copyright © 2007 Intel Corporation
+ * Copyright © 2011 Intel Corporation
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -22,6 +23,7 @@
  *
  * Authors:
  *    Eric Anholt <eric@anholt.net>
+ *    Eugeni Dodonov <eugeni.dodonov@intel.com>
  *
  */
 
@@ -30,6 +32,7 @@
 #include <stdio.h>
 #include <err.h>
 #include <sys/ioctl.h>
+#include <sys/time.h>
 #include "intel_gpu_tools.h"
 #include "instdone.h"
 
@@ -104,6 +107,14 @@ const char *stats_reg_names[STATS_COUNT] = {
 uint64_t stats[STATS_COUNT];
 uint64_t last_stats[STATS_COUNT];
 
+static unsigned long
+gettime(void)
+{
+    struct timeval t;
+    gettimeofday(&t, NULL);
+    return (t.tv_usec + (t.tv_sec * 1000000));
+}
+
 static int
 top_bits_sort(const void *a, const void *b)
 {
@@ -362,21 +373,23 @@ static void ring_sample(struct ring *ring)
 	ring->full += full;
 }
 
-static void ring_print(struct ring *ring)
+static void ring_print(struct ring *ring, unsigned long samples_per_sec)
 {
-	int percent, len;
+	int samples_to_percent_ratio, percent, len;
 
 	if (!ring->size)
 		return;
 
-	percent = 100 - ring->idle / SAMPLES_TO_PERCENT_RATIO;
+	/* Calculate current value of samples_to_percent_ratio */
+	samples_to_percent_ratio = (ring->idle * 100) / samples_per_sec;
+	percent = 100 - samples_to_percent_ratio;
 	len = printf("%25s busy: %3d%%: ", ring->name, percent);
 	print_percentage_bar (percent, len);
 	printf("%24s space: %d/%d (%d%%)\n",
 	       ring->name,
-	       (int)(ring->full / SAMPLES_PER_SEC),
+	       (int)(ring->full / samples_per_sec),
 	       ring->size,
-	       (int)((ring->full / SAMPLES_TO_PERCENT_RATIO) / ring->size));
+	       (int)((ring->full / samples_to_percent_ratio) / ring->size));
 }
 
 int main(int argc, char **argv)
@@ -418,18 +431,25 @@ int main(int argc, char **argv)
 
 	for (;;) {
 		int j;
+		unsigned long long t1, ti, tf;
+		unsigned long long def_sleep = 1000000 / SAMPLES_PER_SEC;
+		unsigned long long last_samples_per_sec = SAMPLES_PER_SEC;
 		char clear_screen[] = {0x1b, '[', 'H',
 				       0x1b, '[', 'J',
 				       0x0};
 		int percent;
 		int len;
 
+		t1 = gettime();
+
 		ring_reset(&render_ring);
 		ring_reset(&bsd_ring);
 		ring_reset(&bsd6_ring);
 		ring_reset(&blt_ring);
 
 		for (i = 0; i < SAMPLES_PER_SEC; i++) {
+			long long interval;
+			ti = gettime();
 			if (IS_965(devid)) {
 				instdone = INREG(INST_DONE_I965);
 				instdone1 = INREG(INST_DONE_1);
@@ -443,7 +463,16 @@ int main(int argc, char **argv)
 			ring_sample(&bsd_ring);
 			ring_sample(&bsd6_ring);
 			ring_sample(&blt_ring);
-			usleep(1000000 / SAMPLES_PER_SEC);
+
+			tf = gettime();
+			if (tf - t1 >= 1000000) {
+				/* We are out of sync, bail out */
+				last_samples_per_sec = i+1;
+				break;
+			}
+			interval = def_sleep - (tf - ti);
+			if (interval > 0)
+				usleep(interval);
 		}
 
 		if (HAS_STATS_REGS(devid)) {
@@ -477,16 +506,16 @@ int main(int argc, char **argv)
 
 		print_clock_info(pci_dev);
 
-		ring_print(&render_ring);
-		ring_print(&bsd_ring);
-		ring_print(&bsd6_ring);
-		ring_print(&blt_ring);
+		ring_print(&render_ring, last_samples_per_sec);
+		ring_print(&bsd_ring, last_samples_per_sec);
+		ring_print(&bsd6_ring, last_samples_per_sec);
+		ring_print(&blt_ring, last_samples_per_sec);
 
 		printf("\n%30s  %s\n", "task", "percent busy");
 		for (i = 0; i < max_lines; i++) {
 			if (top_bits_sorted[i]->count > 0) {
-				percent = top_bits_sorted[i]->count /
-					SAMPLES_TO_PERCENT_RATIO;
+				percent = (top_bits_sorted[i]->count * 100) /
+					last_samples_per_sec;
 				len = printf("%30s: %3d%%: ",
 					     top_bits_sorted[i]->bit->name,
 					     percent);
-- 
1.7.6.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/6] intel_gpu_top: suport command line parameters and variable samples per second
  2011-09-05 20:19 [PATCH 0/6] additional intel_gpu_top profiling features Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 1/6] intel_gpu_top: account for time spent in syscalls Eugeni Dodonov
@ 2011-09-05 20:19 ` Eugeni Dodonov
  2011-09-05 21:44   ` Chris Wilson
  2011-09-05 20:19 ` [PATCH 3/6] intel_gpu_tool: initial support for non-screen output Eugeni Dodonov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 20:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This patch adds getopt support with two initial options: -h to show usage
notes, and -s to let the user set the number of samples to acquire per
second.

The man page is updated accordingly.
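
For reference, a stricter way to validate the -s argument could look like
the sketch below. It uses strtol() rather than the atoi() call in the
diff, so that input such as "200x" is rejected instead of being silently
read as 200. parse_samples() and MIN_SAMPLES_PER_SEC are names invented
for this example; only the lower bound of 100 comes from the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define MIN_SAMPLES_PER_SEC 100	/* lower bound enforced by this patch */

/*
 * Parse the argument of -s. Returns 0 and stores the value in *out on
 * success, or -1 if the text is not a whole number in range.
 */
int parse_samples(const char *text, int *out)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(text, &end, 10);
	/* Reject overflow, empty input and trailing garbage. */
	if (errno != 0 || end == text || *end != '\0')
		return -1;
	if (value < MIN_SAMPLES_PER_SEC || value > INT_MAX)
		return -1;
	*out = (int)value;
	return 0;
}
```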

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 man/intel_gpu_top.1   |    9 ++++++++
 tools/intel_gpu_top.c |   52 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/man/intel_gpu_top.1 b/man/intel_gpu_top.1
index 79c9c0e..2cbbec9 100644
--- a/man/intel_gpu_top.1
+++ b/man/intel_gpu_top.1
@@ -4,11 +4,20 @@
 .SH NAME
 intel_gpu_top \- Display a top-like summary of Intel GPU usage
 .SH SYNOPSIS
+.nf
 .B intel_gpu_top
+.B intel_gpu_top [ parameters ]
 .SH DESCRIPTION
 .B intel_gpu_top
 is a tool to display usage information of an Intel GPU.  It requires root
 privilege to map the graphics device.
+.SS Options
+.TP
+.B -s [samples per second]
+number of samples to acquire per second
+.TP
+.B -h
+show usage notes
 .PP
 Note that idle units are not
 displayed, so an entirely idle GPU will only display the ring status and
diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index 64ce828..abe9d4b 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -392,6 +392,23 @@ static void ring_print(struct ring *ring, unsigned long samples_per_sec)
 	       (int)((ring->full / samples_to_percent_ratio) / ring->size));
 }
 
+static void
+usage(const char *appname)
+{
+	printf("intel_gpu_top - Display a top-like summary of Intel GPU usage\n"
+			"\n"
+			"usage: %s [parameters]\n"
+			"\n"
+			"The following parameters apply:\n"
+			"[-s <samples>]       samples per seconds (default %d)\n"
+			"[-h]                 show this help screen\n"
+			"\n",
+			appname,
+			SAMPLES_PER_SEC
+		  );
+	return;
+}
+
 int main(int argc, char **argv)
 {
 	struct pci_device *pci_dev;
@@ -408,7 +425,34 @@ int main(int argc, char **argv)
 		.name = "blitter",
 		.mmio = 0x22030,
 	};
-	int i;
+	int i, ch;
+	int samples_per_sec = SAMPLES_PER_SEC;
+
+	/* Parse options? */
+	while ((ch = getopt(argc, argv, "s:h")) != -1)
+	{
+		switch (ch)
+		{
+			case 's': samples_per_sec = atoi(optarg);
+					  if (samples_per_sec < 100) {
+						  fprintf(stderr, "Error: samples per second must be >= 100\n");
+						  exit(1);
+					  }
+					  break;
+			case 'h':
+				  usage(argv[0]);
+				  exit(0);
+				  break;
+			default:
+				  fprintf(stderr, "Invalid flag %c!\n", (char)optopt);
+				  usage(argv[0]);
+				  exit(1);
+				  break;
+		}
+
+	}
+	argc -= optind;
+	argv += optind;
 
 	pci_dev = intel_get_pci_device();
 	devid = pci_dev->device_id;
@@ -432,8 +476,8 @@ int main(int argc, char **argv)
 	for (;;) {
 		int j;
 		unsigned long long t1, ti, tf;
-		unsigned long long def_sleep = 1000000 / SAMPLES_PER_SEC;
-		unsigned long long last_samples_per_sec = SAMPLES_PER_SEC;
+		unsigned long long def_sleep = 1000000 / samples_per_sec;
+		unsigned long long last_samples_per_sec = samples_per_sec;
 		char clear_screen[] = {0x1b, '[', 'H',
 				       0x1b, '[', 'J',
 				       0x0};
@@ -447,7 +491,7 @@ int main(int argc, char **argv)
 		ring_reset(&bsd6_ring);
 		ring_reset(&blt_ring);
 
-		for (i = 0; i < SAMPLES_PER_SEC; i++) {
+		for (i = 0; i < samples_per_sec; i++) {
 			long long interval;
 			ti = gettime();
 			if (IS_965(devid)) {
-- 
1.7.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/6] intel_gpu_tool: initial support for non-screen output
  2011-09-05 20:19 [PATCH 0/6] additional intel_gpu_top profiling features Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 1/6] intel_gpu_top: account for time spent in syscalls Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 2/6] intel_gpu_top: suport command line parameters and variable samples per second Eugeni Dodonov
@ 2011-09-05 20:19 ` Eugeni Dodonov
  2011-09-05 22:56   ` Łukasz Kuryło
  2011-09-05 20:19 ` [PATCH 4/6] intel_gpu_top: initialize monitoring statistics at startup Eugeni Dodonov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 20:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This patch adds initial support for writing output to a stream other than
stdout, to be used for non-interactive monitoring.

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 tools/intel_gpu_top.c |   28 +++++++++++++++-------------
 1 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index abe9d4b..edb4a82 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -373,7 +373,8 @@ static void ring_sample(struct ring *ring)
 	ring->full += full;
 }
 
-static void ring_print(struct ring *ring, unsigned long samples_per_sec)
+static void ring_print(struct ring *ring, unsigned long samples_per_sec,
+		FILE *output)
 {
 	int samples_to_percent_ratio, percent, len;
 
@@ -383,9 +384,9 @@ static void ring_print(struct ring *ring, unsigned long samples_per_sec)
 	/* Calculate current value of samples_to_percent_ratio */
 	samples_to_percent_ratio = (ring->idle * 100) / samples_per_sec;
 	percent = 100 - samples_to_percent_ratio;
-	len = printf("%25s busy: %3d%%: ", ring->name, percent);
+	len = fprintf(output, "%25s busy: %3d%%: ", ring->name, percent);
 	print_percentage_bar (percent, len);
-	printf("%24s space: %d/%d (%d%%)\n",
+	fprintf(output, "%24s space: %d/%d (%d%%)\n",
 	       ring->name,
 	       (int)(ring->full / samples_per_sec),
 	       ring->size,
@@ -427,6 +428,7 @@ int main(int argc, char **argv)
 	};
 	int i, ch;
 	int samples_per_sec = SAMPLES_PER_SEC;
+	FILE *output = stdout;
 
 	/* Parse options? */
 	while ((ch = getopt(argc, argv, "s:h")) != -1)
@@ -546,30 +548,30 @@ int main(int argc, char **argv)
 		if (max_lines >= num_instdone_bits)
 			max_lines = num_instdone_bits;
 
-		printf("%s", clear_screen);
+		fprintf(output, "%s", clear_screen);
 
 		print_clock_info(pci_dev);
 
-		ring_print(&render_ring, last_samples_per_sec);
-		ring_print(&bsd_ring, last_samples_per_sec);
-		ring_print(&bsd6_ring, last_samples_per_sec);
-		ring_print(&blt_ring, last_samples_per_sec);
+		ring_print(&render_ring, last_samples_per_sec, output);
+		ring_print(&bsd_ring, last_samples_per_sec, output);
+		ring_print(&bsd6_ring, last_samples_per_sec, output);
+		ring_print(&blt_ring, last_samples_per_sec, output);
 
-		printf("\n%30s  %s\n", "task", "percent busy");
+		fprintf(output, "\n%30s  %s\n", "task", "percent busy");
 		for (i = 0; i < max_lines; i++) {
 			if (top_bits_sorted[i]->count > 0) {
 				percent = (top_bits_sorted[i]->count * 100) /
 					last_samples_per_sec;
-				len = printf("%30s: %3d%%: ",
+				len = fprintf(output, "%30s: %3d%%: ",
 					     top_bits_sorted[i]->bit->name,
 					     percent);
 				print_percentage_bar (percent, len);
 			} else {
-				printf("%*s", PERCENTAGE_BAR_END, "");
+				fprintf(output, "%*s", PERCENTAGE_BAR_END, "");
 			}
 
 			if (i < STATS_COUNT && HAS_STATS_REGS(devid)) {
-				printf("%13s: %llu (%lld/sec)",
+				fprintf(output, "%13s: %llu (%lld/sec)",
 				       stats_reg_names[i],
 				       stats[i],
 				       stats[i] - last_stats[i]);
@@ -578,7 +580,7 @@ int main(int argc, char **argv)
 				if (!top_bits_sorted[i]->count)
 					break;
 			}
-			printf("\n");
+			fprintf(output, "\n");
 		}
 
 		for (i = 0; i < num_instdone_bits; i++) {
-- 
1.7.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/6] intel_gpu_top: initialize monitoring statistics at startup
  2011-09-05 20:19 [PATCH 0/6] additional intel_gpu_top profiling features Eugeni Dodonov
                   ` (2 preceding siblings ...)
  2011-09-05 20:19 ` [PATCH 3/6] intel_gpu_tool: initial support for non-screen output Eugeni Dodonov
@ 2011-09-05 20:19 ` Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 5/6] intel_gpu_top: support non-interactive mode Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 6/6] intel_gpu_top: support profiling user-specified commands Eugeni Dodonov
  5 siblings, 0 replies; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 20:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This patch initializes last_stats[] from the statistics registers before
monitoring starts. This way, the first measurement already reports the
difference from the previous value instead of from an uninitialized one.
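
The read itself follows the usual high/low/high pattern for a 64-bit
counter exposed as two 32-bit registers: if the high word changes while
the low word is being read, the read is retried. A self-contained sketch
is shown here. read_counter64() and the read_reg_fn callback are
hypothetical stand-ins for the INREG() accesses in the diff, and
fake_read() only simulates a counter that ticks on every access.

```c
#include <stdint.h>

/* Hypothetical register accessor; the tool itself uses INREG(). */
typedef uint32_t (*read_reg_fn)(uint32_t offset);

/*
 * Read a 64-bit counter whose low word is at reg and high word at reg + 4.
 * The high word is read before and after the low word; a mismatch means
 * the low word wrapped in between, so the read is repeated.
 */
uint64_t read_counter64(read_reg_fn read_reg, uint32_t reg)
{
	uint32_t high, low, high_again;

	do {
		high = read_reg(reg + 4);
		low = read_reg(reg);
		high_again = read_reg(reg + 4);
	} while (high != high_again);

	return ((uint64_t)high << 32) | low;
}

/* Simulated hardware: a counter that advances by fake_step on each read. */
uint64_t fake_counter;
uint64_t fake_step;

uint32_t fake_read(uint32_t offset)
{
	uint64_t value = fake_counter;

	fake_counter += fake_step;
	return offset == 4 ? (uint32_t)(value >> 32) : (uint32_t)value;
}
```

With the simulated counter starting at 0xFFFFFFFF and a step of 1, the
first pass sees the high word change from 0 to 1 and is discarded; without
the retry the function would have combined a stale high word with a
wrapped low word.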

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 tools/intel_gpu_top.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index edb4a82..e2dd173 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -475,6 +475,22 @@ int main(int argc, char **argv)
 		ring_init(&blt_ring);
 	}
 
+    /* Initialize GPU stats */
+    if (HAS_STATS_REGS(devid)) {
+        for (i = 0; i < STATS_COUNT; i++) {
+            uint32_t stats_high, stats_low, stats_high_2;
+
+            do {
+                stats_high = INREG(stats_regs[i] + 4);
+                stats_low = INREG(stats_regs[i]);
+                stats_high_2 = INREG(stats_regs[i] + 4);
+            } while (stats_high != stats_high_2);
+
+            last_stats[i] = (uint64_t)stats_high << 32 |
+                stats_low;
+        }
+    }
+
 	for (;;) {
 		int j;
 		unsigned long long t1, ti, tf;
-- 
1.7.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/6] intel_gpu_top: support non-interactive mode
  2011-09-05 20:19 [PATCH 0/6] additional intel_gpu_top profiling features Eugeni Dodonov
                   ` (3 preceding siblings ...)
  2011-09-05 20:19 ` [PATCH 4/6] intel_gpu_top: initialize monitoring statistics at startup Eugeni Dodonov
@ 2011-09-05 20:19 ` Eugeni Dodonov
  2011-09-05 20:19 ` [PATCH 6/6] intel_gpu_top: support profiling user-specified commands Eugeni Dodonov
  5 siblings, 0 replies; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 20:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This patch adds support for non-interactive mode, invoked with the
'-o output' switch. In this mode nothing is drawn on the screen; the
execution statistics are saved to the output file instead.

The output file is written in a format readable by both humans and gnuplot.

Unlike interactive mode, where unsupported pipes and inactive registers are
skipped, the contents of such pipes and registers are recorded in the log
file to simplify parsing and keep the list of columns fixed.

Also unlike interactive mode, the registers are not sorted by usage, so
their variation over time can be analysed offline.
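
To make the fixed-column idea concrete, here is a simplified sketch of how
one log row could be formatted: a timestamp followed by one busy
percentage per ring, with -1 written for rings the device does not have.
It is an illustration only and does not reproduce the exact columns the
diff emits; struct ring_stat, busy_percent() and format_row() are names
made up for this example.

```c
#include <stdio.h>

struct ring_stat {
	int present;		/* 0 if this ring does not exist on the device */
	unsigned long idle;	/* samples in which the ring was idle */
	unsigned long samples;	/* samples taken in this interval */
};

/* Busy percentage for one interval: 100 minus the idle share. */
int busy_percent(const struct ring_stat *r)
{
	if (r->samples == 0)
		return 0;
	return 100 - (int)(r->idle * 100 / r->samples);
}

/*
 * Format "time<TAB>ring0<TAB>ring1...<NEWLINE>" into buf. Rings that are
 * not present still get a column, holding -1, so that every row has the
 * same number of fields. Returns the row length, or -1 if buf is too small.
 */
int format_row(char *buf, size_t len, double elapsed,
	       const struct ring_stat *rings, int count)
{
	int used = snprintf(buf, len, "%.2f", elapsed);
	int i;

	for (i = 0; i < count && used >= 0 && (size_t)used < len; i++) {
		int value = rings[i].present ? busy_percent(&rings[i]) : -1;
		int n = snprintf(buf + used, len - used, "\t%d", value);

		if (n < 0)
			return -1;
		used += n;
	}
	/* Need room for the newline and the terminating NUL. */
	if (used < 0 || (size_t)used + 1 >= len)
		return -1;
	buf[used++] = '\n';
	buf[used] = '\0';
	return used;
}
```

Keeping a placeholder column for absent rings is what lets a plotting
script refer to columns by position regardless of the hardware the log
came from.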

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 man/intel_gpu_top.1   |    3 +
 tools/intel_gpu_top.c |  150 +++++++++++++++++++++++++++++++++++-------------
 2 files changed, 112 insertions(+), 41 deletions(-)

diff --git a/man/intel_gpu_top.1 b/man/intel_gpu_top.1
index 2cbbec9..bca83f0 100644
--- a/man/intel_gpu_top.1
+++ b/man/intel_gpu_top.1
@@ -16,6 +16,9 @@ privilege to map the graphics device.
 .B -s [samples per second]
 number of samples to acquire per second
 .TP
+.B -o [output file]
+run non-interactively and collect usage statistics to [file]
+.TP
 .B -h
 show usage notes
 .PP
diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index e2dd173..88f7157 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -373,24 +373,39 @@ static void ring_sample(struct ring *ring)
 	ring->full += full;
 }
 
+static void ring_print_header(FILE *out, struct ring *ring)
+{
+    fprintf(out, "%.6s%%\tops\t",
+            ring->name
+          );
+}
+
 static void ring_print(struct ring *ring, unsigned long samples_per_sec,
 		FILE *output)
 {
 	int samples_to_percent_ratio, percent, len;
 
-	if (!ring->size)
-		return;
-
 	/* Calculate current value of samples_to_percent_ratio */
 	samples_to_percent_ratio = (ring->idle * 100) / samples_per_sec;
 	percent = 100 - samples_to_percent_ratio;
-	len = fprintf(output, "%25s busy: %3d%%: ", ring->name, percent);
-	print_percentage_bar (percent, len);
-	fprintf(output, "%24s space: %d/%d (%d%%)\n",
-	       ring->name,
-	       (int)(ring->full / samples_per_sec),
-	       ring->size,
-	       (int)((ring->full / samples_to_percent_ratio) / ring->size));
+
+	if (output == stdout) {
+		if (!ring->size)
+			return;
+
+		len = fprintf(output, "%25s busy: %3d%%: ", ring->name, percent);
+		print_percentage_bar (percent, len);
+		fprintf(output, "%24s space: %d/%d (%d%%)\n",
+			   ring->name,
+			   (int)(ring->full / samples_per_sec),
+			   ring->size,
+			   (int)((ring->full / samples_to_percent_ratio) / ring->size));
+	} else {
+		fprintf(output, "%3d\t%d\t",
+			   (ring->size) ? 100 - ring->idle / samples_to_percent_ratio : -1,
+			   (ring->size) ? (int)(ring->full / samples_per_sec) : -1
+			   );
+	}
 }
 
 static void
@@ -402,6 +417,7 @@ usage(const char *appname)
 			"\n"
 			"The following parameters apply:\n"
 			"[-s <samples>]       samples per seconds (default %d)\n"
+            "[-o <file>]          output to file (default to stdio)\n"
 			"[-h]                 show this help screen\n"
 			"\n",
 			appname,
@@ -429,9 +445,11 @@ int main(int argc, char **argv)
 	int i, ch;
 	int samples_per_sec = SAMPLES_PER_SEC;
 	FILE *output = stdout;
+	double elapsed_time=0;
+	int print_headers=1;
 
 	/* Parse options? */
-	while ((ch = getopt(argc, argv, "s:h")) != -1)
+	while ((ch = getopt(argc, argv, "s:o:h")) != -1)
 	{
 		switch (ch)
 		{
@@ -441,6 +459,13 @@ int main(int argc, char **argv)
 						  exit(1);
 					  }
 					  break;
+			case 'o': output = fopen(optarg, "w");
+					  if (!output)
+					  {
+						  perror("fopen");
+						  exit(1);
+					  }
+					  break;
 			case 'h':
 				  usage(argv[0]);
 				  exit(0);
@@ -493,7 +518,7 @@ int main(int argc, char **argv)
 
 	for (;;) {
 		int j;
-		unsigned long long t1, ti, tf;
+		unsigned long long t1, ti, tf, t2;
 		unsigned long long def_sleep = 1000000 / samples_per_sec;
 		unsigned long long last_samples_per_sec = samples_per_sec;
 		char clear_screen[] = {0x1b, '[', 'H',
@@ -564,39 +589,82 @@ int main(int argc, char **argv)
 		if (max_lines >= num_instdone_bits)
 			max_lines = num_instdone_bits;
 
-		fprintf(output, "%s", clear_screen);
-
-		print_clock_info(pci_dev);
-
-		ring_print(&render_ring, last_samples_per_sec, output);
-		ring_print(&bsd_ring, last_samples_per_sec, output);
-		ring_print(&bsd6_ring, last_samples_per_sec, output);
-		ring_print(&blt_ring, last_samples_per_sec, output);
-
-		fprintf(output, "\n%30s  %s\n", "task", "percent busy");
-		for (i = 0; i < max_lines; i++) {
-			if (top_bits_sorted[i]->count > 0) {
-				percent = (top_bits_sorted[i]->count * 100) /
-					last_samples_per_sec;
-				len = fprintf(output, "%30s: %3d%%: ",
-					     top_bits_sorted[i]->bit->name,
-					     percent);
-				print_percentage_bar (percent, len);
-			} else {
-				fprintf(output, "%*s", PERCENTAGE_BAR_END, "");
+        t2 = gettime();
+        elapsed_time += (t2 - t1) / 1000000.0;
+
+		if (output == stdout) {
+			fprintf(output, "%s", clear_screen);
+			print_clock_info(pci_dev);
+
+			ring_print(&render_ring, last_samples_per_sec, output);
+			ring_print(&bsd_ring, last_samples_per_sec, output);
+			ring_print(&bsd6_ring, last_samples_per_sec, output);
+			ring_print(&blt_ring, last_samples_per_sec, output);
+
+			fprintf(output, "\n%30s  %s\n", "task", "percent busy");
+			for (i = 0; i < max_lines; i++) {
+				if (top_bits_sorted[i]->count > 0) {
+					percent = (top_bits_sorted[i]->count * 100) /
+						last_samples_per_sec;
+					len = fprintf(output, "%30s: %3d%%: ",
+							 top_bits_sorted[i]->bit->name,
+							 percent);
+					print_percentage_bar (percent, len);
+				} else {
+					fprintf(output, "%*s", PERCENTAGE_BAR_END, "");
+				}
+
+				if (i < STATS_COUNT && HAS_STATS_REGS(devid)) {
+					fprintf(output, "%13s: %llu (%lld/sec)",
+						   stats_reg_names[i],
+						   stats[i],
+						   stats[i] - last_stats[i]);
+					last_stats[i] = stats[i];
+				} else {
+					if (!top_bits_sorted[i]->count)
+						break;
+				}
+				fprintf(output, "\n");
+			}
+		} else {
+			/* Print headers for columns at first run */
+			if (print_headers) {
+				fprintf(output, "# time\t");
+				ring_print_header(output, &render_ring);
+				ring_print_header(output, &bsd_ring);
+				ring_print_header(output, &bsd6_ring);
+				ring_print_header(output, &blt_ring);
+				for (i = 0; i < MAX_NUM_TOP_BITS; i++) {
+					if (i < STATS_COUNT && HAS_STATS_REGS(devid)) {
+						fprintf(output, "%.6s\t",
+							   stats_reg_names[i]
+							   );
+					}
+					if (!top_bits[i].count)
+						continue;
+				}
+				fprintf(output, "\n");
+				print_headers = 0;
 			}
 
-			if (i < STATS_COUNT && HAS_STATS_REGS(devid)) {
-				fprintf(output, "%13s: %llu (%lld/sec)",
-				       stats_reg_names[i],
-				       stats[i],
-				       stats[i] - last_stats[i]);
-				last_stats[i] = stats[i];
-			} else {
-				if (!top_bits_sorted[i]->count)
-					break;
+			/* Print statistics */
+			fprintf(output, "%.2f\t", elapsed_time);
+			ring_print(&render_ring, last_samples_per_sec, output);
+			ring_print(&bsd_ring, last_samples_per_sec, output);
+			ring_print(&bsd6_ring, last_samples_per_sec, output);
+			ring_print(&blt_ring, last_samples_per_sec, output);
+
+			for (i = 0; i < MAX_NUM_TOP_BITS; i++) {
+				if (i < STATS_COUNT && HAS_STATS_REGS(devid)) {
+					fprintf(output, "%lu\t",
+						   stats[i] - last_stats[i]);
+					last_stats[i] = stats[i];
+				}
+					if (!top_bits[i].count)
+						continue;
 			}
 			fprintf(output, "\n");
+			fflush(output);
 		}
 
 		for (i = 0; i < num_instdone_bits; i++) {
-- 
1.7.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/6] intel_gpu_top: support profiling user-specified commands
  2011-09-05 20:19 [PATCH 0/6] additional intel_gpu_top profiling features Eugeni Dodonov
                   ` (4 preceding siblings ...)
  2011-09-05 20:19 ` [PATCH 5/6] intel_gpu_top: support non-interactive mode Eugeni Dodonov
@ 2011-09-05 20:19 ` Eugeni Dodonov
  5 siblings, 0 replies; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 20:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Eugeni Dodonov

From: Eugeni Dodonov <eugeni.dodonov@intel.com>

This patch adds support for running intel_gpu_top to profile a specific
command. The command is run in a separate process, and the main
intel_gpu_top process exits when the child process exits.
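
The general shape of this is a fork followed by a non-blocking waitpid()
once per refresh. The sketch below shows that pattern in isolation. Note
that it runs the command through execl() of /bin/sh, whereas the diff
calls system() in the forked child, and that it also treats death by
signal as "finished"; start_command() and command_finished() are names
invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start "sh -c cmd" in a child process. Returns the child's pid, or -1. */
pid_t start_command(const char *cmd)
{
	pid_t pid = fork();

	if (pid == 0) {
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);	/* only reached if exec failed */
	}
	return pid;
}

/*
 * Non-blocking check, meant to be called once per sampling round.
 * Returns 1 once the child has terminated (normally or by a signal),
 * 0 while it is still running, and -1 on error.
 */
int command_finished(pid_t pid, int *status)
{
	pid_t res = waitpid(pid, status, WNOHANG);

	if (res == -1)
		return -1;
	if (res == 0)
		return 0;
	return WIFEXITED(*status) || WIFSIGNALED(*status);
}
```

Because WNOHANG makes waitpid() return immediately, the monitoring loop
keeps its sampling cadence while the profiled command is still running.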

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
---
 man/intel_gpu_top.1   |   10 ++++++++
 tools/intel_gpu_top.c |   56 ++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 65 insertions(+), 1 deletions(-)

diff --git a/man/intel_gpu_top.1 b/man/intel_gpu_top.1
index bca83f0..db2f362 100644
--- a/man/intel_gpu_top.1
+++ b/man/intel_gpu_top.1
@@ -19,8 +19,18 @@ number of samples to acquire per second
 .B -o [output file]
 run non-interactively and collect usage statistics to [file]
 .TP
+.B -e ["command to profile"]
+execute a command, and leave when it is finished. Note that the entire command
+with all parameters should be included as one parameter.
+.TP
 .B -h
 show usage notes
+.SH EXAMPLES
+.TP
+intel_gpu_top -o "cairo-trace-gvim.log" -s 100 -e "cairo-perf-trace /tmp/gvim"
+will run cairo-perf-trace with /tmp/gvim trace, non-interactively, saving the
+statistics into cairo-trace-gvim.log file, and collecting 100 samples per
+second.
 .PP
 Note that idle units are not
 displayed, so an entirely idle GPU will only display the ring status and
diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index 88f7157..3619d1b 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -33,6 +33,8 @@
 #include <err.h>
 #include <sys/ioctl.h>
 #include <sys/time.h>
+#include <sys/wait.h>
+#include <string.h>
 #include "intel_gpu_tools.h"
 #include "instdone.h"
 
@@ -447,12 +449,17 @@ int main(int argc, char **argv)
 	FILE *output = stdout;
 	double elapsed_time=0;
 	int print_headers=1;
+	pid_t child_pid=-1;
+	int child_stat;
+	char *cmd=NULL;
 
 	/* Parse options? */
-	while ((ch = getopt(argc, argv, "s:o:h")) != -1)
+	while ((ch = getopt(argc, argv, "s:o:e:h")) != -1)
 	{
 		switch (ch)
 		{
+			case 'e': cmd = strdup(optarg);
+					  break;
 			case 's': samples_per_sec = atoi(optarg);
 					  if (samples_per_sec < 100) {
 						  fprintf(stderr, "Error: samples per second must be >= 100\n");
@@ -481,6 +488,37 @@ int main(int argc, char **argv)
 	argc -= optind;
 	argv += optind;
 
+	/* Do we have a command to run? */
+	if (cmd != NULL)
+	{
+		if (output != stdout) {
+			fprintf(output, "# Profiling: %s\n", cmd);
+			fflush(output);
+		}
+		child_pid = fork();
+		if (child_pid < 0)
+		{
+			perror("fork");
+			exit(1);
+		}
+		else if (child_pid == 0) {
+			int res;
+			res = system(cmd);
+            free(cmd);
+			if (res < 0)
+				perror("running command");
+			if (output != stdout) {
+				fflush(output);
+				fprintf(output, "# %s exited with status %d\n", cmd, res);
+				fflush(output);
+			}
+			exit(0);
+		}
+        else {
+            free(cmd);
+        }
+	}
+
 	pci_dev = intel_get_pci_device();
 	devid = pci_dev->device_id;
 	intel_get_mmio(pci_dev);
@@ -673,7 +711,23 @@ int main(int argc, char **argv)
 			if (i < STATS_COUNT)
 				last_stats[i] = stats[i];
 		}
+
+		/* Check if child has gone */
+		if (child_pid > 0)
+		{
+			int res;
+			if ((res = waitpid(child_pid, &child_stat, WNOHANG)) == -1) {
+				perror("waitpid");
+				exit(1);
+			}
+			if (res == 0)
+				continue;
+			if (WIFEXITED(child_stat))
+				break;
+		}
 	}
 
+	fclose(output);
+
 	return 0;
 }
-- 
1.7.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/6] intel_gpu_top: suport command line parameters and variable samples per second
  2011-09-05 20:19 ` [PATCH 2/6] intel_gpu_top: suport command line parameters and variable samples per second Eugeni Dodonov
@ 2011-09-05 21:44   ` Chris Wilson
  2011-09-05 22:16     ` Eugeni Dodonov
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2011-09-05 21:44 UTC (permalink / raw)
  To: Eugeni Dodonov, intel-gfx; +Cc: Eugeni Dodonov

On Mon,  5 Sep 2011 17:19:29 -0300, Eugeni Dodonov <eugeni@dodonov.net> wrote:
> From: Eugeni Dodonov <eugeni.dodonov@intel.com>
> 
> This patch adds support for getopt, and adds two default parameters to it:
> -h to show usage notes; and -s to allow user to define number of samples
> to acquire per second.

Just a minor style issue, otherwise it looks good. All I need is
some way to correlate GPU activity with batches (and especially the
contents of those batches) and with even higher level code and then I'd
be happy. Oh, and integrated with a timeline of CPU activity, of course.
:-)
 
> +	/* Parse options? */
> +	while ((ch = getopt(argc, argv, "s:h")) != -1)
> +	{
> +		switch (ch)
> +		{
> +			case 's': samples_per_sec = atoi(optarg);

In the modules we own, we have adopted the kernel CODING_STYLE as our
standard. 8 space indents, 80 cols line (except where readability is
improved by going over), braces on the same line as the control flow,
/*
 * This style of long comments.
 */
and case statements should begin at the same indentation as the switch
and so should the parameters of a multiline function...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/6] intel_gpu_top: suport command line parameters and variable samples per second
  2011-09-05 21:44   ` Chris Wilson
@ 2011-09-05 22:16     ` Eugeni Dodonov
  0 siblings, 0 replies; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-05 22:16 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Eugeni Dodonov


On Mon, Sep 5, 2011 at 18:44, Chris Wilson <chris@chris-wilson.co.uk> wrote:

> In the modules we own, we have adopted the kernel CODING_STYLE as our
> standard. 8 space indents, 80 cols line (except where readability is
> improved by going over), braces on the same line as the control flow,
>

Fixed, thanks!

-- 
Eugeni Dodonov <http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 665 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/6] intel_gpu_tool: initial support for non-screen output
  2011-09-05 20:19 ` [PATCH 3/6] intel_gpu_tool: initial support for non-screen output Eugeni Dodonov
@ 2011-09-05 22:56   ` Łukasz Kuryło
  2011-09-05 23:48     ` Łukasz Kuryło
  2011-09-06  0:05     ` Eugeni Dodonov
  0 siblings, 2 replies; 12+ messages in thread
From: Łukasz Kuryło @ 2011-09-05 22:56 UTC (permalink / raw)
  To: intel-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 66 bytes --]

Is that really necessary? Isn't "intel_gpu_top > logfile" enough?

[-- Attachment #1.1.2: Type: text/html, Size: 542 bytes --]

[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/6] intel_gpu_tool: initial support for non-screen output
  2011-09-05 22:56   ` Łukasz Kuryło
@ 2011-09-05 23:48     ` Łukasz Kuryło
  2011-09-06  0:05     ` Eugeni Dodonov
  1 sibling, 0 replies; 12+ messages in thread
From: Łukasz Kuryło @ 2011-09-05 23:48 UTC (permalink / raw)
  To: intel-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 36 bytes --]

My bad, it is necessary to add file.

[-- Attachment #1.1.2: Type: text/html, Size: 499 bytes --]

[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/6] intel_gpu_tool: initial support for non-screen output
  2011-09-05 22:56   ` Łukasz Kuryło
  2011-09-05 23:48     ` Łukasz Kuryło
@ 2011-09-06  0:05     ` Eugeni Dodonov
  1 sibling, 0 replies; 12+ messages in thread
From: Eugeni Dodonov @ 2011-09-06  0:05 UTC (permalink / raw)
  To: Łukasz Kuryło; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 467 bytes --]

On Mon, Sep 5, 2011 at 19:56, Łukasz Kuryło <Lukasz.Kurylo@gmail.com> wrote:

> Is that really necessary? Isn't "intel_gpu_top > logfile" enough?
>

It would work too, but the result would not be easy to parse. The idea
here is to output the data differently depending on the mode: in stdout
mode, with nice ASCII art and CLI features; in file-out mode, in a
tab-separated format that is friendly to gnuplot and to parsers.
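
To make the file-out idea concrete, a minimal sketch of a tab-separated
writer could look like the following. The helper names write_header()
and write_sample() and the column layout (elapsed time, then one busy
percentage per unit) are assumptions for illustration; the actual patch
may lay out its columns differently.

```c
#include <stdio.h>

/*
 * Write a header line naming each column. It starts with '#', which
 * gnuplot treats as a comment, so the file can be plotted as-is.
 * The helper and its layout are hypothetical, not taken from the patch.
 */
static int write_header(FILE *out, const char *const *names, int count)
{
	int i;

	if (fputs("# time", out) == EOF)
		return -1;

	for (i = 0; i < count; i++) {
		if (fprintf(out, "\t%s", names[i]) < 0)
			return -1;
	}

	return fputc('\n', out) == EOF ? -1 : 0;
}

/*
 * Write one sample as a tab-separated line: elapsed seconds first,
 * then one busy percentage per monitored unit.
 */
static int write_sample(FILE *out, double elapsed_sec, const int *busy,
			int count)
{
	int i;

	if (fprintf(out, "%.2f", elapsed_sec) < 0)
		return -1;

	for (i = 0; i < count; i++) {
		if (fprintf(out, "\t%d", busy[i]) < 0)
			return -1;
	}

	return fputc('\n', out) == EOF ? -1 : 0;
}
```

With one line per sample and no screen-control sequences, the file can
be fed to gnuplot or split on tabs by a script, which a redirected
interactive display would not allow as easily.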

-- 
Eugeni Dodonov <http://eugeni.dodonov.net/>

[-- Attachment #1.2: Type: text/html, Size: 956 bytes --]


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-09-06  0:05 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-05 20:19 [PATCH 0/6] additional intel_gpu_top profiling features Eugeni Dodonov
2011-09-05 20:19 ` [PATCH 1/6] intel_gpu_top: account for time spent in syscalls Eugeni Dodonov
2011-09-05 20:19 ` [PATCH 2/6] intel_gpu_top: suport command line parameters and variable samples per second Eugeni Dodonov
2011-09-05 21:44   ` Chris Wilson
2011-09-05 22:16     ` Eugeni Dodonov
2011-09-05 20:19 ` [PATCH 3/6] intel_gpu_tool: initial support for non-screen output Eugeni Dodonov
2011-09-05 22:56   ` Łukasz Kuryło
2011-09-05 23:48     ` Łukasz Kuryło
2011-09-06  0:05     ` Eugeni Dodonov
2011-09-05 20:19 ` [PATCH 4/6] intel_gpu_top: initialize monitoring statistics at startup Eugeni Dodonov
2011-09-05 20:19 ` [PATCH 5/6] intel_gpu_top: support non-interactive mode Eugeni Dodonov
2011-09-05 20:19 ` [PATCH 6/6] intel_gpu_top: support profiling user-specified commands Eugeni Dodonov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).