All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tursulin@ursulin.net>
To: igt-dev@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Subject: [PATCH i-g-t 2/4] trace.pl: Fix request split mode
Date: Thu, 19 Jul 2018 10:35:59 +0100	[thread overview]
Message-ID: <20180719093601.11788-3-tvrtko.ursulin@linux.intel.com> (raw)
In-Reply-To: <20180719093601.11788-1-tvrtko.ursulin@linux.intel.com>

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Request split mode had several bugs, both in the original version and also
after the recent refactorings.

One big one was that it wasn't considering different submit ports as a
reason to split execution, and also that it was too time based instead of
looking at relevant timelines.

In this refactoring we address the former by using the engine timelines
introduced in the previous patch. Secondary port submissions are moved
to follow the preceding submission as a first step in the correction
process.

In the second step, we add context timelines and use then in a similar
fashion to separate start and end time of coalesced requests. For each
coalesced request we know its boundaries by looking at the engine
timeline (via global seqnos), and we know the previous request it should
only start after, by looking at the context timeline.

v2:
 * Remove some dead code.
 * Fix !port0 shifting logic.

v3:
 * Refactor for less list walking as with incomplete handling.

v4:
 * Database of context timelines should not contain duplicates!
   (Converted from array into a hash.)

v5:
 * Avoid over-accounting runnable time for a coalesced group by recording
   the time first request entered the GPU and ending the execute delay at
   that point for the whole group.

v6:
 * Update for engine class:instance.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: John Harrison <John.C.Harrison@intel.com>
---
 scripts/trace.pl | 138 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 108 insertions(+), 30 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 41bedeefb776..59f6d32dc3c8 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,7 +27,7 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait);
+my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
 my @freqs;
 
 my $max_items = 3000;
@@ -418,6 +418,7 @@ while (<>) {
 		$req{'ring'} = $ring;
 		$req{'seqno'} = $seqno;
 		$req{'ctx'} = $ctx;
+		$ctxtimelines{$ctx . '/' . $ring} = 1;
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
@@ -573,41 +574,113 @@ sub sortStart {
 	return $val;
 }
 
-my @sorted_keys = sort sortStart keys %db;
-my $re_sort = 0;
+my $re_sort = 1;
+my @sorted_keys;
 
-die "Database changed size?!" unless scalar(@sorted_keys) == $key_count;
+sub maybe_sort_keys
+{
+	if ($re_sort) {
+		@sorted_keys = sort sortStart keys %db;
+		$re_sort = 0;
+		die "Database changed size?!" unless scalar(@sorted_keys) ==
+						     $key_count;
+	}
+}
 
-foreach my $key (@sorted_keys) {
-	my $ring = $db{$key}->{'ring'};
-	my $end = $db{$key}->{'end'};
+maybe_sort_keys();
+
+my %ctx_timelines;
+
+sub sortContext {
+	my $as = $db{$a}->{'seqno'};
+	my $bs = $db{$b}->{'seqno'};
+	my $val;
+
+	$val = $as <=> $bs;
+
+	die if $val == 0;
+
+	return $val;
+}
+
+sub get_ctx_timeline {
+	my ($ctx, $ring, $key) = @_;
+	my @timeline;
+
+	return $ctx_timelines{$key} if exists $ctx_timelines{$key};
+
+	@timeline = grep { $db{$_}->{'ring'} eq $ring and
+			   $db{$_}->{'ctx'} == $ctx } @sorted_keys;
+	# FIXME seqno restart
+	@timeline = sort sortContext @timeline;
+
+	$ctx_timelines{$key} = \@timeline;
+
+	return \@timeline;
+}
+
+# Split out merged batches if requested.
+if ($correct_durations) {
+	# Shift !port0 requests start time to after the previous context on the
+	# same timeline has finished.
+	foreach my $gid (sort keys %rings) {
+		my $ring = $ringmap{$rings{$gid}};
+		my $timeline = get_engine_timeline($ring);
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $prev = $complete;
+			my $pkey;
+
+			$complete = $key unless exists $db{$key}->{'no-end'};
+			$pkey = $complete;
+
+			next if $db{$key}->{'port'} == 0;
+
+			$pkey = $prev if $complete eq $key;
+
+			die unless defined $pkey;
+
+			$db{$key}->{'start'} = $db{$pkey}->{'end'};
+			$db{$key}->{'start'} = $db{$pkey}->{'notify'} if $db{$key}->{'start'} > $db{$key}->{'end'};
+
+			die if $db{$key}->{'start'} > $db{$key}->{'end'};
 
-	# correct duration of merged batches
-	if ($correct_durations and exists $db{$key}->{'no-end'}) {
-		my $ctx = $db{$key}->{'ctx'};
-		my $seqno = $db{$key}->{'seqno'};
-		my $start = $db{$key}->{'start'};
-		my $next_key;
-		my $i = 1;
-
-		do {
-			$next_key = db_key($ring, $ctx, $seqno + $i);
-			$i++;
-		} until (exists $db{$next_key} or $i > $key_count);  # ugly stop hack
-
-		# 20us tolerance
-		if (exists $db{$next_key} and $db{$next_key}->{'start'} < $start + 20) {
-			my $notify = $db{$key}->{'notify'};
 			$re_sort = 1;
-			$db{$next_key}->{'start'} = $notify;
-			$db{$next_key}->{'start'} = $db{$next_key}->{'end'} if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
-			die if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
 		}
-		die if $start > $end;
+	}
+
+	maybe_sort_keys();
+
+	# Batch with no-end (no request_out) means it was submitted as part of
+	# coalesced context. This means it's start time should be set to the end
+	# time of a following request on this context timeline.
+	foreach my $tkey (sort keys %ctxtimelines) {
+		my ($ctx, $ring) = split '/', $tkey;
+		my $timeline = get_ctx_timeline($ctx, $ring, $tkey);
+		my $last_complete = -1;
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $next_key;
+
+			next unless exists $db{$key}->{'no-end'};
+			last if $pos == $#{$timeline};
+
+			# Shift following request to start after the current one
+			$next_key = ${$timeline}[$pos + 1];
+			if (exists $db{$key}->{'notify'}) {
+				$db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
+				$db{$next_key}->{'start'} = $db{$key}->{'notify'};
+				$re_sort = 1;
+			}
+		}
 	}
 }
 
-@sorted_keys = sort sortStart keys %db if $re_sort;
+maybe_sort_keys();
 
 # GPU time accounting
 my (%running, %runnable, %queued, %batch_avg, %batch_total_avg, %batch_count);
@@ -621,6 +694,7 @@ foreach my $key (@sorted_keys) {
 	my $ring = $db{$key}->{'ring'};
 	my $end = $db{$key}->{'end'};
 	my $start = $db{$key}->{'start'};
+	my $engine_start = $db{$key}->{'engine_start'};
 	my $notify = $db{$key}->{'notify'};
 
 	$first_ts = $db{$key}->{'queue'} if not defined $first_ts or $db{$key}->{'queue'} < $first_ts;
@@ -633,7 +707,9 @@ foreach my $key (@sorted_keys) {
 	} else {
 		$db{$key}->{'context-complete-delay'} = 0;
 	}
-	$db{$key}->{'execute-delay'} = $start - $db{$key}->{'submit'};
+
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
+	$db{$key}->{'execute-delay'} = $engine_start - $db{$key}->{'submit'};
 	$db{$key}->{'submit-delay'} = $db{$key}->{'submit'} - $db{$key}->{'queue'};
 	unless (exists $db{$key}->{'no-notify'}) {
 		$db{$key}->{'duration'} = $notify - $start;
@@ -1059,6 +1135,7 @@ my $i = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
+	my $engine_start = $db{$key}->{'engine-start'};
 	my $submit = $queue + $db{$key}->{'submit-delay'};
 	my ($content, $style);
 	my $group = $engine_start_id + $rings{$db{$key}->{'ring'}};
@@ -1078,11 +1155,12 @@ foreach my $key (sort sortQueue keys %db) {
 	}
 
 	# execute to start
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
 	unless (exists $skip_box{'ready'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 'ready');
 		$content = "<small>$name<br>$db{$key}->{'execute-delay'}us</small>";
-		$startend = 'start: ' . $submit . ', end: ' . $start;
+		$startend = 'start: ' . $submit . ', end: ' . $engine_start;
 		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
 		$i++;
 	}
-- 
2.17.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

WARNING: multiple messages have this Message-ID (diff)
From: Tvrtko Ursulin <tursulin@ursulin.net>
To: igt-dev@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org,
	Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Subject: [igt-dev] [PATCH i-g-t 2/4] trace.pl: Fix request split mode
Date: Thu, 19 Jul 2018 10:35:59 +0100	[thread overview]
Message-ID: <20180719093601.11788-3-tvrtko.ursulin@linux.intel.com> (raw)
In-Reply-To: <20180719093601.11788-1-tvrtko.ursulin@linux.intel.com>

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Request split mode had several bugs, both in the original version and also
after the recent refactorings.

One big one was that it wasn't considering different submit ports as a
reason to split execution, and also that it was too time based instead of
looking at relevant timelines.

In this refactoring we address the former by using the engine timelines
introduced in the previous patch. Secondary port submissions are moved
to follow the preceding submission as a first step in the correction
process.

In the second step, we add context timelines and use then in a similar
fashion to separate start and end time of coalesced requests. For each
coalesced request we know its boundaries by looking at the engine
timeline (via global seqnos), and we know the previous request it should
only start after, by looking at the context timeline.

v2:
 * Remove some dead code.
 * Fix !port0 shifting logic.

v3:
 * Refactor for less list walking as with incomplete handling.

v4:
 * Database of context timelines should not contain duplicates!
   (Converted from array into a hash.)

v5:
 * Avoid over-accounting runnable time for a coalesced group by recording
   the time first request entered the GPU and ending the execute delay at
   that point for the whole group.

v6:
 * Update for engine class:instance.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: John Harrison <John.C.Harrison@intel.com>
---
 scripts/trace.pl | 138 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 108 insertions(+), 30 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 41bedeefb776..59f6d32dc3c8 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,7 +27,7 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait);
+my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
 my @freqs;
 
 my $max_items = 3000;
@@ -418,6 +418,7 @@ while (<>) {
 		$req{'ring'} = $ring;
 		$req{'seqno'} = $seqno;
 		$req{'ctx'} = $ctx;
+		$ctxtimelines{$ctx . '/' . $ring} = 1;
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
@@ -573,41 +574,113 @@ sub sortStart {
 	return $val;
 }
 
-my @sorted_keys = sort sortStart keys %db;
-my $re_sort = 0;
+my $re_sort = 1;
+my @sorted_keys;
 
-die "Database changed size?!" unless scalar(@sorted_keys) == $key_count;
+sub maybe_sort_keys
+{
+	if ($re_sort) {
+		@sorted_keys = sort sortStart keys %db;
+		$re_sort = 0;
+		die "Database changed size?!" unless scalar(@sorted_keys) ==
+						     $key_count;
+	}
+}
 
-foreach my $key (@sorted_keys) {
-	my $ring = $db{$key}->{'ring'};
-	my $end = $db{$key}->{'end'};
+maybe_sort_keys();
+
+my %ctx_timelines;
+
+sub sortContext {
+	my $as = $db{$a}->{'seqno'};
+	my $bs = $db{$b}->{'seqno'};
+	my $val;
+
+	$val = $as <=> $bs;
+
+	die if $val == 0;
+
+	return $val;
+}
+
+sub get_ctx_timeline {
+	my ($ctx, $ring, $key) = @_;
+	my @timeline;
+
+	return $ctx_timelines{$key} if exists $ctx_timelines{$key};
+
+	@timeline = grep { $db{$_}->{'ring'} eq $ring and
+			   $db{$_}->{'ctx'} == $ctx } @sorted_keys;
+	# FIXME seqno restart
+	@timeline = sort sortContext @timeline;
+
+	$ctx_timelines{$key} = \@timeline;
+
+	return \@timeline;
+}
+
+# Split out merged batches if requested.
+if ($correct_durations) {
+	# Shift !port0 requests start time to after the previous context on the
+	# same timeline has finished.
+	foreach my $gid (sort keys %rings) {
+		my $ring = $ringmap{$rings{$gid}};
+		my $timeline = get_engine_timeline($ring);
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $prev = $complete;
+			my $pkey;
+
+			$complete = $key unless exists $db{$key}->{'no-end'};
+			$pkey = $complete;
+
+			next if $db{$key}->{'port'} == 0;
+
+			$pkey = $prev if $complete eq $key;
+
+			die unless defined $pkey;
+
+			$db{$key}->{'start'} = $db{$pkey}->{'end'};
+			$db{$key}->{'start'} = $db{$pkey}->{'notify'} if $db{$key}->{'start'} > $db{$key}->{'end'};
+
+			die if $db{$key}->{'start'} > $db{$key}->{'end'};
 
-	# correct duration of merged batches
-	if ($correct_durations and exists $db{$key}->{'no-end'}) {
-		my $ctx = $db{$key}->{'ctx'};
-		my $seqno = $db{$key}->{'seqno'};
-		my $start = $db{$key}->{'start'};
-		my $next_key;
-		my $i = 1;
-
-		do {
-			$next_key = db_key($ring, $ctx, $seqno + $i);
-			$i++;
-		} until (exists $db{$next_key} or $i > $key_count);  # ugly stop hack
-
-		# 20us tolerance
-		if (exists $db{$next_key} and $db{$next_key}->{'start'} < $start + 20) {
-			my $notify = $db{$key}->{'notify'};
 			$re_sort = 1;
-			$db{$next_key}->{'start'} = $notify;
-			$db{$next_key}->{'start'} = $db{$next_key}->{'end'} if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
-			die if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};
 		}
-		die if $start > $end;
+	}
+
+	maybe_sort_keys();
+
+	# Batch with no-end (no request_out) means it was submitted as part of
+	# coalesced context. This means it's start time should be set to the end
+	# time of a following request on this context timeline.
+	foreach my $tkey (sort keys %ctxtimelines) {
+		my ($ctx, $ring) = split '/', $tkey;
+		my $timeline = get_ctx_timeline($ctx, $ring, $tkey);
+		my $last_complete = -1;
+		my $complete;
+
+		foreach my $pos (0..$#{$timeline}) {
+			my $key = @{$timeline}[$pos];
+			my $next_key;
+
+			next unless exists $db{$key}->{'no-end'};
+			last if $pos == $#{$timeline};
+
+			# Shift following request to start after the current one
+			$next_key = ${$timeline}[$pos + 1];
+			if (exists $db{$key}->{'notify'}) {
+				$db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
+				$db{$next_key}->{'start'} = $db{$key}->{'notify'};
+				$re_sort = 1;
+			}
+		}
 	}
 }
 
-@sorted_keys = sort sortStart keys %db if $re_sort;
+maybe_sort_keys();
 
 # GPU time accounting
 my (%running, %runnable, %queued, %batch_avg, %batch_total_avg, %batch_count);
@@ -621,6 +694,7 @@ foreach my $key (@sorted_keys) {
 	my $ring = $db{$key}->{'ring'};
 	my $end = $db{$key}->{'end'};
 	my $start = $db{$key}->{'start'};
+	my $engine_start = $db{$key}->{'engine_start'};
 	my $notify = $db{$key}->{'notify'};
 
 	$first_ts = $db{$key}->{'queue'} if not defined $first_ts or $db{$key}->{'queue'} < $first_ts;
@@ -633,7 +707,9 @@ foreach my $key (@sorted_keys) {
 	} else {
 		$db{$key}->{'context-complete-delay'} = 0;
 	}
-	$db{$key}->{'execute-delay'} = $start - $db{$key}->{'submit'};
+
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
+	$db{$key}->{'execute-delay'} = $engine_start - $db{$key}->{'submit'};
 	$db{$key}->{'submit-delay'} = $db{$key}->{'submit'} - $db{$key}->{'queue'};
 	unless (exists $db{$key}->{'no-notify'}) {
 		$db{$key}->{'duration'} = $notify - $start;
@@ -1059,6 +1135,7 @@ my $i = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
+	my $engine_start = $db{$key}->{'engine-start'};
 	my $submit = $queue + $db{$key}->{'submit-delay'};
 	my ($content, $style);
 	my $group = $engine_start_id + $rings{$db{$key}->{'ring'}};
@@ -1078,11 +1155,12 @@ foreach my $key (sort sortQueue keys %db) {
 	}
 
 	# execute to start
+	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
 	unless (exists $skip_box{'ready'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 'ready');
 		$content = "<small>$name<br>$db{$key}->{'execute-delay'}us</small>";
-		$startend = 'start: ' . $submit . ', end: ' . $start;
+		$startend = 'start: ' . $submit . ', end: ' . $engine_start;
 		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
 		$i++;
 	}
-- 
2.17.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

  parent reply	other threads:[~2018-07-19  9:36 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-19  9:35 [PATCH i-g-t 0/4] trace.pl fixes and improvements Tvrtko Ursulin
2018-07-19  9:35 ` [Intel-gfx] " Tvrtko Ursulin
2018-07-19  9:35 ` [PATCH i-g-t 1/4] trace.pl: Context save only applies to last request of a bunch Tvrtko Ursulin
2018-07-19  9:35   ` [igt-dev] " Tvrtko Ursulin
2018-07-27 21:37   ` John Harrison
2018-07-27 21:37     ` [igt-dev] [Intel-gfx] " John Harrison
2018-08-02 10:29     ` [igt-dev] " Tvrtko Ursulin
2018-08-02 10:29       ` [igt-dev] [Intel-gfx] " Tvrtko Ursulin
2018-07-19  9:35 ` Tvrtko Ursulin [this message]
2018-07-19  9:35   ` [igt-dev] [PATCH i-g-t 2/4] trace.pl: Fix request split mode Tvrtko Ursulin
2018-07-27 21:43   ` John Harrison
2018-07-27 21:43     ` [igt-dev] " John Harrison
2018-07-19  9:36 ` [PATCH i-g-t 3/4] trace.pl: Bring back timeline stacking Tvrtko Ursulin
2018-07-19  9:36   ` [igt-dev] " Tvrtko Ursulin
2018-07-27 21:47   ` John Harrison
2018-08-02 10:38     ` Tvrtko Ursulin
2018-07-19  9:36 ` [PATCH i-g-t 4/4] trace.pl: Fix frequency timeline Tvrtko Ursulin
2018-07-19  9:36   ` [igt-dev] " Tvrtko Ursulin
2018-07-27 22:17   ` John Harrison
2018-08-02 10:42     ` Tvrtko Ursulin
2018-07-19 13:16 ` [igt-dev] ✓ Fi.CI.BAT: success for trace.pl fixes and improvements (rev13) Patchwork
2018-07-19 16:57 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180719093601.11788-3-tvrtko.ursulin@linux.intel.com \
    --to=tursulin@ursulin.net \
    --cc=igt-dev@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.