* [PATCH i-g-t 00/21] Media scalability tooling
@ 2019-05-08 12:10 ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Mostly work to support Virtual Engine in trace.pl and gem_wsim complementing the
set of IGTs written by Chris.

Also includes a trace.pl update to cope with the engine seqno removal, and
engine discovery tests.

Altogether it allows benchmarking and tracing the simulated media workloads in
combination with Virtual Engine (and frame split) on Gen9, but also needs
follow-up work to add support for the new Icelake vcs2 engine.

Tvrtko Ursulin (21):
  scripts/trace.pl: Fix after intel_engine_notify removal
  headers: bump
  trace.pl: Virtual engine support
  trace.pl: Virtual engine preemption support
  wsim/media-bench: i915 balancing
  gem_wsim: Use IGT uapi headers
  gem_wsim: Factor out common error handling
  gem_wsim: More wsim_err
  gem_wsim: Submit fence support
  gem_wsim: Extract str to engine lookup
  gem_wsim: Engine map support
  gem_wsim: Save some lines by changing to implicit NULL checking
  gem_wsim: Compact int command parsing with a macro
  gem_wsim: Engine map load balance command
  gem_wsim: Engine bond command
  gem_wsim: Some more example workloads
  gem_wsim: Infinite batch support
  gem_wsim: Command line switch for specifying low slice count workloads
  gem_wsim: Per context SSEU control
  gem_wsim: Allow RCS virtual engine with SSEU control
  tests/i915_query: Engine discovery tests

 benchmarks/gem_wsim.c                       | 1207 ++++++++++++++-----
 benchmarks/wsim/README                      |  134 +-
 benchmarks/wsim/frame-split-60fps.wsim      |   18 +
 benchmarks/wsim/high-composited-game.wsim   |   11 +
 benchmarks/wsim/media-1080p-player.wsim     |    5 +
 benchmarks/wsim/medium-composited-game.wsim |    9 +
 include/drm-uapi/amdgpu_drm.h               |   52 +-
 include/drm-uapi/drm.h                      |   36 +
 include/drm-uapi/drm_mode.h                 |    4 +-
 include/drm-uapi/i915_drm.h                 |  209 +++-
 include/drm-uapi/lima_drm.h                 |  169 +++
 include/drm-uapi/msm_drm.h                  |   14 +
 include/drm-uapi/nouveau_drm.h              |   51 +
 include/drm-uapi/panfrost_drm.h             |  142 +++
 include/drm-uapi/v3d_drm.h                  |   28 +
 scripts/media-bench.pl                      |    9 +-
 scripts/trace.pl                            |  318 +++--
 tests/i915/i915_query.c                     |  247 ++++
 18 files changed, 2246 insertions(+), 417 deletions(-)
 create mode 100644 benchmarks/wsim/frame-split-60fps.wsim
 create mode 100644 benchmarks/wsim/high-composited-game.wsim
 create mode 100644 benchmarks/wsim/media-1080p-player.wsim
 create mode 100644 benchmarks/wsim/medium-composited-game.wsim
 create mode 100644 include/drm-uapi/lima_drm.h
 create mode 100644 include/drm-uapi/panfrost_drm.h

-- 
2.19.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH i-g-t 01/21] scripts/trace.pl: Fix after intel_engine_notify removal
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

After the removal of engine global seqnos and the corresponding
intel_engine_notify tracepoints the script needs to be adjusted to cope
with the new state of things.

To keep working it switches over to using the dma_fence:dma_fence_signaled:
tracepoint and keeps one extra internal map to connect the ctx-seqno pairs
with engines.

It also needs to key the completion events on the full engine/ctx/seqno
tokens, and to adjust the timeline sorting logic correspondingly.
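
As a rough illustration of the keying described above, here is a minimal C
sketch (the script itself is Perl). It assumes a small fixed-size map from
context id to engine name. The struct, the function names and the engine
names used with it are hypothetical and not taken from trace.pl. Only the
"engine/ctx/seqno" key layout mirrors the script's db_key().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CTX 64	/* hypothetical capacity, for illustration only */

/* Hypothetical map connecting a context id with the engine it runs on. */
struct ctx_engine_map {
	unsigned int ctx[MAX_CTX];
	char engine[MAX_CTX][16];
	int count;
};

/*
 * Record that 'ctx' runs on 'engine'. Returns 0 on success and -1 if the
 * context is already bound to a different engine or the map is full.
 */
static int map_ctx_engine(struct ctx_engine_map *m, unsigned int ctx,
			  const char *engine)
{
	assert(engine != NULL);

	for (int i = 0; i < m->count; i++) {
		if (m->ctx[i] == ctx)
			return strcmp(m->engine[i], engine) ? -1 : 0;
	}

	if (m->count == MAX_CTX)
		return -1;

	m->ctx[m->count] = ctx;
	snprintf(m->engine[m->count], sizeof(m->engine[0]), "%s", engine);
	m->count++;

	return 0;
}

/* Look up the engine for a context, or NULL if the context is unknown. */
static const char *lookup_engine(const struct ctx_engine_map *m,
				 unsigned int ctx)
{
	for (int i = 0; i < m->count; i++) {
		if (m->ctx[i] == ctx)
			return m->engine[i];
	}

	return NULL;
}

/*
 * Build the full "engine/ctx/seqno" key for a fence signaled event, which
 * only carries the context and seqno. Returns the key length, or -1 if no
 * engine has been recorded for the context yet.
 */
static int fence_key(const struct ctx_engine_map *m, unsigned int ctx,
		     unsigned int seqno, char *buf, size_t len)
{
	const char *engine = lookup_engine(m, ctx);

	if (!engine)
		return -1;

	return snprintf(buf, len, "%s/%u/%u", engine, ctx, seqno);
}
```

The map is filled when a request is first seen on an engine, and consulted
when a fence signaled event arrives, since that event does not name the
engine.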

v2:
 * Do not use late notifications (received after context complete) when
   splitting up coalesced requests. They are now much more likely and can
   not be used.
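
The v2 rule can be sketched in C as below. This is a simplified
illustration and not the script's code: the struct and function names are
hypothetical, and times are in arbitrary trace units. The notification is
only used as the following request's start if it arrived before the
current request's context complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-request record; times are in arbitrary trace units. */
struct request_times {
	double start;
	double end;		/* context complete */
	bool has_notify;
	double notify;		/* fence signaled time, if has_notify */
};

/*
 * Return the start time to use for the request following 'cur' on the same
 * engine. The notification is used only when it arrived before the end of
 * 'cur'; a late notification is ignored and 'next_start' is kept.
 */
static double following_start(const struct request_times *cur,
			      double next_start)
{
	assert(cur->start <= cur->end);

	if (cur->has_notify && cur->notify < cur->end)
		return cur->notify;

	return next_start;
}
```

With a notification at 20 and an end at 30 the following request is shifted
to start at 20. With a notification at 35 it keeps its own start time.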

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 scripts/trace.pl | 82 ++++++++++++++++++++++++------------------------
 1 file changed, 41 insertions(+), 41 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 18f9f3b18396..95dc3a645e8e 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,7 +27,8 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
+my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
+    %ctxtimelines, %ctxengines);
 my @freqs;
 
 my $max_items = 3000;
@@ -66,7 +67,7 @@ Notes:
 			       i915:i915_request_submit, \
 			       i915:i915_request_in, \
 			       i915:i915_request_out, \
-			       i915:intel_engine_notify, \
+			       dma_fence:dma_fence_signaled, \
 			       i915:i915_request_wait_begin, \
 			       i915:i915_request_wait_end \
 			       [command-to-be-profiled]
@@ -161,7 +162,7 @@ sub arg_trace
 		       'i915:i915_request_submit',
 		       'i915:i915_request_in',
 		       'i915:i915_request_out',
-		       'i915:intel_engine_notify',
+		       'dma_fence:dma_fence_signaled',
 		       'i915:i915_request_wait_begin',
 		       'i915:i915_request_wait_end' );
 
@@ -312,13 +313,6 @@ sub db_key
 	return $ring . '/' . $ctx . '/' . $seqno;
 }
 
-sub global_key
-{
-	my ($ring, $seqno) = @_;
-
-	return $ring . '/' . $seqno;
-}
-
 sub sanitize_ctx
 {
 	my ($ctx, $ring) = @_;
@@ -419,6 +413,8 @@ while (<>) {
 		$req{'ring'} = $ring;
 		$req{'seqno'} = $seqno;
 		$req{'ctx'} = $ctx;
+		die if exists $ctxengines{$ctx} and $ctxengines{$ctx} ne $ring;
+		$ctxengines{$ctx} = $ring;
 		$ctxtimelines{$ctx . '/' . $ring} = 1;
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
@@ -429,16 +425,29 @@ while (<>) {
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
 	} elsif ($tp_name eq 'i915:i915_request_out:') {
-		my $gkey = global_key($ring, $tp{'global'});
+		my $gkey;
+
+		die unless exists $ctxengines{$ctx};
+
+		$gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
+
+		if ($tp{'completed?'}) {
+			die unless exists $db{$key};
+			die unless exists $db{$key}->{'start'};
+			die if exists $db{$key}->{'end'};
+
+			$db{$key}->{'end'} = $time;
+			$db{$key}->{'notify'} = $notify{$gkey}
+						if exists $notify{$gkey};
+		} else {
+			delete $db{$key};
+		}
+	} elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
+		my $gkey;
 
-		die unless exists $db{$key};
-		die unless exists $db{$key}->{'start'};
-		die if exists $db{$key}->{'end'};
+		die unless exists $ctxengines{$tp{'context'}};
 
-		$db{$key}->{'end'} = $time;
-		$db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey};
-	} elsif ($tp_name eq 'i915:intel_engine_notify:') {
-		my $gkey = global_key($ring, $seqno);
+		$gkey = db_key($ctxengines{$tp{'context'}}, $tp{'context'}, $tp{'seqno'});
 
 		$notify{$gkey} = $time unless exists $notify{$gkey};
 	} elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
@@ -452,7 +461,7 @@ while (<>) {
 # find the largest seqno to be used for timeline sorting purposes.
 my $max_seqno = 0;
 foreach my $key (keys %db) {
-	my $gkey = global_key($db{$key}->{'ring'}, $db{$key}->{'global'});
+	my $gkey = db_key($db{$key}->{'ring'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 
 	die unless exists $db{$key}->{'start'};
 
@@ -478,14 +487,13 @@ my $key_count = scalar(keys %db);
 
 my %engine_timelines;
 
-sub sortEngine {
-	my $as = $db{$a}->{'global'};
-	my $bs = $db{$b}->{'global'};
+sub sortStart {
+	my $as = $db{$a}->{'start'};
+	my $bs = $db{$b}->{'start'};
 	my $val;
 
 	$val = $as <=> $bs;
-
-	die if $val == 0;
+	$val = $a cmp $b if $val == 0;
 
 	return $val;
 }
@@ -497,9 +505,7 @@ sub get_engine_timeline {
 	return $engine_timelines{$ring} if exists $engine_timelines{$ring};
 
 	@timeline = grep { $db{$_}->{'ring'} eq $ring } keys %db;
-	# FIXME seqno restart
-	@timeline = sort sortEngine @timeline;
-
+	@timeline = sort sortStart @timeline;
 	$engine_timelines{$ring} = \@timeline;
 
 	return \@timeline;
@@ -561,20 +567,10 @@ foreach my $gid (sort keys %rings) {
 			$db{$key}->{'no-notify'} = 1;
 		}
 		$db{$key}->{'end'} = $end;
+		$db{$key}->{'notify'} = $end if $db{$key}->{'notify'} > $end;
 	}
 }
 
-sub sortStart {
-	my $as = $db{$a}->{'start'};
-	my $bs = $db{$b}->{'start'};
-	my $val;
-
-	$val = $as <=> $bs;
-	$val = $a cmp $b if $val == 0;
-
-	return $val;
-}
-
 my $re_sort = 1;
 my @sorted_keys;
 
@@ -670,9 +666,13 @@ if ($correct_durations) {
 			next unless exists $db{$key}->{'no-end'};
 			last if $pos == $#{$timeline};
 
-			# Shift following request to start after the current one
+			# Shift following request to start after the current
+			# one, but only if that wouldn't make it zero duration,
+			# which would indicate notify arrived after context
+			# complete.
 			$next_key = ${$timeline}[$pos + 1];
-			if (exists $db{$key}->{'notify'}) {
+			if (exists $db{$key}->{'notify'} and
+			    $db{$key}->{'notify'} < $db{$key}->{'end'}) {
 				$db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
 				$db{$next_key}->{'start'} = $db{$key}->{'notify'};
 				$re_sort = 1;
@@ -750,9 +750,9 @@ foreach my $gid (sort keys %rings) {
 	# Extract all GPU busy intervals and sort them.
 	foreach my $key (@sorted_keys) {
 		next unless $db{$key}->{'ring'} eq $ring;
+		die if $db{$key}->{'start'} > $db{$key}->{'end'};
 		push @s_, $db{$key}->{'start'};
 		push @e_, $db{$key}->{'end'};
-		die if $db{$key}->{'start'} > $db{$key}->{'end'};
 	}
 
 	die unless $#s_ == $#e_;
-- 
2.19.1


* [PATCH i-g-t 02/21] headers: bump
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Catch up to drm-tip headers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 include/drm-uapi/amdgpu_drm.h   |  52 +++++++-
 include/drm-uapi/drm.h          |  36 ++++++
 include/drm-uapi/drm_mode.h     |   4 +-
 include/drm-uapi/i915_drm.h     | 209 +++++++++++++++++++++++++++++++-
 include/drm-uapi/lima_drm.h     | 169 ++++++++++++++++++++++++++
 include/drm-uapi/msm_drm.h      |  14 +++
 include/drm-uapi/nouveau_drm.h  |  51 ++++++++
 include/drm-uapi/panfrost_drm.h | 142 ++++++++++++++++++++++
 include/drm-uapi/v3d_drm.h      |  28 +++++
 9 files changed, 699 insertions(+), 6 deletions(-)
 create mode 100644 include/drm-uapi/lima_drm.h
 create mode 100644 include/drm-uapi/panfrost_drm.h

diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h
index be84e43c1e19..4788730dbe78 100644
--- a/include/drm-uapi/amdgpu_drm.h
+++ b/include/drm-uapi/amdgpu_drm.h
@@ -210,6 +210,9 @@ union drm_amdgpu_bo_list {
 #define AMDGPU_CTX_QUERY2_FLAGS_VRAMLOST (1<<1)
 /* indicate some job from this context once cause gpu hang */
 #define AMDGPU_CTX_QUERY2_FLAGS_GUILTY   (1<<2)
+/* indicate some errors are detected by RAS */
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_CE   (1<<3)
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_UE   (1<<4)
 
 /* Context priority level */
 #define AMDGPU_CTX_PRIORITY_UNSET       -2048
@@ -272,13 +275,14 @@ union drm_amdgpu_vm {
 
 /* sched ioctl */
 #define AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE	1
+#define AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE	2
 
 struct drm_amdgpu_sched_in {
 	/* AMDGPU_SCHED_OP_* */
 	__u32	op;
 	__u32	fd;
 	__s32	priority;
-	__u32	flags;
+	__u32   ctx_id;
 };
 
 union drm_amdgpu_sched {
@@ -523,6 +527,9 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_SYNCOBJ_IN      0x04
 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT     0x05
 #define AMDGPU_CHUNK_ID_BO_HANDLES      0x06
+#define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES	0x07
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT    0x08
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
 
 struct drm_amdgpu_cs_chunk {
 	__u32		chunk_id;
@@ -565,6 +572,11 @@ union drm_amdgpu_cs {
  * caches (L2/vL1/sL1/I$). */
 #define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
 
+/* Set GDS_COMPUTE_MAX_WAVE_ID = DEFAULT before PACKET3_INDIRECT_BUFFER.
+ * This will reset wave ID counters for the IB.
+ */
+#define AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID (1 << 4)
+
 struct drm_amdgpu_cs_chunk_ib {
 	__u32 _pad;
 	/** AMDGPU_IB_FLAG_* */
@@ -598,6 +610,12 @@ struct drm_amdgpu_cs_chunk_sem {
 	__u32 handle;
 };
 
+struct drm_amdgpu_cs_chunk_syncobj {
+       __u32 handle;
+       __u32 flags;
+       __u64 point;
+};
+
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ	0
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD	1
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD	2
@@ -673,6 +691,7 @@ struct drm_amdgpu_cs_chunk_data {
 	#define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11
 	/* Subquery id: Query DMCU firmware version */
 	#define AMDGPU_INFO_FW_DMCU		0x12
+	#define AMDGPU_INFO_FW_TA		0x13
 /* number of bytes moved for TTM migration */
 #define AMDGPU_INFO_NUM_BYTES_MOVED		0x0f
 /* the used VRAM size */
@@ -726,6 +745,37 @@ struct drm_amdgpu_cs_chunk_data {
 /* Number of VRAM page faults on CPU access. */
 #define AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS	0x1E
 #define AMDGPU_INFO_VRAM_LOST_COUNTER		0x1F
+/* query ras mask of enabled features*/
+#define AMDGPU_INFO_RAS_ENABLED_FEATURES	0x20
+
+/* RAS MASK: UMC (VRAM) */
+#define AMDGPU_INFO_RAS_ENABLED_UMC			(1 << 0)
+/* RAS MASK: SDMA */
+#define AMDGPU_INFO_RAS_ENABLED_SDMA			(1 << 1)
+/* RAS MASK: GFX */
+#define AMDGPU_INFO_RAS_ENABLED_GFX			(1 << 2)
+/* RAS MASK: MMHUB */
+#define AMDGPU_INFO_RAS_ENABLED_MMHUB			(1 << 3)
+/* RAS MASK: ATHUB */
+#define AMDGPU_INFO_RAS_ENABLED_ATHUB			(1 << 4)
+/* RAS MASK: PCIE */
+#define AMDGPU_INFO_RAS_ENABLED_PCIE			(1 << 5)
+/* RAS MASK: HDP */
+#define AMDGPU_INFO_RAS_ENABLED_HDP			(1 << 6)
+/* RAS MASK: XGMI */
+#define AMDGPU_INFO_RAS_ENABLED_XGMI			(1 << 7)
+/* RAS MASK: DF */
+#define AMDGPU_INFO_RAS_ENABLED_DF			(1 << 8)
+/* RAS MASK: SMN */
+#define AMDGPU_INFO_RAS_ENABLED_SMN			(1 << 9)
+/* RAS MASK: SEM */
+#define AMDGPU_INFO_RAS_ENABLED_SEM			(1 << 10)
+/* RAS MASK: MP0 */
+#define AMDGPU_INFO_RAS_ENABLED_MP0			(1 << 11)
+/* RAS MASK: MP1 */
+#define AMDGPU_INFO_RAS_ENABLED_MP1			(1 << 12)
+/* RAS MASK: FUSE */
+#define AMDGPU_INFO_RAS_ENABLED_FUSE			(1 << 13)
 
 #define AMDGPU_INFO_MMR_SE_INDEX_SHIFT	0
 #define AMDGPU_INFO_MMR_SE_INDEX_MASK	0xff
diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index 85c685a2075e..c893f3b4a895 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -729,8 +729,18 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+struct drm_syncobj_transfer {
+	__u32 src_handle;
+	__u32 dst_handle;
+	__u64 src_point;
+	__u64 dst_point;
+	__u32 flags;
+	__u32 pad;
+};
+
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE (1 << 2) /* wait for time point to become available */
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
@@ -741,12 +751,33 @@ struct drm_syncobj_wait {
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_wait {
+	__u64 handles;
+	/* wait on specific timeline point for every handles*/
+	__u64 points;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
+
 struct drm_syncobj_array {
 	__u64 handles;
 	__u32 count_handles;
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_array {
+	__u64 handles;
+	__u64 points;
+	__u32 count_handles;
+	__u32 pad;
+};
+
+
 /* Query current scanout sequence number */
 struct drm_crtc_get_sequence {
 	__u32 crtc_id;		/* requested crtc_id */
@@ -903,6 +934,11 @@ extern "C" {
 #define DRM_IOCTL_MODE_GET_LEASE	DRM_IOWR(0xC8, struct drm_mode_get_lease)
 #define DRM_IOCTL_MODE_REVOKE_LEASE	DRM_IOWR(0xC9, struct drm_mode_revoke_lease)
 
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT	DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait)
+#define DRM_IOCTL_SYNCOBJ_QUERY		DRM_IOWR(0xCB, struct drm_syncobj_timeline_array)
+#define DRM_IOCTL_SYNCOBJ_TRANSFER	DRM_IOWR(0xCC, struct drm_syncobj_transfer)
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL	DRM_IOWR(0xCD, struct drm_syncobj_timeline_array)
+
 /**
  * Device specific ioctls should only be in their respective headers
  * The device specific ioctl range is from 0x40 to 0x9f.
diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h
index a439c2e67896..83cd1636b9be 100644
--- a/include/drm-uapi/drm_mode.h
+++ b/include/drm-uapi/drm_mode.h
@@ -33,7 +33,6 @@
 extern "C" {
 #endif
 
-#define DRM_DISPLAY_INFO_LEN	32
 #define DRM_CONNECTOR_NAME_LEN	32
 #define DRM_DISPLAY_MODE_LEN	32
 #define DRM_PROP_NAME_LEN	32
@@ -622,7 +621,8 @@ struct drm_color_ctm {
 
 struct drm_color_lut {
 	/*
-	 * Data is U0.16 fixed point format.
+	 * Values are mapped linearly to 0.0 - 1.0 range, with 0x0 == 0.0 and
+	 * 0xffff == 1.0.
 	 */
 	__u16 red;
 	__u16 green;
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index e01b3e1fd6d6..761517f15368 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -136,6 +136,8 @@ enum drm_i915_gem_engine_class {
 struct i915_engine_class_instance {
 	__u16 engine_class; /* see enum drm_i915_gem_engine_class */
 	__u16 engine_instance;
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL -2
 };
 
 /**
@@ -355,6 +357,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
 #define DRM_I915_QUERY			0x39
+#define DRM_I915_GEM_VM_CREATE		0x3a
+#define DRM_I915_GEM_VM_DESTROY		0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -415,6 +419,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -598,6 +604,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT	52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through use of explicit fence support.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1120,7 +1132,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/*
+ * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) represent
+ * a sync_file fd to wait upon (in a nonblocking manner) prior to executing
+ * the batch.
+ *
+ * Returns -EINVAL if the sync_file fd cannot be found.
+ */
+#define I915_EXEC_FENCE_SUBMIT		(1 << 20)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1464,8 +1485,9 @@ struct drm_i915_gem_context_create_ext {
 	__u32 ctx_id; /* output: id of new context*/
 	__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE	(1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-	(-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+	(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
 	__u64 extensions;
 };
 
@@ -1507,6 +1529,41 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE	0x8
+
+	/*
+	 * The id of the associated virtual memory address space (ppGTT) of
+	 * this context. Can be retrieved and passed to another context
+	 * (on the same fd) for both to use the same ppGTT and so share
+	 * address layouts, and avoid reloading the page tables on context
+	 * switches between themselves.
+	 *
+	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+	 */
+#define I915_CONTEXT_PARAM_VM		0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, instance).
+ * Use
+ *	engine_class: I915_ENGINE_CLASS_INVALID,
+ *	engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a zero
+ * sized argument, will revert back to default settings.
+ *
+ * See struct i915_context_param_engines.
+ *
+ * Extensions:
+ *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
+ *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
+ */
+#define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
@@ -1540,9 +1597,10 @@ struct drm_i915_gem_context_param_sseu {
 	struct i915_engine_class_instance engine;
 
 	/*
-	 * Unused for now. Must be cleared to zero.
+	 * Unknown flags must be cleared to zero.
 	 */
 	__u32 flags;
+#define I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX (1u << 0)
 
 	/*
 	 * Mask of slices to enable for the context. Valid values are a subset
@@ -1570,12 +1628,115 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * Into the I915_EXEC_DEFAULT slot [0], a virtual engine is created that when
+ * used will proxy the execbuffer request onto one of the set of engines
+ * in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+	struct i915_user_extension base;
+
+	__u16 engine_index;
+	__u16 num_siblings;
+	__u32 flags; /* all undefined flags must be zero */
+
+	__u64 mbz64; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name__, N__) struct { \
+	struct i915_user_extension base; \
+	__u16 engine_index; \
+	__u16 num_siblings; \
+	__u32 flags; \
+	__u64 mbz64; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+/*
+ * i915_context_engines_bond:
+ *
+ * Constructed bonded pairs for execution within a virtual engine.
+ *
+ * All engines are equal, but some are more equal than others. Given
+ * the distribution of resources in the HW, it may be preferable to run
+ * a request on a given subset of engines in parallel to a request on a
+ * specific engine. We enable this selection of engines within a virtual
+ * engine by specifying bonding pairs, for any given master engine we will
+ * only execute on one of the corresponding siblings within the virtual engine.
+ *
+ * To execute a request in parallel on the master engine and a sibling requires
+ * coordination with an I915_EXEC_FENCE_SUBMIT.
+ */
+struct i915_context_engines_bond {
+	struct i915_user_extension base;
+
+	struct i915_engine_class_instance master;
+
+	__u16 virtual_index; /* index of virtual engine in ctx->engines[] */
+	__u16 num_bonds;
+
+	__u64 flags; /* all undefined flags must be zero */
+	__u64 mbz64[4]; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_BOND(name__, N__) struct { \
+	struct i915_user_extension base; \
+	struct i915_engine_class_instance master; \
+	__u16 virtual_index; \
+	__u16 num_bonds; \
+	__u64 flags; \
+	__u64 mbz64[4]; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+struct i915_context_param_engines {
+	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see i915_context_engines_load_balance */
+#define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
+	__u64 extensions; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
 struct drm_i915_gem_context_create_ext_setparam {
 #define I915_CONTEXT_CREATE_EXT_SETPARAM 0
 	struct i915_user_extension base;
 	struct drm_i915_gem_context_param param;
 };
 
+struct drm_i915_gem_context_create_ext_clone {
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
+	struct i915_user_extension base;
+	__u32 clone_id;
+	__u32 flags;
+#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
+#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
+#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
+#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
+#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
+#define I915_CONTEXT_CLONE_VM		(1u << 5)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+	__u64 rsvd;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
@@ -1821,6 +1982,7 @@ struct drm_i915_perf_oa_config {
 struct drm_i915_query_item {
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
+#define DRM_I915_QUERY_ENGINE_INFO	2
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -1919,6 +2081,47 @@ struct drm_i915_query_topology_info {
 	__u8 data[];
 };
 
+/**
+ * struct drm_i915_engine_info
+ *
+ * Describes one engine and its capabilities as known to the driver.
+ */
+struct drm_i915_engine_info {
+	/** Engine class and instance. */
+	struct i915_engine_class_instance engine;
+
+	/** Reserved field. */
+	__u32 rsvd0;
+
+	/** Engine flags. */
+	__u64 flags;
+
+	/** Capabilities of this engine. */
+	__u64 capabilities;
+#define I915_VIDEO_CLASS_CAPABILITY_HEVC		(1 << 0)
+#define I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC	(1 << 1)
+
+	/** Reserved fields. */
+	__u64 rsvd1[4];
+};
+
+/**
+ * struct drm_i915_query_engine_info
+ *
+ * Engine info query enumerates all engines known to the driver by filling in
+ * an array of struct drm_i915_engine_info structures.
+ */
+struct drm_i915_query_engine_info {
+	/** Number of struct drm_i915_engine_info structs following. */
+	__u32 num_engines;
+
+	/** MBZ */
+	__u32 rsvd[3];
+
+	/** Marker for drm_i915_engine_info structures. */
+	struct drm_i915_engine_info engines[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/drm-uapi/lima_drm.h b/include/drm-uapi/lima_drm.h
new file mode 100644
index 000000000000..95a00fb867e6
--- /dev/null
+++ b/include/drm-uapi/lima_drm.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */
+/* Copyright 2017-2018 Qiang Yu <yuq825@gmail.com> */
+
+#ifndef __LIMA_DRM_H__
+#define __LIMA_DRM_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+enum drm_lima_param_gpu_id {
+	DRM_LIMA_PARAM_GPU_ID_UNKNOWN,
+	DRM_LIMA_PARAM_GPU_ID_MALI400,
+	DRM_LIMA_PARAM_GPU_ID_MALI450,
+};
+
+enum drm_lima_param {
+	DRM_LIMA_PARAM_GPU_ID,
+	DRM_LIMA_PARAM_NUM_PP,
+	DRM_LIMA_PARAM_GP_VERSION,
+	DRM_LIMA_PARAM_PP_VERSION,
+};
+
+/**
+ * get various information of the GPU
+ */
+struct drm_lima_get_param {
+	__u32 param; /* in, value in enum drm_lima_param */
+	__u32 pad;   /* pad, must be zero */
+	__u64 value; /* out, parameter value */
+};
+
+/**
+ * create a buffer for use by the GPU
+ */
+struct drm_lima_gem_create {
+	__u32 size;    /* in, buffer size */
+	__u32 flags;   /* in, currently no flags, must be zero */
+	__u32 handle;  /* out, GEM buffer handle */
+	__u32 pad;     /* pad, must be zero */
+};
+
+/**
+ * get information of a buffer
+ */
+struct drm_lima_gem_info {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 va;      /* out, virtual address mapped into GPU MMU */
+	__u64 offset;  /* out, used to mmap this buffer to CPU */
+};
+
+#define LIMA_SUBMIT_BO_READ   0x01
+#define LIMA_SUBMIT_BO_WRITE  0x02
+
+/* buffer information used by one task */
+struct drm_lima_gem_submit_bo {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 flags;   /* in, buffer read/write by GPU */
+};
+
+#define LIMA_GP_FRAME_REG_NUM 6
+
+/* frame used to setup GP for each task */
+struct drm_lima_gp_frame {
+	__u32 frame[LIMA_GP_FRAME_REG_NUM];
+};
+
+#define LIMA_PP_FRAME_REG_NUM 23
+#define LIMA_PP_WB_REG_NUM 12
+
+/* frame used to setup mali400 GPU PP for each task */
+struct drm_lima_m400_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 plbu_array_address[4];
+	__u32 fragment_stack_address[4];
+};
+
+/* frame used to setup mali450 GPU PP for each task */
+struct drm_lima_m450_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 use_dlbu;
+	__u32 _pad;
+	union {
+		__u32 plbu_array_address[8];
+		__u32 dlbu_regs[4];
+	};
+	__u32 fragment_stack_address[8];
+};
+
+#define LIMA_PIPE_GP  0x00
+#define LIMA_PIPE_PP  0x01
+
+#define LIMA_SUBMIT_FLAG_EXPLICIT_FENCE (1 << 0)
+
+/**
+ * submit a task to GPU
+ *
+ * User can always merge multi sync_file and drm_syncobj
+ * into one drm_syncobj as in_sync[0], but we reserve
+ * in_sync[1] for another task's out_sync to avoid the
+ * export/import/merge pass when explicit sync.
+ */
+struct drm_lima_gem_submit {
+	__u32 ctx;         /* in, context handle task is submitted to */
+	__u32 pipe;        /* in, which pipe to use, GP/PP */
+	__u32 nr_bos;      /* in, array length of bos field */
+	__u32 frame_size;  /* in, size of frame field */
+	__u64 bos;         /* in, array of drm_lima_gem_submit_bo */
+	__u64 frame;       /* in, GP/PP frame */
+	__u32 flags;       /* in, submit flags */
+	__u32 out_sync;    /* in, drm_syncobj handle used to wait task finish after submission */
+	__u32 in_sync[2];  /* in, drm_syncobj handle used to wait before start this task */
+};
+
+#define LIMA_GEM_WAIT_READ   0x01
+#define LIMA_GEM_WAIT_WRITE  0x02
+
+/**
+ * wait pending GPU task finish of a buffer
+ */
+struct drm_lima_gem_wait {
+	__u32 handle;      /* in, GEM buffer handle */
+	__u32 op;          /* in, CPU want to read/write this buffer */
+	__s64 timeout_ns;  /* in, wait timeout in absolute time */
+};
+
+/**
+ * create a context
+ */
+struct drm_lima_ctx_create {
+	__u32 id;          /* out, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+/**
+ * free a context
+ */
+struct drm_lima_ctx_free {
+	__u32 id;          /* in, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+#define DRM_LIMA_GET_PARAM   0x00
+#define DRM_LIMA_GEM_CREATE  0x01
+#define DRM_LIMA_GEM_INFO    0x02
+#define DRM_LIMA_GEM_SUBMIT  0x03
+#define DRM_LIMA_GEM_WAIT    0x04
+#define DRM_LIMA_CTX_CREATE  0x05
+#define DRM_LIMA_CTX_FREE    0x06
+
+#define DRM_IOCTL_LIMA_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GET_PARAM, struct drm_lima_get_param)
+#define DRM_IOCTL_LIMA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_CREATE, struct drm_lima_gem_create)
+#define DRM_IOCTL_LIMA_GEM_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_INFO, struct drm_lima_gem_info)
+#define DRM_IOCTL_LIMA_GEM_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_SUBMIT, struct drm_lima_gem_submit)
+#define DRM_IOCTL_LIMA_GEM_WAIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_WAIT, struct drm_lima_gem_wait)
+#define DRM_IOCTL_LIMA_CTX_CREATE DRM_IOR(DRM_COMMAND_BASE + DRM_LIMA_CTX_CREATE, struct drm_lima_ctx_create)
+#define DRM_IOCTL_LIMA_CTX_FREE DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_CTX_FREE, struct drm_lima_ctx_free)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __LIMA_DRM_H__ */
diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h
index 91a16b333c69..0b85ed6a3710 100644
--- a/include/drm-uapi/msm_drm.h
+++ b/include/drm-uapi/msm_drm.h
@@ -74,6 +74,8 @@ struct drm_msm_timespec {
 #define MSM_PARAM_TIMESTAMP  0x05
 #define MSM_PARAM_GMEM_BASE  0x06
 #define MSM_PARAM_NR_RINGS   0x07
+#define MSM_PARAM_PP_PGTABLE 0x08  /* => 1 for per-process pagetables, else 0 */
+#define MSM_PARAM_FAULTS     0x09
 
 struct drm_msm_param {
 	__u32 pipe;           /* in, MSM_PIPE_x */
@@ -286,6 +288,16 @@ struct drm_msm_submitqueue {
 	__u32 id;      /* out, identifier */
 };
 
+#define MSM_SUBMITQUEUE_PARAM_FAULTS   0
+
+struct drm_msm_submitqueue_query {
+	__u64 data;
+	__u32 id;
+	__u32 param;
+	__u32 len;
+	__u32 pad;
+};
+
 #define DRM_MSM_GET_PARAM              0x00
 /* placeholder:
 #define DRM_MSM_SET_PARAM              0x01
@@ -302,6 +314,7 @@ struct drm_msm_submitqueue {
  */
 #define DRM_MSM_SUBMITQUEUE_NEW        0x0A
 #define DRM_MSM_SUBMITQUEUE_CLOSE      0x0B
+#define DRM_MSM_SUBMITQUEUE_QUERY      0x0C
 
 #define DRM_IOCTL_MSM_GET_PARAM        DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GET_PARAM, struct drm_msm_param)
 #define DRM_IOCTL_MSM_GEM_NEW          DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_NEW, struct drm_msm_gem_new)
@@ -313,6 +326,7 @@ struct drm_msm_submitqueue {
 #define DRM_IOCTL_MSM_GEM_MADVISE      DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_MADVISE, struct drm_msm_gem_madvise)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_NEW    DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_NEW, struct drm_msm_submitqueue)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_CLOSE  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_CLOSE, __u32)
+#define DRM_IOCTL_MSM_SUBMITQUEUE_QUERY  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_QUERY, struct drm_msm_submitqueue_query)
 
 #if defined(__cplusplus)
 }
diff --git a/include/drm-uapi/nouveau_drm.h b/include/drm-uapi/nouveau_drm.h
index 259588a4b61b..9459a6e3bc1f 100644
--- a/include/drm-uapi/nouveau_drm.h
+++ b/include/drm-uapi/nouveau_drm.h
@@ -133,12 +133,63 @@ struct drm_nouveau_gem_cpu_fini {
 #define DRM_NOUVEAU_NOTIFIEROBJ_ALLOC  0x05 /* deprecated */
 #define DRM_NOUVEAU_GPUOBJ_FREE        0x06 /* deprecated */
 #define DRM_NOUVEAU_NVIF               0x07
+#define DRM_NOUVEAU_SVM_INIT           0x08
+#define DRM_NOUVEAU_SVM_BIND           0x09
 #define DRM_NOUVEAU_GEM_NEW            0x40
 #define DRM_NOUVEAU_GEM_PUSHBUF        0x41
 #define DRM_NOUVEAU_GEM_CPU_PREP       0x42
 #define DRM_NOUVEAU_GEM_CPU_FINI       0x43
 #define DRM_NOUVEAU_GEM_INFO           0x44
 
+struct drm_nouveau_svm_init {
+	__u64 unmanaged_addr;
+	__u64 unmanaged_size;
+};
+
+struct drm_nouveau_svm_bind {
+	__u64 header;
+	__u64 va_start;
+	__u64 va_end;
+	__u64 npages;
+	__u64 stride;
+	__u64 result;
+	__u64 reserved0;
+	__u64 reserved1;
+};
+
+#define NOUVEAU_SVM_BIND_COMMAND_SHIFT          0
+#define NOUVEAU_SVM_BIND_COMMAND_BITS           8
+#define NOUVEAU_SVM_BIND_COMMAND_MASK           ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_PRIORITY_SHIFT         8
+#define NOUVEAU_SVM_BIND_PRIORITY_BITS          8
+#define NOUVEAU_SVM_BIND_PRIORITY_MASK          ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_TARGET_SHIFT           16
+#define NOUVEAU_SVM_BIND_TARGET_BITS            32
+#define NOUVEAU_SVM_BIND_TARGET_MASK            0xffffffff
+
+/*
+ * Used to validate the ioctl argument; userspace can also use it to make
+ * sure that no bits are set beyond known fields for a given kernel version.
+ */
+#define NOUVEAU_SVM_BIND_VALID_BITS     48
+#define NOUVEAU_SVM_BIND_VALID_MASK     ((1ULL << NOUVEAU_SVM_BIND_VALID_BITS) - 1)
+
+
+/*
+ * NOUVEAU_SVM_BIND_COMMAND__MIGRATE: synchronously migrate to target memory.
+ * result: number of pages successfully migrated to the target memory.
+ */
+#define NOUVEAU_SVM_BIND_COMMAND__MIGRATE               0
+
+/*
+ * NOUVEAU_SVM_BIND_TARGET__GPU_VRAM: target the GPU VRAM memory.
+ */
+#define NOUVEAU_SVM_BIND_TARGET__GPU_VRAM               (1UL << 31)
+
+
+#define DRM_IOCTL_NOUVEAU_SVM_INIT           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_INIT, struct drm_nouveau_svm_init)
+#define DRM_IOCTL_NOUVEAU_SVM_BIND           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_BIND, struct drm_nouveau_svm_bind)
+
 #define DRM_IOCTL_NOUVEAU_GEM_NEW            DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_NEW, struct drm_nouveau_gem_new)
 #define DRM_IOCTL_NOUVEAU_GEM_PUSHBUF        DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_PUSHBUF, struct drm_nouveau_gem_pushbuf)
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_PREP       DRM_IOW (DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_CPU_PREP, struct drm_nouveau_gem_cpu_prep)
diff --git a/include/drm-uapi/panfrost_drm.h b/include/drm-uapi/panfrost_drm.h
new file mode 100644
index 000000000000..a52e0283b90d
--- /dev/null
+++ b/include/drm-uapi/panfrost_drm.h
@@ -0,0 +1,142 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2014-2018 Broadcom
+ * Copyright © 2019 Collabora ltd.
+ */
+#ifndef _PANFROST_DRM_H_
+#define _PANFROST_DRM_H_
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+#define DRM_PANFROST_SUBMIT			0x00
+#define DRM_PANFROST_WAIT_BO			0x01
+#define DRM_PANFROST_CREATE_BO			0x02
+#define DRM_PANFROST_MMAP_BO			0x03
+#define DRM_PANFROST_GET_PARAM			0x04
+#define DRM_PANFROST_GET_BO_OFFSET		0x05
+
+#define DRM_IOCTL_PANFROST_SUBMIT		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
+#define DRM_IOCTL_PANFROST_WAIT_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
+#define DRM_IOCTL_PANFROST_CREATE_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_CREATE_BO, struct drm_panfrost_create_bo)
+#define DRM_IOCTL_PANFROST_MMAP_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_MMAP_BO, struct drm_panfrost_mmap_bo)
+#define DRM_IOCTL_PANFROST_GET_PARAM		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_PARAM, struct drm_panfrost_get_param)
+#define DRM_IOCTL_PANFROST_GET_BO_OFFSET	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_BO_OFFSET, struct drm_panfrost_get_bo_offset)
+
+#define PANFROST_JD_REQ_FS (1 << 0)
+/**
+ * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
+ * engine.
+ *
+ * This asks the kernel to have the GPU execute a render command list.
+ */
+struct drm_panfrost_submit {
+
+	/** Address to GPU mapping of job descriptor */
+	__u64 jc;
+
+	/** An optional array of sync objects to wait on before starting this job. */
+	__u64 in_syncs;
+
+	/** Number of sync objects to wait on before starting this job. */
+	__u32 in_sync_count;
+
+	/** An optional sync object to place the completion fence in. */
+	__u32 out_sync;
+
+	/** Pointer to a u32 array of the BOs that are referenced by the job. */
+	__u64 bo_handles;
+
+	/** Number of BO handles passed in (size is that times 4). */
+	__u32 bo_handle_count;
+
+	/** A combination of PANFROST_JD_REQ_* */
+	__u32 requirements;
+};
+
+/**
+ * struct drm_panfrost_wait_bo - ioctl argument for waiting for
+ * completion of the last DRM_PANFROST_SUBMIT on a BO.
+ *
+ * This is useful for cases where multiple processes might be
+ * rendering to a BO and you want to wait for all rendering to be
+ * completed.
+ */
+struct drm_panfrost_wait_bo {
+	__u32 handle;
+	__u32 pad;
+	__s64 timeout_ns;	/* absolute */
+};
+
+/**
+ * struct drm_panfrost_create_bo - ioctl argument for creating Panfrost BOs.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_panfrost_create_bo {
+	__u32 size;
+	__u32 flags;
+	/** Returned GEM handle for the BO. */
+	__u32 handle;
+	/* Pad, must be zero-filled. */
+	__u32 pad;
+	/**
+	 * Returned offset for the BO in the GPU address space.  This offset
+	 * is private to the DRM fd and is valid for the lifetime of the GEM
+	 * handle.
+	 *
+	 * This offset value will always be nonzero, since various HW
+	 * units treat 0 specially.
+	 */
+	__u64 offset;
+};
+
+/**
+ * struct drm_panfrost_mmap_bo - ioctl argument for mapping Panfrost BOs.
+ *
+ * This doesn't actually perform an mmap.  Instead, it returns the
+ * offset you need to use in an mmap on the DRM device node.  This
+ * means that tools like valgrind end up knowing about the mapped
+ * memory.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_panfrost_mmap_bo {
+	/** Handle for the object being mapped. */
+	__u32 handle;
+	__u32 flags;
+	/** offset into the drm node to use for subsequent mmap call. */
+	__u64 offset;
+};
+
+enum drm_panfrost_param {
+	DRM_PANFROST_PARAM_GPU_PROD_ID,
+};
+
+struct drm_panfrost_get_param {
+	__u32 param;
+	__u32 pad;
+	__u64 value;
+};
+
+/**
+ * Returns the offset for the BO in the GPU address space for this DRM fd.
+ * This is the same value returned by drm_panfrost_create_bo, if that was called
+ * from this DRM fd.
+ */
+struct drm_panfrost_get_bo_offset {
+	__u32 handle;
+	__u32 pad;
+	__u64 offset;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _PANFROST_DRM_H_ */
diff --git a/include/drm-uapi/v3d_drm.h b/include/drm-uapi/v3d_drm.h
index ea70669d2138..58fbe48c91e9 100644
--- a/include/drm-uapi/v3d_drm.h
+++ b/include/drm-uapi/v3d_drm.h
@@ -37,6 +37,7 @@ extern "C" {
 #define DRM_V3D_GET_PARAM                         0x04
 #define DRM_V3D_GET_BO_OFFSET                     0x05
 #define DRM_V3D_SUBMIT_TFU                        0x06
+#define DRM_V3D_SUBMIT_CSD                        0x07
 
 #define DRM_IOCTL_V3D_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
 #define DRM_IOCTL_V3D_WAIT_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
@@ -45,6 +46,7 @@ extern "C" {
 #define DRM_IOCTL_V3D_GET_PARAM           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
 #define DRM_IOCTL_V3D_GET_BO_OFFSET       DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
 #define DRM_IOCTL_V3D_SUBMIT_TFU          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_TFU, struct drm_v3d_submit_tfu)
+#define DRM_IOCTL_V3D_SUBMIT_CSD          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CSD, struct drm_v3d_submit_csd)
 
 /**
  * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
@@ -190,6 +192,7 @@ enum drm_v3d_param {
 	DRM_V3D_PARAM_V3D_CORE0_IDENT1,
 	DRM_V3D_PARAM_V3D_CORE0_IDENT2,
 	DRM_V3D_PARAM_SUPPORTS_TFU,
+	DRM_V3D_PARAM_SUPPORTS_CSD,
 };
 
 struct drm_v3d_get_param {
@@ -230,6 +233,31 @@ struct drm_v3d_submit_tfu {
 	__u32 out_sync;
 };
 
+/* Submits a compute shader for dispatch.  This job will block on any
+ * previous compute shaders submitted on this fd, and any other
+ * synchronization must be performed with in_sync/out_sync.
+ */
+struct drm_v3d_submit_csd {
+	__u32 cfg[7];
+	__u32 coef[4];
+
+	/* Pointer to a u32 array of the BOs that are referenced by the job.
+	 */
+	__u64 bo_handles;
+
+	/* Number of BO handles passed in (size is that times 4). */
+	__u32 bo_handle_count;
+
+	/* sync object to block on before running the CSD job.  Each
+	 * CSD job will execute in the order submitted to its FD.
+	 * Synchronization against rendering/TFU jobs or CSD from
+	 * other fds requires using sync objects.
+	 */
+	__u32 in_sync;
+	/* Sync object to signal when the CSD job is done. */
+	__u32 out_sync;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.19.1
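
For readers following the I915_CONTEXT_PARAM_ENGINES documentation in this
header bump, here is a minimal sketch of how an engine map with one
load-balanced virtual engine in slot 0 might be laid out. The demo_* names
are hypothetical local mirrors of the structures quoted in the diff; real
code would include the bumped i915_drm.h instead. Two things are assumptions
not shown in this patch: the layout of struct i915_user_extension
(next_extension, name, flags, rsvd[4]) and the value 2 for the video engine
class. No ioctl is issued; the sketch only fills in memory.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout of struct i915_user_extension (not part of this diff). */
struct demo_user_extension {
	uint64_t next_extension;
	uint32_t name;
	uint32_t flags;
	uint32_t rsvd[4];
};

/* Mirror of struct i915_engine_class_instance. */
struct demo_engine_ci {
	uint16_t engine_class;
	uint16_t engine_instance;
};

#define DEMO_ENGINE_CLASS_VIDEO   2    /* assumed enum value */
#define DEMO_ENGINE_CLASS_INVALID (-1)
#define DEMO_ENGINE_INVALID_NONE  (-1) /* I915_ENGINE_CLASS_INVALID_NONE */
#define DEMO_EXT_LOAD_BALANCE     0    /* I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE */

/* What I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name, 2) would expand to. */
struct demo_load_balance {
	struct demo_user_extension base;
	uint16_t engine_index;
	uint16_t num_siblings;
	uint32_t flags;
	uint64_t mbz64;
	struct demo_engine_ci engines[2];
} __attribute__((packed));

/* What I915_DEFINE_CONTEXT_PARAM_ENGINES(name, 1) would expand to. */
struct demo_engine_map {
	uint64_t extensions;
	struct demo_engine_ci engines[1];
} __attribute__((packed));

_Static_assert(sizeof(struct demo_load_balance) == 56, "unexpected layout");
_Static_assert(sizeof(struct demo_engine_map) == 12, "unexpected layout");

/* Fill a one-slot map whose slot 0 is a virtual engine over vcs0 and vcs1. */
static void demo_build_vcs_balancer(struct demo_engine_map *map,
				    struct demo_load_balance *lb)
{
	memset(map, 0, sizeof(*map));
	memset(lb, 0, sizeof(*lb));

	/* The extension names the slot it fills and the siblings behind it. */
	lb->base.name = DEMO_EXT_LOAD_BALANCE;
	lb->engine_index = 0;
	lb->num_siblings = 2;
	lb->engines[0] = (struct demo_engine_ci){ DEMO_ENGINE_CLASS_VIDEO, 0 };
	lb->engines[1] = (struct demo_engine_ci){ DEMO_ENGINE_CLASS_VIDEO, 1 };

	/* Slot 0 is left as a gap for the virtual engine to occupy. */
	map->engines[0].engine_class = (uint16_t)DEMO_ENGINE_CLASS_INVALID;
	map->engines[0].engine_instance = (uint16_t)DEMO_ENGINE_INVALID_NONE;

	/* Chain the extension block off the map; 0 would terminate the chain. */
	map->extensions = (uint64_t)(uintptr_t)lb;
}
```

In real code the map would presumably be handed to the kernel as the value
of I915_CONTEXT_PARAM_ENGINES with a size of sizeof(map), after which
I915_EXEC_DEFAULT on that context selects the virtual engine.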


* [igt-dev] [PATCH i-g-t 02/21] headers: bump
@ 2019-05-08 12:10   ` Tvrtko Ursulin
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Catch up to drm-tip headers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
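A note on the *_UNKNOWN flag masks that this bump moves up (for example
__I915_EXEC_UNKNOWN_FLAGS and I915_CONTEXT_CREATE_FLAGS_UNKNOWN): negating
the bit just above the highest known flag yields a mask of every bit from
there upwards. The sketch below illustrates the idiom with hypothetical
demo_* helpers and local copies of the flag values from the diff. It assumes
the execbuffer flags field is 64 bits wide and the context-create flags
field is 32 bits wide; it is not the driver's actual validation code.

```c
#include <stdint.h>

/* Local copies of the flag values quoted in the diff. */
#define DEMO_EXEC_FENCE_SUBMIT   (1 << 20)
#define DEMO_EXEC_UNKNOWN_FLAGS  (-(DEMO_EXEC_FENCE_SUBMIT << 1))

#define DEMO_CTX_SINGLE_TIMELINE (1u << 1)
#define DEMO_CTX_UNKNOWN_FLAGS   (-(DEMO_CTX_SINGLE_TIMELINE << 1))

/* Returns 1 when no bit above I915_EXEC_FENCE_SUBMIT is set. */
static int demo_exec_flags_ok(uint64_t flags)
{
	/*
	 * The mask is a negative int, so converting it to 64 bits
	 * sign-extends it and the upper 32 bits are rejected as well.
	 */
	return (flags & DEMO_EXEC_UNKNOWN_FLAGS) == 0;
}

/* Returns 1 when only USE_EXTENSIONS and/or SINGLE_TIMELINE are set. */
static int demo_ctx_flags_ok(uint32_t flags)
{
	/* -(1u << 2) wraps to 0xfffffffc, i.e. every bit from bit 2 up. */
	return (flags & DEMO_CTX_UNKNOWN_FLAGS) == 0;
}
```

The practical consequence is that adding a new flag only requires updating
the UNKNOWN definition to reference the new highest bit, which is exactly
what the two hunks in i915_drm.h do.
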
 include/drm-uapi/amdgpu_drm.h   |  52 +++++++-
 include/drm-uapi/drm.h          |  36 ++++++
 include/drm-uapi/drm_mode.h     |   4 +-
 include/drm-uapi/i915_drm.h     | 209 +++++++++++++++++++++++++++++++-
 include/drm-uapi/lima_drm.h     | 169 ++++++++++++++++++++++++++
 include/drm-uapi/msm_drm.h      |  14 +++
 include/drm-uapi/nouveau_drm.h  |  51 ++++++++
 include/drm-uapi/panfrost_drm.h | 142 ++++++++++++++++++++++
 include/drm-uapi/v3d_drm.h      |  28 +++++
 9 files changed, 699 insertions(+), 6 deletions(-)
 create mode 100644 include/drm-uapi/lima_drm.h
 create mode 100644 include/drm-uapi/panfrost_drm.h
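
For the new DRM_I915_QUERY_ENGINE_INFO item, the reply is a fixed header
followed by num_engines entries, so userspace has to size its buffer
accordingly. The following is a rough sketch of that arithmetic and of
walking the result; the demo_* types are hypothetical local mirrors of
struct drm_i915_engine_info and struct drm_i915_query_engine_info as they
appear in the diff, and the video class value 2 is an assumption. The query
ioctl itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ENGINE_CLASS_VIDEO 2        /* assumed enum value */
#define DEMO_VIDEO_CAP_HEVC     (1 << 0) /* I915_VIDEO_CLASS_CAPABILITY_HEVC */

/* Mirror of struct i915_engine_class_instance. */
struct demo_engine_ci {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Mirror of struct drm_i915_engine_info. */
struct demo_engine_info {
	struct demo_engine_ci engine;
	uint32_t rsvd0;
	uint64_t flags;
	uint64_t capabilities;
	uint64_t rsvd1[4];
};

/* Mirror of struct drm_i915_query_engine_info. */
struct demo_query_engine_info {
	uint32_t num_engines;
	uint32_t rsvd[3];
	struct demo_engine_info engines[];
};

/* Bytes needed to hold the header plus n engine entries. */
static size_t demo_engine_info_len(uint32_t n)
{
	return sizeof(struct demo_query_engine_info) +
	       (size_t)n * sizeof(struct demo_engine_info);
}

/* Count video engines that advertise the HEVC capability bit. */
static unsigned int demo_count_hevc(const struct demo_query_engine_info *q)
{
	unsigned int count = 0;
	uint32_t i;

	for (i = 0; i < q->num_engines; i++) {
		const struct demo_engine_info *e = &q->engines[i];

		if (e->engine.engine_class == DEMO_ENGINE_CLASS_VIDEO &&
		    (e->capabilities & DEMO_VIDEO_CAP_HEVC))
			count++;
	}

	return count;
}
```

A capability check along these lines is one way a media workload could
decide which vcs instances are eligible siblings for a virtual engine.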

diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h
index be84e43c1e19..4788730dbe78 100644
--- a/include/drm-uapi/amdgpu_drm.h
+++ b/include/drm-uapi/amdgpu_drm.h
@@ -210,6 +210,9 @@ union drm_amdgpu_bo_list {
 #define AMDGPU_CTX_QUERY2_FLAGS_VRAMLOST (1<<1)
 /* indicate some job from this context once cause gpu hang */
 #define AMDGPU_CTX_QUERY2_FLAGS_GUILTY   (1<<2)
+/* indicate some errors are detected by RAS */
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_CE   (1<<3)
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_UE   (1<<4)
 
 /* Context priority level */
 #define AMDGPU_CTX_PRIORITY_UNSET       -2048
@@ -272,13 +275,14 @@ union drm_amdgpu_vm {
 
 /* sched ioctl */
 #define AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE	1
+#define AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE	2
 
 struct drm_amdgpu_sched_in {
 	/* AMDGPU_SCHED_OP_* */
 	__u32	op;
 	__u32	fd;
 	__s32	priority;
-	__u32	flags;
+	__u32   ctx_id;
 };
 
 union drm_amdgpu_sched {
@@ -523,6 +527,9 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_SYNCOBJ_IN      0x04
 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT     0x05
 #define AMDGPU_CHUNK_ID_BO_HANDLES      0x06
+#define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES	0x07
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT    0x08
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
 
 struct drm_amdgpu_cs_chunk {
 	__u32		chunk_id;
@@ -565,6 +572,11 @@ union drm_amdgpu_cs {
  * caches (L2/vL1/sL1/I$). */
 #define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
 
+/* Set GDS_COMPUTE_MAX_WAVE_ID = DEFAULT before PACKET3_INDIRECT_BUFFER.
+ * This will reset wave ID counters for the IB.
+ */
+#define AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID (1 << 4)
+
 struct drm_amdgpu_cs_chunk_ib {
 	__u32 _pad;
 	/** AMDGPU_IB_FLAG_* */
@@ -598,6 +610,12 @@ struct drm_amdgpu_cs_chunk_sem {
 	__u32 handle;
 };
 
+struct drm_amdgpu_cs_chunk_syncobj {
+       __u32 handle;
+       __u32 flags;
+       __u64 point;
+};
+
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ	0
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD	1
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD	2
@@ -673,6 +691,7 @@ struct drm_amdgpu_cs_chunk_data {
 	#define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11
 	/* Subquery id: Query DMCU firmware version */
 	#define AMDGPU_INFO_FW_DMCU		0x12
+	#define AMDGPU_INFO_FW_TA		0x13
 /* number of bytes moved for TTM migration */
 #define AMDGPU_INFO_NUM_BYTES_MOVED		0x0f
 /* the used VRAM size */
@@ -726,6 +745,37 @@ struct drm_amdgpu_cs_chunk_data {
 /* Number of VRAM page faults on CPU access. */
 #define AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS	0x1E
 #define AMDGPU_INFO_VRAM_LOST_COUNTER		0x1F
+/* query ras mask of enabled features*/
+#define AMDGPU_INFO_RAS_ENABLED_FEATURES	0x20
+
+/* RAS MASK: UMC (VRAM) */
+#define AMDGPU_INFO_RAS_ENABLED_UMC			(1 << 0)
+/* RAS MASK: SDMA */
+#define AMDGPU_INFO_RAS_ENABLED_SDMA			(1 << 1)
+/* RAS MASK: GFX */
+#define AMDGPU_INFO_RAS_ENABLED_GFX			(1 << 2)
+/* RAS MASK: MMHUB */
+#define AMDGPU_INFO_RAS_ENABLED_MMHUB			(1 << 3)
+/* RAS MASK: ATHUB */
+#define AMDGPU_INFO_RAS_ENABLED_ATHUB			(1 << 4)
+/* RAS MASK: PCIE */
+#define AMDGPU_INFO_RAS_ENABLED_PCIE			(1 << 5)
+/* RAS MASK: HDP */
+#define AMDGPU_INFO_RAS_ENABLED_HDP			(1 << 6)
+/* RAS MASK: XGMI */
+#define AMDGPU_INFO_RAS_ENABLED_XGMI			(1 << 7)
+/* RAS MASK: DF */
+#define AMDGPU_INFO_RAS_ENABLED_DF			(1 << 8)
+/* RAS MASK: SMN */
+#define AMDGPU_INFO_RAS_ENABLED_SMN			(1 << 9)
+/* RAS MASK: SEM */
+#define AMDGPU_INFO_RAS_ENABLED_SEM			(1 << 10)
+/* RAS MASK: MP0 */
+#define AMDGPU_INFO_RAS_ENABLED_MP0			(1 << 11)
+/* RAS MASK: MP1 */
+#define AMDGPU_INFO_RAS_ENABLED_MP1			(1 << 12)
+/* RAS MASK: FUSE */
+#define AMDGPU_INFO_RAS_ENABLED_FUSE			(1 << 13)
 
 #define AMDGPU_INFO_MMR_SE_INDEX_SHIFT	0
 #define AMDGPU_INFO_MMR_SE_INDEX_MASK	0xff
diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index 85c685a2075e..c893f3b4a895 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -729,8 +729,18 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+struct drm_syncobj_transfer {
+	__u32 src_handle;
+	__u32 dst_handle;
+	__u64 src_point;
+	__u64 dst_point;
+	__u32 flags;
+	__u32 pad;
+};
+
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE (1 << 2) /* wait for time point to become available */
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
@@ -741,12 +751,33 @@ struct drm_syncobj_wait {
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_wait {
+	__u64 handles;
+	/* wait on specific timeline point for every handles*/
+	__u64 points;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
+
 struct drm_syncobj_array {
 	__u64 handles;
 	__u32 count_handles;
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_array {
+	__u64 handles;
+	__u64 points;
+	__u32 count_handles;
+	__u32 pad;
+};
+
+
 /* Query current scanout sequence number */
 struct drm_crtc_get_sequence {
 	__u32 crtc_id;		/* requested crtc_id */
@@ -903,6 +934,11 @@ extern "C" {
 #define DRM_IOCTL_MODE_GET_LEASE	DRM_IOWR(0xC8, struct drm_mode_get_lease)
 #define DRM_IOCTL_MODE_REVOKE_LEASE	DRM_IOWR(0xC9, struct drm_mode_revoke_lease)
 
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT	DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait)
+#define DRM_IOCTL_SYNCOBJ_QUERY		DRM_IOWR(0xCB, struct drm_syncobj_timeline_array)
+#define DRM_IOCTL_SYNCOBJ_TRANSFER	DRM_IOWR(0xCC, struct drm_syncobj_transfer)
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL	DRM_IOWR(0xCD, struct drm_syncobj_timeline_array)
+
 /**
  * Device specific ioctls should only be in their respective headers
  * The device specific ioctl range is from 0x40 to 0x9f.
diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h
index a439c2e67896..83cd1636b9be 100644
--- a/include/drm-uapi/drm_mode.h
+++ b/include/drm-uapi/drm_mode.h
@@ -33,7 +33,6 @@
 extern "C" {
 #endif
 
-#define DRM_DISPLAY_INFO_LEN	32
 #define DRM_CONNECTOR_NAME_LEN	32
 #define DRM_DISPLAY_MODE_LEN	32
 #define DRM_PROP_NAME_LEN	32
@@ -622,7 +621,8 @@ struct drm_color_ctm {
 
 struct drm_color_lut {
 	/*
-	 * Data is U0.16 fixed point format.
+	 * Values are mapped linearly to 0.0 - 1.0 range, with 0x0 == 0.0 and
+	 * 0xffff == 1.0.
 	 */
 	__u16 red;
 	__u16 green;
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index e01b3e1fd6d6..761517f15368 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -136,6 +136,8 @@ enum drm_i915_gem_engine_class {
 struct i915_engine_class_instance {
 	__u16 engine_class; /* see enum drm_i915_gem_engine_class */
 	__u16 engine_instance;
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL -2
 };
 
 /**
@@ -355,6 +357,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
 #define DRM_I915_QUERY			0x39
+#define DRM_I915_GEM_VM_CREATE		0x3a
+#define DRM_I915_GEM_VM_DESTROY		0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -415,6 +419,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -598,6 +604,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT	52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through use of explicit fence support.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1120,7 +1132,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/*
+ * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) represent
+ * a sync_file fd to wait upon (in a nonblocking manner) prior to executing
+ * the batch.
+ *
+ * Returns -EINVAL if the sync_file fd cannot be found.
+ */
+#define I915_EXEC_FENCE_SUBMIT		(1 << 20)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1464,8 +1485,9 @@ struct drm_i915_gem_context_create_ext {
 	__u32 ctx_id; /* output: id of new context*/
 	__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE	(1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-	(-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+	(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
 	__u64 extensions;
 };
 
@@ -1507,6 +1529,41 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE	0x8
+
+	/*
+	 * The id of the associated virtual memory address space (ppGTT) of
+	 * this context. Can be retrieved and passed to another context
+	 * (on the same fd) for both to use the same ppGTT and so share
+	 * address layouts, and avoid reloading the page tables on context
+	 * switches between themselves.
+	 *
+	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+	 */
+#define I915_CONTEXT_PARAM_VM		0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, instance).
+ * Use
+ *	engine_class: I915_ENGINE_CLASS_INVALID,
+ *	engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a zero
+ * sized argument, will revert back to default settings.
+ *
+ * See struct i915_context_param_engines.
+ *
+ * Extensions:
+ *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
+ *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
+ */
+#define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
@@ -1540,9 +1597,10 @@ struct drm_i915_gem_context_param_sseu {
 	struct i915_engine_class_instance engine;
 
 	/*
-	 * Unused for now. Must be cleared to zero.
+	 * Unknown flags must be cleared to zero.
 	 */
 	__u32 flags;
+#define I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX (1u << 0)
 
 	/*
 	 * Mask of slices to enable for the context. Valid values are a subset
@@ -1570,12 +1628,115 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * Into the I915_EXEC_DEFAULT slot [0], a virtual engine is created that when
+ * used will proxy the execbuffer request onto one of the set of engines
+ * in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+	struct i915_user_extension base;
+
+	__u16 engine_index;
+	__u16 num_siblings;
+	__u32 flags; /* all undefined flags must be zero */
+
+	__u64 mbz64; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name__, N__) struct { \
+	struct i915_user_extension base; \
+	__u16 engine_index; \
+	__u16 num_siblings; \
+	__u32 flags; \
+	__u64 mbz64; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+/*
+ * i915_context_engines_bond:
+ *
+ * Construct bonded pairs for execution within a virtual engine.
+ *
+ * All engines are equal, but some are more equal than others. Given
+ * the distribution of resources in the HW, it may be preferable to run
+ * a request on a given subset of engines in parallel to a request on a
+ * specific engine. We enable this selection of engines within a virtual
+ * engine by specifying bonding pairs, for any given master engine we will
+ * only execute on one of the corresponding siblings within the virtual engine.
+ *
+ * To execute a request in parallel on the master engine and a sibling requires
+ * coordination with a I915_EXEC_FENCE_SUBMIT.
+ */
+struct i915_context_engines_bond {
+	struct i915_user_extension base;
+
+	struct i915_engine_class_instance master;
+
+	__u16 virtual_index; /* index of virtual engine in ctx->engines[] */
+	__u16 num_bonds;
+
+	__u64 flags; /* all undefined flags must be zero */
+	__u64 mbz64[4]; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_BOND(name__, N__) struct { \
+	struct i915_user_extension base; \
+	struct i915_engine_class_instance master; \
+	__u16 virtual_index; \
+	__u16 num_bonds; \
+	__u64 flags; \
+	__u64 mbz64[4]; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+struct i915_context_param_engines {
+	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see i915_context_engines_load_balance */
+#define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
+	__u64 extensions; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
 struct drm_i915_gem_context_create_ext_setparam {
 #define I915_CONTEXT_CREATE_EXT_SETPARAM 0
 	struct i915_user_extension base;
 	struct drm_i915_gem_context_param param;
 };
 
+struct drm_i915_gem_context_create_ext_clone {
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
+	struct i915_user_extension base;
+	__u32 clone_id;
+	__u32 flags;
+#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
+#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
+#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
+#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
+#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
+#define I915_CONTEXT_CLONE_VM		(1u << 5)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+	__u64 rsvd;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
@@ -1821,6 +1982,7 @@ struct drm_i915_perf_oa_config {
 struct drm_i915_query_item {
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
+#define DRM_I915_QUERY_ENGINE_INFO	2
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -1919,6 +2081,47 @@ struct drm_i915_query_topology_info {
 	__u8 data[];
 };
 
+/**
+ * struct drm_i915_engine_info
+ *
+ * Describes one engine and its capabilities as known to the driver.
+ */
+struct drm_i915_engine_info {
+	/** Engine class and instance. */
+	struct i915_engine_class_instance engine;
+
+	/** Reserved field. */
+	__u32 rsvd0;
+
+	/** Engine flags. */
+	__u64 flags;
+
+	/** Capabilities of this engine. */
+	__u64 capabilities;
+#define I915_VIDEO_CLASS_CAPABILITY_HEVC		(1 << 0)
+#define I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC	(1 << 1)
+
+	/** Reserved fields. */
+	__u64 rsvd1[4];
+};
+
+/**
+ * struct drm_i915_query_engine_info
+ *
+ * Engine info query enumerates all engines known to the driver by filling in
+ * an array of struct drm_i915_engine_info structures.
+ */
+struct drm_i915_query_engine_info {
+	/** Number of struct drm_i915_engine_info structs following. */
+	__u32 num_engines;
+
+	/** MBZ */
+	__u32 rsvd[3];
+
+	/** Marker for drm_i915_engine_info structures. */
+	struct drm_i915_engine_info engines[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/drm-uapi/lima_drm.h b/include/drm-uapi/lima_drm.h
new file mode 100644
index 000000000000..95a00fb867e6
--- /dev/null
+++ b/include/drm-uapi/lima_drm.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */
+/* Copyright 2017-2018 Qiang Yu <yuq825@gmail.com> */
+
+#ifndef __LIMA_DRM_H__
+#define __LIMA_DRM_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+enum drm_lima_param_gpu_id {
+	DRM_LIMA_PARAM_GPU_ID_UNKNOWN,
+	DRM_LIMA_PARAM_GPU_ID_MALI400,
+	DRM_LIMA_PARAM_GPU_ID_MALI450,
+};
+
+enum drm_lima_param {
+	DRM_LIMA_PARAM_GPU_ID,
+	DRM_LIMA_PARAM_NUM_PP,
+	DRM_LIMA_PARAM_GP_VERSION,
+	DRM_LIMA_PARAM_PP_VERSION,
+};
+
+/**
+ * get various information of the GPU
+ */
+struct drm_lima_get_param {
+	__u32 param; /* in, value in enum drm_lima_param */
+	__u32 pad;   /* pad, must be zero */
+	__u64 value; /* out, parameter value */
+};
+
+/**
+ * create a buffer for use by the GPU
+ */
+struct drm_lima_gem_create {
+	__u32 size;    /* in, buffer size */
+	__u32 flags;   /* in, currently no flags, must be zero */
+	__u32 handle;  /* out, GEM buffer handle */
+	__u32 pad;     /* pad, must be zero */
+};
+
+/**
+ * get information of a buffer
+ */
+struct drm_lima_gem_info {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 va;      /* out, virtual address mapped into GPU MMU */
+	__u64 offset;  /* out, used to mmap this buffer to CPU */
+};
+
+#define LIMA_SUBMIT_BO_READ   0x01
+#define LIMA_SUBMIT_BO_WRITE  0x02
+
+/* buffer information used by one task */
+struct drm_lima_gem_submit_bo {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 flags;   /* in, buffer read/write by GPU */
+};
+
+#define LIMA_GP_FRAME_REG_NUM 6
+
+/* frame used to setup GP for each task */
+struct drm_lima_gp_frame {
+	__u32 frame[LIMA_GP_FRAME_REG_NUM];
+};
+
+#define LIMA_PP_FRAME_REG_NUM 23
+#define LIMA_PP_WB_REG_NUM 12
+
+/* frame used to setup mali400 GPU PP for each task */
+struct drm_lima_m400_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 plbu_array_address[4];
+	__u32 fragment_stack_address[4];
+};
+
+/* frame used to setup mali450 GPU PP for each task */
+struct drm_lima_m450_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 use_dlbu;
+	__u32 _pad;
+	union {
+		__u32 plbu_array_address[8];
+		__u32 dlbu_regs[4];
+	};
+	__u32 fragment_stack_address[8];
+};
+
+#define LIMA_PIPE_GP  0x00
+#define LIMA_PIPE_PP  0x01
+
+#define LIMA_SUBMIT_FLAG_EXPLICIT_FENCE (1 << 0)
+
+/**
+ * submit a task to GPU
+ *
+ * User can always merge multi sync_file and drm_syncobj
+ * into one drm_syncobj as in_sync[0], but we reserve
+ * in_sync[1] for another task's out_sync to avoid the
+ * export/import/merge pass when explicit sync.
+ */
+struct drm_lima_gem_submit {
+	__u32 ctx;         /* in, context handle task is submitted to */
+	__u32 pipe;        /* in, which pipe to use, GP/PP */
+	__u32 nr_bos;      /* in, array length of bos field */
+	__u32 frame_size;  /* in, size of frame field */
+	__u64 bos;         /* in, array of drm_lima_gem_submit_bo */
+	__u64 frame;       /* in, GP/PP frame */
+	__u32 flags;       /* in, submit flags */
+	__u32 out_sync;    /* in, drm_syncobj handle used to wait task finish after submission */
+	__u32 in_sync[2];  /* in, drm_syncobj handle used to wait before start this task */
+};
+
+#define LIMA_GEM_WAIT_READ   0x01
+#define LIMA_GEM_WAIT_WRITE  0x02
+
+/**
+ * wait pending GPU task finish of a buffer
+ */
+struct drm_lima_gem_wait {
+	__u32 handle;      /* in, GEM buffer handle */
+	__u32 op;          /* in, CPU want to read/write this buffer */
+	__s64 timeout_ns;  /* in, wait timeout in absolute time */
+};
+
+/**
+ * create a context
+ */
+struct drm_lima_ctx_create {
+	__u32 id;          /* out, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+/**
+ * free a context
+ */
+struct drm_lima_ctx_free {
+	__u32 id;          /* in, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+#define DRM_LIMA_GET_PARAM   0x00
+#define DRM_LIMA_GEM_CREATE  0x01
+#define DRM_LIMA_GEM_INFO    0x02
+#define DRM_LIMA_GEM_SUBMIT  0x03
+#define DRM_LIMA_GEM_WAIT    0x04
+#define DRM_LIMA_CTX_CREATE  0x05
+#define DRM_LIMA_CTX_FREE    0x06
+
+#define DRM_IOCTL_LIMA_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GET_PARAM, struct drm_lima_get_param)
+#define DRM_IOCTL_LIMA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_CREATE, struct drm_lima_gem_create)
+#define DRM_IOCTL_LIMA_GEM_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_INFO, struct drm_lima_gem_info)
+#define DRM_IOCTL_LIMA_GEM_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_SUBMIT, struct drm_lima_gem_submit)
+#define DRM_IOCTL_LIMA_GEM_WAIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_WAIT, struct drm_lima_gem_wait)
+#define DRM_IOCTL_LIMA_CTX_CREATE DRM_IOR(DRM_COMMAND_BASE + DRM_LIMA_CTX_CREATE, struct drm_lima_ctx_create)
+#define DRM_IOCTL_LIMA_CTX_FREE DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_CTX_FREE, struct drm_lima_ctx_free)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __LIMA_DRM_H__ */
diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h
index 91a16b333c69..0b85ed6a3710 100644
--- a/include/drm-uapi/msm_drm.h
+++ b/include/drm-uapi/msm_drm.h
@@ -74,6 +74,8 @@ struct drm_msm_timespec {
 #define MSM_PARAM_TIMESTAMP  0x05
 #define MSM_PARAM_GMEM_BASE  0x06
 #define MSM_PARAM_NR_RINGS   0x07
+#define MSM_PARAM_PP_PGTABLE 0x08  /* => 1 for per-process pagetables, else 0 */
+#define MSM_PARAM_FAULTS     0x09
 
 struct drm_msm_param {
 	__u32 pipe;           /* in, MSM_PIPE_x */
@@ -286,6 +288,16 @@ struct drm_msm_submitqueue {
 	__u32 id;      /* out, identifier */
 };
 
+#define MSM_SUBMITQUEUE_PARAM_FAULTS   0
+
+struct drm_msm_submitqueue_query {
+	__u64 data;
+	__u32 id;
+	__u32 param;
+	__u32 len;
+	__u32 pad;
+};
+
 #define DRM_MSM_GET_PARAM              0x00
 /* placeholder:
 #define DRM_MSM_SET_PARAM              0x01
@@ -302,6 +314,7 @@ struct drm_msm_submitqueue {
  */
 #define DRM_MSM_SUBMITQUEUE_NEW        0x0A
 #define DRM_MSM_SUBMITQUEUE_CLOSE      0x0B
+#define DRM_MSM_SUBMITQUEUE_QUERY      0x0C
 
 #define DRM_IOCTL_MSM_GET_PARAM        DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GET_PARAM, struct drm_msm_param)
 #define DRM_IOCTL_MSM_GEM_NEW          DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_NEW, struct drm_msm_gem_new)
@@ -313,6 +326,7 @@ struct drm_msm_submitqueue {
 #define DRM_IOCTL_MSM_GEM_MADVISE      DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_MADVISE, struct drm_msm_gem_madvise)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_NEW    DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_NEW, struct drm_msm_submitqueue)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_CLOSE  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_CLOSE, __u32)
+#define DRM_IOCTL_MSM_SUBMITQUEUE_QUERY  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_QUERY, struct drm_msm_submitqueue_query)
 
 #if defined(__cplusplus)
 }
diff --git a/include/drm-uapi/nouveau_drm.h b/include/drm-uapi/nouveau_drm.h
index 259588a4b61b..9459a6e3bc1f 100644
--- a/include/drm-uapi/nouveau_drm.h
+++ b/include/drm-uapi/nouveau_drm.h
@@ -133,12 +133,63 @@ struct drm_nouveau_gem_cpu_fini {
 #define DRM_NOUVEAU_NOTIFIEROBJ_ALLOC  0x05 /* deprecated */
 #define DRM_NOUVEAU_GPUOBJ_FREE        0x06 /* deprecated */
 #define DRM_NOUVEAU_NVIF               0x07
+#define DRM_NOUVEAU_SVM_INIT           0x08
+#define DRM_NOUVEAU_SVM_BIND           0x09
 #define DRM_NOUVEAU_GEM_NEW            0x40
 #define DRM_NOUVEAU_GEM_PUSHBUF        0x41
 #define DRM_NOUVEAU_GEM_CPU_PREP       0x42
 #define DRM_NOUVEAU_GEM_CPU_FINI       0x43
 #define DRM_NOUVEAU_GEM_INFO           0x44
 
+struct drm_nouveau_svm_init {
+	__u64 unmanaged_addr;
+	__u64 unmanaged_size;
+};
+
+struct drm_nouveau_svm_bind {
+	__u64 header;
+	__u64 va_start;
+	__u64 va_end;
+	__u64 npages;
+	__u64 stride;
+	__u64 result;
+	__u64 reserved0;
+	__u64 reserved1;
+};
+
+#define NOUVEAU_SVM_BIND_COMMAND_SHIFT          0
+#define NOUVEAU_SVM_BIND_COMMAND_BITS           8
+#define NOUVEAU_SVM_BIND_COMMAND_MASK           ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_PRIORITY_SHIFT         8
+#define NOUVEAU_SVM_BIND_PRIORITY_BITS          8
+#define NOUVEAU_SVM_BIND_PRIORITY_MASK          ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_TARGET_SHIFT           16
+#define NOUVEAU_SVM_BIND_TARGET_BITS            32
+#define NOUVEAU_SVM_BIND_TARGET_MASK            0xffffffff
+
+/*
+ * The mask below is used to validate the ioctl argument; userspace can also
+ * use it to make sure that no bits are set beyond the fields known to a given
+ * kernel version.
+ */
+#define NOUVEAU_SVM_BIND_VALID_BITS     48
+#define NOUVEAU_SVM_BIND_VALID_MASK     ((1ULL << NOUVEAU_SVM_BIND_VALID_BITS) - 1)
+
+
+/*
+ * NOUVEAU_SVM_BIND_COMMAND__MIGRATE: synchronous migrate to target memory.
+ * result: number of pages successfully migrated to the target memory.
+ */
+#define NOUVEAU_SVM_BIND_COMMAND__MIGRATE               0
+
+/*
+ * NOUVEAU_SVM_BIND_TARGET__GPU_VRAM: target the GPU VRAM memory.
+ */
+#define NOUVEAU_SVM_BIND_TARGET__GPU_VRAM               (1UL << 31)
+
+
+#define DRM_IOCTL_NOUVEAU_SVM_INIT           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_INIT, struct drm_nouveau_svm_init)
+#define DRM_IOCTL_NOUVEAU_SVM_BIND           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_BIND, struct drm_nouveau_svm_bind)
+
 #define DRM_IOCTL_NOUVEAU_GEM_NEW            DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_NEW, struct drm_nouveau_gem_new)
 #define DRM_IOCTL_NOUVEAU_GEM_PUSHBUF        DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_PUSHBUF, struct drm_nouveau_gem_pushbuf)
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_PREP       DRM_IOW (DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_CPU_PREP, struct drm_nouveau_gem_cpu_prep)
diff --git a/include/drm-uapi/panfrost_drm.h b/include/drm-uapi/panfrost_drm.h
new file mode 100644
index 000000000000..a52e0283b90d
--- /dev/null
+++ b/include/drm-uapi/panfrost_drm.h
@@ -0,0 +1,142 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2014-2018 Broadcom
+ * Copyright © 2019 Collabora ltd.
+ */
+#ifndef _PANFROST_DRM_H_
+#define _PANFROST_DRM_H_
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+#define DRM_PANFROST_SUBMIT			0x00
+#define DRM_PANFROST_WAIT_BO			0x01
+#define DRM_PANFROST_CREATE_BO			0x02
+#define DRM_PANFROST_MMAP_BO			0x03
+#define DRM_PANFROST_GET_PARAM			0x04
+#define DRM_PANFROST_GET_BO_OFFSET		0x05
+
+#define DRM_IOCTL_PANFROST_SUBMIT		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
+#define DRM_IOCTL_PANFROST_WAIT_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
+#define DRM_IOCTL_PANFROST_CREATE_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_CREATE_BO, struct drm_panfrost_create_bo)
+#define DRM_IOCTL_PANFROST_MMAP_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_MMAP_BO, struct drm_panfrost_mmap_bo)
+#define DRM_IOCTL_PANFROST_GET_PARAM		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_PARAM, struct drm_panfrost_get_param)
+#define DRM_IOCTL_PANFROST_GET_BO_OFFSET	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_BO_OFFSET, struct drm_panfrost_get_bo_offset)
+
+#define PANFROST_JD_REQ_FS (1 << 0)
+/**
+ * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
+ * engine.
+ *
+ * This asks the kernel to have the GPU execute a render command list.
+ */
+struct drm_panfrost_submit {
+
+	/** Address to GPU mapping of job descriptor */
+	__u64 jc;
+
+	/** An optional array of sync objects to wait on before starting this job. */
+	__u64 in_syncs;
+
+	/** Number of sync objects to wait on before starting this job. */
+	__u32 in_sync_count;
+
+	/** An optional sync object to place the completion fence in. */
+	__u32 out_sync;
+
+	/** Pointer to a u32 array of the BOs that are referenced by the job. */
+	__u64 bo_handles;
+
+	/** Number of BO handles passed in (size is that times 4). */
+	__u32 bo_handle_count;
+
+	/** A combination of PANFROST_JD_REQ_* */
+	__u32 requirements;
+};
+
+/**
+ * struct drm_panfrost_wait_bo - ioctl argument for waiting for
+ * completion of the last DRM_PANFROST_SUBMIT on a BO.
+ *
+ * This is useful for cases where multiple processes might be
+ * rendering to a BO and you want to wait for all rendering to be
+ * completed.
+ */
+struct drm_panfrost_wait_bo {
+	__u32 handle;
+	__u32 pad;
+	__s64 timeout_ns;	/* absolute */
+};
+
+/**
+ * struct drm_panfrost_create_bo - ioctl argument for creating Panfrost BOs.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_panfrost_create_bo {
+	__u32 size;
+	__u32 flags;
+	/** Returned GEM handle for the BO. */
+	__u32 handle;
+	/* Pad, must be zero-filled. */
+	__u32 pad;
+	/**
+	 * Returned offset for the BO in the GPU address space.  This offset
+	 * is private to the DRM fd and is valid for the lifetime of the GEM
+	 * handle.
+	 *
+	 * This offset value will always be nonzero, since various HW
+	 * units treat 0 specially.
+	 */
+	__u64 offset;
+};
+
+/**
+ * struct drm_panfrost_mmap_bo - ioctl argument for mapping Panfrost BOs.
+ *
+ * This doesn't actually perform an mmap.  Instead, it returns the
+ * offset you need to use in an mmap on the DRM device node.  This
+ * means that tools like valgrind end up knowing about the mapped
+ * memory.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_panfrost_mmap_bo {
+	/** Handle for the object being mapped. */
+	__u32 handle;
+	__u32 flags;
+	/** offset into the drm node to use for subsequent mmap call. */
+	__u64 offset;
+};
+
+enum drm_panfrost_param {
+	DRM_PANFROST_PARAM_GPU_PROD_ID,
+};
+
+struct drm_panfrost_get_param {
+	__u32 param;
+	__u32 pad;
+	__u64 value;
+};
+
+/**
+ * Returns the offset for the BO in the GPU address space for this DRM fd.
+ * This is the same value returned by drm_panfrost_create_bo, if that was called
+ * from this DRM fd.
+ */
+struct drm_panfrost_get_bo_offset {
+	__u32 handle;
+	__u32 pad;
+	__u64 offset;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _PANFROST_DRM_H_ */
diff --git a/include/drm-uapi/v3d_drm.h b/include/drm-uapi/v3d_drm.h
index ea70669d2138..58fbe48c91e9 100644
--- a/include/drm-uapi/v3d_drm.h
+++ b/include/drm-uapi/v3d_drm.h
@@ -37,6 +37,7 @@ extern "C" {
 #define DRM_V3D_GET_PARAM                         0x04
 #define DRM_V3D_GET_BO_OFFSET                     0x05
 #define DRM_V3D_SUBMIT_TFU                        0x06
+#define DRM_V3D_SUBMIT_CSD                        0x07
 
 #define DRM_IOCTL_V3D_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
 #define DRM_IOCTL_V3D_WAIT_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
@@ -45,6 +46,7 @@ extern "C" {
 #define DRM_IOCTL_V3D_GET_PARAM           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
 #define DRM_IOCTL_V3D_GET_BO_OFFSET       DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
 #define DRM_IOCTL_V3D_SUBMIT_TFU          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_TFU, struct drm_v3d_submit_tfu)
+#define DRM_IOCTL_V3D_SUBMIT_CSD          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CSD, struct drm_v3d_submit_csd)
 
 /**
  * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
@@ -190,6 +192,7 @@ enum drm_v3d_param {
 	DRM_V3D_PARAM_V3D_CORE0_IDENT1,
 	DRM_V3D_PARAM_V3D_CORE0_IDENT2,
 	DRM_V3D_PARAM_SUPPORTS_TFU,
+	DRM_V3D_PARAM_SUPPORTS_CSD,
 };
 
 struct drm_v3d_get_param {
@@ -230,6 +233,31 @@ struct drm_v3d_submit_tfu {
 	__u32 out_sync;
 };
 
+/* Submits a compute shader for dispatch.  This job will block on any
+ * previous compute shaders submitted on this fd, and any other
+ * synchronization must be performed with in_sync/out_sync.
+ */
+struct drm_v3d_submit_csd {
+	__u32 cfg[7];
+	__u32 coef[4];
+
+	/* Pointer to a u32 array of the BOs that are referenced by the job.
+	 */
+	__u64 bo_handles;
+
+	/* Number of BO handles passed in (size is that times 4). */
+	__u32 bo_handle_count;
+
+	/* sync object to block on before running the CSD job.  Each
+	 * CSD job will execute in the order submitted to its FD.
+	 * Synchronization against rendering/TFU jobs or CSD from
+	 * other fds requires using sync objects.
+	 */
+	__u32 in_sync;
+	/* Sync object to signal when the CSD job is done. */
+	__u32 out_sync;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.19.1


* [PATCH i-g-t 03/21] trace.pl: Virtual engine support
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add virtual/queue timelines to both stdout and HTML output.

A new timeline is created for each queue/virtual engine to display
associated requests in queued and runnable states. Once requests are
submitted to a real engine for execution, they show up on the physical
engine timeline.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 scripts/trace.pl | 238 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 208 insertions(+), 30 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 95dc3a645e8e..6cc332bb6e2a 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,11 +27,16 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
+my (%db, %vdb, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
     %ctxtimelines, %ctxengines);
+my (%cids, %ctxmap);
+my $cid = 0;
+my %queues;
 my @freqs;
 
-my $max_items = 3000;
+use constant VENG => '255:254';
+
+my $max_requests = 1000;
 my $width_us = 32000;
 my $correct_durations = 0;
 my %ignore_ring;
@@ -181,21 +186,21 @@ sub arg_trace
 	return @_;
 }
 
-sub arg_max_items
+sub arg_max_requests
 {
 	my $val;
 
 	return unless scalar(@_);
 
-	if ($_[0] eq '--max-items' or $_[0] eq '-m') {
+	if ($_[0] eq '--max-requests' or $_[0] eq '-m') {
 		shift @_;
 		$val = shift @_;
-	} elsif ($_[0] =~ /--max-items=(\d+)/) {
+	} elsif ($_[0] =~ /--max-requests=(\d+)/) {
 		shift @_;
 		$val = $1;
 	}
 
-	$max_items = int($val) if defined $val;
+	$max_requests = int($val) if defined $val;
 
 	return @_;
 }
@@ -292,7 +297,7 @@ while (@args) {
 	@args = arg_avg_delay_stats(@args);
 	@args = arg_gpu_timeline(@args);
 	@args = arg_trace(@args);
-	@args = arg_max_items(@args);
+	@args = arg_max_requests(@args);
 	@args = arg_zoom_width(@args);
 	@args = arg_split_requests(@args);
 	@args = arg_ignore_ring(@args);
@@ -324,6 +329,13 @@ sub sanitize_ctx
 	}
 }
 
+sub is_veng
+{
+	my ($class, $instance) = split ':', shift;
+
+	return $instance eq '254';
+}
+
 # Main input loop - parse lines and build the internal representation of the
 # trace using a hash of requests and some auxilliary data structures.
 my $prev_freq = 0;
@@ -366,6 +378,7 @@ while (<>) {
 			$ctx = $tp{'ctx'};
 			$orig_ctx = $ctx;
 			$ctx = sanitize_ctx($ctx, $ring);
+			$ring = VENG if is_veng($ring);
 			$key = db_key($ring, $ctx, $seqno);
 		}
 	}
@@ -374,6 +387,7 @@ while (<>) {
 		my %rw;
 
 		next if exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$rw{'key'} = $key;
 		$rw{'ring'} = $ring;
@@ -382,9 +396,19 @@ while (<>) {
 		$rw{'start'} = $time;
 		$reqwait{$key} = \%rw;
 	} elsif ($tp_name eq 'i915:i915_request_wait_end:') {
-		next unless exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
-		$reqwait{$key}->{'end'} = $time;
+		if (exists $reqwait{$key}) {
+			$reqwait{$key}->{'end'} = $time;
+		} else { # Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+
+			die unless exists $reqwait{$vkey};
+
+			# If the wait started on the virtual engine, attribute
+			# it to it completely.
+			$reqwait{$vkey}->{'end'} = $time;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_add:') {
 		if (exists $queue{$key}) {
 			$ctxdb{$orig_ctx}++;
@@ -395,19 +419,52 @@ while (<>) {
 		}
 
 		$queue{$key} = $time;
+		if ($ring eq VENG and not exists $queues{$ctx}) {
+			$queues{$ctx} = 1 ;
+			$cids{$ctx} = $cid++;
+			$ctxmap{$cids{$ctx}} = $ctx;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_submit:') {
 		die if exists $submit{$key};
 		die unless exists $queue{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$submit{$key} = $time;
 	} elsif ($tp_name eq 'i915:i915_request_in:') {
+		my ($q, $s);
 		my %req;
 
 		# preemption
 		delete $db{$key} if exists $db{$key};
 
-		die unless exists $queue{$key};
-		die unless exists $submit{$key};
+		unless (exists $queue{$key}) {
+			# Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+			my %req;
+
+			die unless exists $queues{$ctx};
+			die unless exists $queue{$vkey};
+			die unless exists $submit{$vkey};
+
+			# Create separate request record on the queue timeline
+			$q = $queue{$vkey};
+			$s = $submit{$vkey};
+			$req{'queue'} = $q;
+			$req{'submit'} = $s;
+			$req{'start'} = $time;
+			$req{'end'} = $time;
+			$req{'ring'} = VENG;
+			$req{'seqno'} = $seqno;
+			$req{'ctx'} = $ctx;
+			$req{'name'} = $ctx . '/' . $seqno;
+			$req{'global'} = $tp{'global'};
+			$req{'port'} = $tp{'port'};
+
+			$vdb{$vkey} = \%req;
+		} else {
+			$q = $queue{$key};
+			$s = $submit{$key};
+		}
 
 		$req{'start'} = $time;
 		$req{'ring'} = $ring;
@@ -419,8 +476,9 @@ while (<>) {
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
-		$req{'queue'} = $queue{$key};
-		$req{'submit'} = $submit{$key};
+		$req{'queue'} = $q;
+		$req{'submit'} = $s;
+		$req{'virtual'} = 1 if exists $queues{$ctx};
 		$rings{$ring} = $gid++ unless exists $rings{$ring};
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
@@ -720,8 +778,10 @@ foreach my $key (@sorted_keys) {
 
 	$running{$ring} += $end - $start if $correct_durations or
 					    not exists $db{$key}->{'no-end'};
-	$runnable{$ring} += $db{$key}->{'execute-delay'};
-	$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	unless (exists $db{$key}->{'virtual'}) {
+		$runnable{$ring} += $db{$key}->{'execute-delay'};
+		$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	}
 
 	$batch_count{$ring}++;
 
@@ -840,6 +900,12 @@ foreach my $key (keys %reqwait) {
 	$reqw{$reqwait{$key}->{'ring'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
 }
 
+# Add up all request waits per virtual engine
+my %vreqw;
+foreach my $key (keys %reqwait) {
+	$vreqw{$reqwait{$key}->{'ctx'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
+}
+
 say sprintf('GPU: %.2f%% idle, %.2f%% busy',
 	     $flat_busy{'gpu-idle'}, $flat_busy{'gpu-busy'}) unless $html;
 
@@ -961,18 +1027,24 @@ ENDHTML
 sub html_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $name;
 
-	$name = 'Ring' . $group;
+	$name = $veng ? 'Virtual' : 'Ring';
+	$name .= $group;
 	$name .= '<br><small><br>';
-	$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
-	$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
+		$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	}
 	$name .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable<br>';
 	$name .= sprintf('%.2f', $stats->{'queued'}) . '% queued<br><br>';
 	$name .= sprintf('%.2f', $stats->{'wait'}) . '% wait<br><br>';
 	$name .= $stats->{'count'} . ' batches<br>';
-	$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
-	$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
+		$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	}
 	$name .= '</small>';
 
 	print "\t{id: $id, content: '$name'},\n";
@@ -981,17 +1053,24 @@ sub html_stats
 sub stdio_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $str;
 
-	$str = 'Ring' . $group . ': ';
+	$str = $veng ? 'Virtual' : 'Ring';
+	$str .= $group . ': ';
 	$str .= $stats->{'count'} . ' batches, ';
-	$str .= sprintf('%.2f (%.2f) avg batch us, ', $stats->{'avg'}, $stats->{'total-avg'});
-	$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
-	$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	unless ($veng) {
+		$str .= sprintf('%.2f (%.2f) avg batch us, ',
+				$stats->{'avg'}, $stats->{'total-avg'});
+		$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
+		$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	}
+
 	$str .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable, ';
 	$str .= sprintf('%.2f', $stats->{'queued'}) . '% queued, ';
 	$str .= sprintf('%.2f', $stats->{'wait'}) . '% wait';
-	if ($avg_delay_stats) {
+
+	if ($avg_delay_stats and not $veng) {
 		$str .= ', submit/execute/save-avg=(';
 		$str .= sprintf('%.2f/%.2f/%.2f)', $stats->{'submit'}, $stats->{'execute'}, $stats->{'save'});
 	}
@@ -1013,8 +1092,16 @@ foreach my $group (sort keys %rings) {
 
 	$stats{'idle'} = (1.0 - $flat_busy{$ring} / $elapsed) * 100.0;
 	$stats{'busy'} = $running{$ring} / $elapsed * 100.0;
-	$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
-	$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	if (exists $runnable{$ring}) {
+		$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $queued{$ring}) {
+		$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
 	$reqw{$ring} = 0 unless exists $reqw{$ring};
 	$stats{'wait'} = $reqw{$ring} / $elapsed * 100.0;
 	$stats{'count'} = $batch_count{$ring};
@@ -1031,6 +1118,59 @@ foreach my $group (sort keys %rings) {
 	}
 }
 
+sub sortVQueue {
+	my $as = $vdb{$a}->{'queue'};
+	my $bs = $vdb{$b}->{'queue'};
+	my $val;
+
+	$val = $as <=> $bs;
+	$val = $a cmp $b if $val == 0;
+
+	return $val;
+}
+
+my @sorted_vkeys = sort sortVQueue keys %vdb;
+my (%vqueued, %vrunnable);
+
+foreach my $key (@sorted_vkeys) {
+	my $ctx = $vdb{$key}->{'ctx'};
+
+	$vdb{$key}->{'submit-delay'} = $vdb{$key}->{'submit'} - $vdb{$key}->{'queue'};
+	$vdb{$key}->{'execute-delay'} = $vdb{$key}->{'start'} - $vdb{$key}->{'submit'};
+
+	$vqueued{$ctx} += $vdb{$key}->{'submit-delay'};
+	$vrunnable{$ctx} += $vdb{$key}->{'execute-delay'};
+}
+
+my $veng_id = $engine_start_id + scalar(keys %rings);
+
+foreach my $cid (sort keys %ctxmap) {
+	my $ctx = $ctxmap{$cid};
+	my $elapsed = $last_ts - $first_ts;
+	my %stats;
+
+	$stats{'virtual'} = 1;
+	if (exists $vrunnable{$ctx}) {
+		$stats{'runnable'} = $vrunnable{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $vqueued{$ctx}) {
+		$stats{'queued'} = $vqueued{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
+	$vreqw{$ctx} = 0 unless exists $vreqw{$ctx};
+	$stats{'wait'} = $vreqw{$ctx} / $elapsed * 100.0;
+	$stats{'count'} = scalar(grep {$ctx == $vdb{$_}->{'ctx'}} keys %vdb);
+
+	if ($html) {
+		html_stats(\%stats, $cid, $veng_id++);
+	} else {
+		stdio_stats(\%stats, $cid, $veng_id++);
+	}
+}
+
 exit 0 unless $html;
 
 print <<ENDHTML;
@@ -1134,6 +1274,7 @@ sub box_style
 }
 
 my $i = 0;
+my $req = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
@@ -1147,7 +1288,7 @@ foreach my $key (sort sortQueue keys %db) {
 	my $skey;
 
 	# submit to execute
-	unless (exists $skip_box{'queue'}) {
+	unless (exists $skip_box{'queue'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
 		$style = box_style($ctx, 'queue');
 		$content = "$name<br>$db{$key}->{'submit-delay'}us <small>($db{$key}->{'execute-delay'}us)</small>";
@@ -1158,7 +1299,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	# execute to start
 	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
-	unless (exists $skip_box{'ready'}) {
+	unless (exists $skip_box{'ready'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 'ready');
 		$content = "<small>$name<br>$db{$key}->{'execute-delay'}us</small>";
@@ -1199,7 +1340,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	$last_ts = $end;
 
-	last if $i > $max_items;
+	last if ++$req > $max_requests;
 }
 
 push @freqs, [$prev_freq_ts, $last_ts, $prev_freq] if $prev_freq;
@@ -1232,6 +1373,43 @@ if ($gpu_timeline) {
 	}
 }
 
+$req = 0;
+$veng_id = $engine_start_id + scalar(keys %rings);
+foreach my $key (@sorted_vkeys) {
+	my ($name, $ctx, $seqno) = ($vdb{$key}->{'name'}, $vdb{$key}->{'ctx'}, $vdb{$key}->{'seqno'});
+	my $queue = $vdb{$key}->{'queue'};
+	my $submit = $vdb{$key}->{'submit'};
+	my $engine_start = $db{$key}->{'engine-start'};
+	my ($content, $style, $startend, $skey);
+	my $group = $veng_id + $cids{$ctx};
+	my $subgroup = $ctx - $min_ctx;
+	my $type = ' type: \'range\',';
+	my $duration;
+
+	# submit to execute
+	unless (exists $skip_box{'queue'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
+		$style = box_style($ctx, 'queue');
+		$content = "$name<br>$vdb{$key}->{'submit-delay'}us <small>($vdb{$key}->{'execute-delay'}us)</small>";
+		$startend = 'start: ' . $queue . ', end: ' . $submit;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	# execute to start
+	$engine_start = $vdb{$key}->{'start'} unless defined $engine_start;
+	unless (exists $skip_box{'ready'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
+		$style = box_style($ctx, 'ready');
+		$content = "<small>$name<br>$vdb{$key}->{'execute-delay'}us</small>";
+		$startend = 'start: ' . $submit . ', end: ' . $engine_start;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	last if ++$req > $max_requests;
+}
+
 my $end_ts = $first_ts + $width_us;
 $first_ts = $first_ts;
 
-- 
2.19.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [igt-dev] [PATCH i-g-t 03/21] trace.pl: Virtual engine support
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add virtual/queue timelines to both stdout and HTML output.

A new timeline is created for each queue/virtual engine to display the
associated requests while they are in the queued and runnable states. Once
requests are submitted to a real engine for execution, they show up on the
physical engine timeline.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
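As an illustration only (not part of the patch), here is a minimal sketch
of the per-queue accounting described above. It is written in C to match
gem_wsim.c rather than taken from trace.pl, and the names vreq, vstats and
veng_stats are hypothetical. Only the arithmetic follows the script: queued
time is submit minus queue, runnable time is start minus submit, and both
are reported as a percentage of the elapsed trace time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record mirroring the per-request fields kept in %vdb. */
struct vreq {
	uint64_t queue;		/* i915_request_add timestamp (us) */
	uint64_t submit;	/* i915_request_submit timestamp (us) */
	uint64_t start;		/* i915_request_in timestamp (us) */
};

/* Hypothetical summary for one virtual engine (one context). */
struct vstats {
	double queued_pct;
	double runnable_pct;
};

/* Sum the queued and runnable time of all requests of one context. */
static struct vstats
veng_stats(const struct vreq *reqs, unsigned int nr, uint64_t elapsed_us)
{
	uint64_t queued = 0, runnable = 0;
	struct vstats s = { 0.0, 0.0 };
	unsigned int i;

	for (i = 0; i < nr; i++) {
		queued += reqs[i].submit - reqs[i].queue;	/* submit-delay */
		runnable += reqs[i].start - reqs[i].submit;	/* execute-delay */
	}

	/* Guard against an empty trace to avoid dividing by zero. */
	if (elapsed_us) {
		s.queued_pct = (double)queued / elapsed_us * 100.0;
		s.runnable_pct = (double)runnable / elapsed_us * 100.0;
	}

	return s;
}
```

Idle and busy figures are left out on purpose: a virtual engine never
executes anything itself, which is why the script prints those only for
physical engines.
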
 scripts/trace.pl | 238 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 208 insertions(+), 30 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 95dc3a645e8e..6cc332bb6e2a 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,11 +27,16 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
+my (%db, %vdb, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
     %ctxtimelines, %ctxengines);
+my (%cids, %ctxmap);
+my $cid = 0;
+my %queues;
 my @freqs;
 
-my $max_items = 3000;
+use constant VENG => '255:254';
+
+my $max_requests = 1000;
 my $width_us = 32000;
 my $correct_durations = 0;
 my %ignore_ring;
@@ -181,21 +186,21 @@ sub arg_trace
 	return @_;
 }
 
-sub arg_max_items
+sub arg_max_requests
 {
 	my $val;
 
 	return unless scalar(@_);
 
-	if ($_[0] eq '--max-items' or $_[0] eq '-m') {
+	if ($_[0] eq '--max-requests' or $_[0] eq '-m') {
 		shift @_;
 		$val = shift @_;
-	} elsif ($_[0] =~ /--max-items=(\d+)/) {
+	} elsif ($_[0] =~ /--max-requests=(\d+)/) {
 		shift @_;
 		$val = $1;
 	}
 
-	$max_items = int($val) if defined $val;
+	$max_requests = int($val) if defined $val;
 
 	return @_;
 }
@@ -292,7 +297,7 @@ while (@args) {
 	@args = arg_avg_delay_stats(@args);
 	@args = arg_gpu_timeline(@args);
 	@args = arg_trace(@args);
-	@args = arg_max_items(@args);
+	@args = arg_max_requests(@args);
 	@args = arg_zoom_width(@args);
 	@args = arg_split_requests(@args);
 	@args = arg_ignore_ring(@args);
@@ -324,6 +329,13 @@ sub sanitize_ctx
 	}
 }
 
+sub is_veng
+{
+	my ($class, $instance) = split ':', shift;
+
+	return $instance eq '254';
+}
+
 # Main input loop - parse lines and build the internal representation of the
 # trace using a hash of requests and some auxilliary data structures.
 my $prev_freq = 0;
@@ -366,6 +378,7 @@ while (<>) {
 			$ctx = $tp{'ctx'};
 			$orig_ctx = $ctx;
 			$ctx = sanitize_ctx($ctx, $ring);
+			$ring = VENG if is_veng($ring);
 			$key = db_key($ring, $ctx, $seqno);
 		}
 	}
@@ -374,6 +387,7 @@ while (<>) {
 		my %rw;
 
 		next if exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$rw{'key'} = $key;
 		$rw{'ring'} = $ring;
@@ -382,9 +396,19 @@ while (<>) {
 		$rw{'start'} = $time;
 		$reqwait{$key} = \%rw;
 	} elsif ($tp_name eq 'i915:i915_request_wait_end:') {
-		next unless exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
-		$reqwait{$key}->{'end'} = $time;
+		if (exists $reqwait{$key}) {
+			$reqwait{$key}->{'end'} = $time;
+		} else { # Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+
+			die unless exists $reqwait{$vkey};
+
+			# If the wait started on the virtual engine, attribute
+			# it to it completely.
+			$reqwait{$vkey}->{'end'} = $time;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_add:') {
 		if (exists $queue{$key}) {
 			$ctxdb{$orig_ctx}++;
@@ -395,19 +419,52 @@ while (<>) {
 		}
 
 		$queue{$key} = $time;
+		if ($ring eq VENG and not exists $queues{$ctx}) {
+			$queues{$ctx} = 1 ;
+			$cids{$ctx} = $cid++;
+			$ctxmap{$cids{$ctx}} = $ctx;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_submit:') {
 		die if exists $submit{$key};
 		die unless exists $queue{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$submit{$key} = $time;
 	} elsif ($tp_name eq 'i915:i915_request_in:') {
+		my ($q, $s);
 		my %req;
 
 		# preemption
 		delete $db{$key} if exists $db{$key};
 
-		die unless exists $queue{$key};
-		die unless exists $submit{$key};
+		unless (exists $queue{$key}) {
+			# Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+			my %req;
+
+			die unless exists $queues{$ctx};
+			die unless exists $queue{$vkey};
+			die unless exists $submit{$vkey};
+
+			# Create separate request record on the queue timeline
+			$q = $queue{$vkey};
+			$s = $submit{$vkey};
+			$req{'queue'} = $q;
+			$req{'submit'} = $s;
+			$req{'start'} = $time;
+			$req{'end'} = $time;
+			$req{'ring'} = VENG;
+			$req{'seqno'} = $seqno;
+			$req{'ctx'} = $ctx;
+			$req{'name'} = $ctx . '/' . $seqno;
+			$req{'global'} = $tp{'global'};
+			$req{'port'} = $tp{'port'};
+
+			$vdb{$vkey} = \%req;
+		} else {
+			$q = $queue{$key};
+			$s = $submit{$key};
+		}
 
 		$req{'start'} = $time;
 		$req{'ring'} = $ring;
@@ -419,8 +476,9 @@ while (<>) {
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
-		$req{'queue'} = $queue{$key};
-		$req{'submit'} = $submit{$key};
+		$req{'queue'} = $q;
+		$req{'submit'} = $s;
+		$req{'virtual'} = 1 if exists $queues{$ctx};
 		$rings{$ring} = $gid++ unless exists $rings{$ring};
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
@@ -720,8 +778,10 @@ foreach my $key (@sorted_keys) {
 
 	$running{$ring} += $end - $start if $correct_durations or
 					    not exists $db{$key}->{'no-end'};
-	$runnable{$ring} += $db{$key}->{'execute-delay'};
-	$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	unless (exists $db{$key}->{'virtual'}) {
+		$runnable{$ring} += $db{$key}->{'execute-delay'};
+		$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	}
 
 	$batch_count{$ring}++;
 
@@ -840,6 +900,12 @@ foreach my $key (keys %reqwait) {
 	$reqw{$reqwait{$key}->{'ring'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
 }
 
+# Add up all request waits per virtual engine
+my %vreqw;
+foreach my $key (keys %reqwait) {
+	$vreqw{$reqwait{$key}->{'ctx'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
+}
+
 say sprintf('GPU: %.2f%% idle, %.2f%% busy',
 	     $flat_busy{'gpu-idle'}, $flat_busy{'gpu-busy'}) unless $html;
 
@@ -961,18 +1027,24 @@ ENDHTML
 sub html_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $name;
 
-	$name = 'Ring' . $group;
+	$name = $veng ? 'Virtual' : 'Ring';
+	$name .= $group;
 	$name .= '<br><small><br>';
-	$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
-	$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
+		$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	}
 	$name .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable<br>';
 	$name .= sprintf('%.2f', $stats->{'queued'}) . '% queued<br><br>';
 	$name .= sprintf('%.2f', $stats->{'wait'}) . '% wait<br><br>';
 	$name .= $stats->{'count'} . ' batches<br>';
-	$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
-	$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
+		$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	}
 	$name .= '</small>';
 
 	print "\t{id: $id, content: '$name'},\n";
@@ -981,17 +1053,24 @@ sub html_stats
 sub stdio_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $str;
 
-	$str = 'Ring' . $group . ': ';
+	$str = $veng ? 'Virtual' : 'Ring';
+	$str .= $group . ': ';
 	$str .= $stats->{'count'} . ' batches, ';
-	$str .= sprintf('%.2f (%.2f) avg batch us, ', $stats->{'avg'}, $stats->{'total-avg'});
-	$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
-	$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	unless ($veng) {
+		$str .= sprintf('%.2f (%.2f) avg batch us, ',
+				$stats->{'avg'}, $stats->{'total-avg'});
+		$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
+		$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	}
+
 	$str .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable, ';
 	$str .= sprintf('%.2f', $stats->{'queued'}) . '% queued, ';
 	$str .= sprintf('%.2f', $stats->{'wait'}) . '% wait';
-	if ($avg_delay_stats) {
+
+	if ($avg_delay_stats and not $veng) {
 		$str .= ', submit/execute/save-avg=(';
 		$str .= sprintf('%.2f/%.2f/%.2f)', $stats->{'submit'}, $stats->{'execute'}, $stats->{'save'});
 	}
@@ -1013,8 +1092,16 @@ foreach my $group (sort keys %rings) {
 
 	$stats{'idle'} = (1.0 - $flat_busy{$ring} / $elapsed) * 100.0;
 	$stats{'busy'} = $running{$ring} / $elapsed * 100.0;
-	$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
-	$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	if (exists $runnable{$ring}) {
+		$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $queued{$ring}) {
+		$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
 	$reqw{$ring} = 0 unless exists $reqw{$ring};
 	$stats{'wait'} = $reqw{$ring} / $elapsed * 100.0;
 	$stats{'count'} = $batch_count{$ring};
@@ -1031,6 +1118,59 @@ foreach my $group (sort keys %rings) {
 	}
 }
 
+sub sortVQueue {
+	my $as = $vdb{$a}->{'queue'};
+	my $bs = $vdb{$b}->{'queue'};
+	my $val;
+
+	$val = $as <=> $bs;
+	$val = $a cmp $b if $val == 0;
+
+	return $val;
+}
+
+my @sorted_vkeys = sort sortVQueue keys %vdb;
+my (%vqueued, %vrunnable);
+
+foreach my $key (@sorted_vkeys) {
+	my $ctx = $vdb{$key}->{'ctx'};
+
+	$vdb{$key}->{'submit-delay'} = $vdb{$key}->{'submit'} - $vdb{$key}->{'queue'};
+	$vdb{$key}->{'execute-delay'} = $vdb{$key}->{'start'} - $vdb{$key}->{'submit'};
+
+	$vqueued{$ctx} += $vdb{$key}->{'submit-delay'};
+	$vrunnable{$ctx} += $vdb{$key}->{'execute-delay'};
+}
+
+my $veng_id = $engine_start_id + scalar(keys %rings);
+
+foreach my $cid (sort keys %ctxmap) {
+	my $ctx = $ctxmap{$cid};
+	my $elapsed = $last_ts - $first_ts;
+	my %stats;
+
+	$stats{'virtual'} = 1;
+	if (exists $vrunnable{$ctx}) {
+		$stats{'runnable'} = $vrunnable{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $vqueued{$ctx}) {
+		$stats{'queued'} = $vqueued{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
+	$vreqw{$ctx} = 0 unless exists $vreqw{$ctx};
+	$stats{'wait'} = $vreqw{$ctx} / $elapsed * 100.0;
+	$stats{'count'} = scalar(grep {$ctx == $vdb{$_}->{'ctx'}} keys %vdb);
+
+	if ($html) {
+		html_stats(\%stats, $cid, $veng_id++);
+	} else {
+		stdio_stats(\%stats, $cid, $veng_id++);
+	}
+}
+
 exit 0 unless $html;
 
 print <<ENDHTML;
@@ -1134,6 +1274,7 @@ sub box_style
 }
 
 my $i = 0;
+my $req = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
@@ -1147,7 +1288,7 @@ foreach my $key (sort sortQueue keys %db) {
 	my $skey;
 
 	# submit to execute
-	unless (exists $skip_box{'queue'}) {
+	unless (exists $skip_box{'queue'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
 		$style = box_style($ctx, 'queue');
 		$content = "$name<br>$db{$key}->{'submit-delay'}us <small>($db{$key}->{'execute-delay'}us)</small>";
@@ -1158,7 +1299,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	# execute to start
 	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
-	unless (exists $skip_box{'ready'}) {
+	unless (exists $skip_box{'ready'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 'ready');
 		$content = "<small>$name<br>$db{$key}->{'execute-delay'}us</small>";
@@ -1199,7 +1340,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	$last_ts = $end;
 
-	last if $i > $max_items;
+	last if ++$req > $max_requests;
 }
 
 push @freqs, [$prev_freq_ts, $last_ts, $prev_freq] if $prev_freq;
@@ -1232,6 +1373,43 @@ if ($gpu_timeline) {
 	}
 }
 
+$req = 0;
+$veng_id = $engine_start_id + scalar(keys %rings);
+foreach my $key (@sorted_vkeys) {
+	my ($name, $ctx, $seqno) = ($vdb{$key}->{'name'}, $vdb{$key}->{'ctx'}, $vdb{$key}->{'seqno'});
+	my $queue = $vdb{$key}->{'queue'};
+	my $submit = $vdb{$key}->{'submit'};
+	my $engine_start = $db{$key}->{'engine-start'};
+	my ($content, $style, $startend, $skey);
+	my $group = $veng_id + $cids{$ctx};
+	my $subgroup = $ctx - $min_ctx;
+	my $type = ' type: \'range\',';
+	my $duration;
+
+	# submit to execute
+	unless (exists $skip_box{'queue'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
+		$style = box_style($ctx, 'queue');
+		$content = "$name<br>$vdb{$key}->{'submit-delay'}us <small>($vdb{$key}->{'execute-delay'}us)</small>";
+		$startend = 'start: ' . $queue . ', end: ' . $submit;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	# execute to start
+	$engine_start = $vdb{$key}->{'start'} unless defined $engine_start;
+	unless (exists $skip_box{'ready'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
+		$style = box_style($ctx, 'ready');
+		$content = "<small>$name<br>$vdb{$key}->{'execute-delay'}us</small>";
+		$startend = 'start: ' . $submit . ', end: ' . $engine_start;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	last if ++$req > $max_requests;
+}
+
 my $end_ts = $first_ts + $width_us;
 $first_ts = $first_ts;
 
-- 
2.19.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [PATCH i-g-t 04/21] trace.pl: Virtual engine preemption support
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Use the 'completed?' tracepoint field to detect more robustly when a
request has been preempted and remove it from the engine database if so.

Otherwise the script can hit a scenario where the same global seqno will
be mentioned multiple times (on an engine seqno) which aborts processing.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
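As an illustration only (not part of the patch), the intended handling can
be sketched as a small state machine. This is a C sketch under assumptions,
not the Perl in trace.pl: struct req and the two handlers are hypothetical
names, and a single record stands in for one entry of the engine database.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one request entry in the engine database. */
struct req {
	bool tracked;	/* entry currently exists in the database */
	bool has_end;	/* an end timestamp has been recorded */
	uint64_t start;
	uint64_t end;
};

/* request_in: the request starts (or restarts) on a physical engine. */
static void request_in(struct req *r, uint64_t time)
{
	r->tracked = true;
	r->has_end = false;
	r->start = time;
	r->end = 0;
}

/* request_out: 'completed' mirrors the 'completed?' tracepoint field. */
static void request_out(struct req *r, uint64_t time, bool completed)
{
	if (!r->tracked)
		return;

	if (completed) {
		r->end = time;
		r->has_end = true;
	} else {
		/* Preempted: drop the entry so a later request_in is clean. */
		r->tracked = false;
	}
}
```

With this shape a preempted request leaves no stale entry behind, so seeing
the same request enter an engine a second time is not treated as an error.
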
 scripts/trace.pl | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 6cc332bb6e2a..cb7cc46df22e 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -483,17 +483,17 @@ while (<>) {
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
 	} elsif ($tp_name eq 'i915:i915_request_out:') {
-		my $gkey;
-
 		die unless exists $ctxengines{$ctx};
 
-		$gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
-
 		if ($tp{'completed?'}) {
+			my $gkey;
+
 			die unless exists $db{$key};
 			die unless exists $db{$key}->{'start'};
 			die if exists $db{$key}->{'end'};
 
+			$gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
+
 			$db{$key}->{'end'} = $time;
 			$db{$key}->{'notify'} = $notify{$gkey}
 						if exists $notify{$gkey};
-- 
2.19.1


* [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Support the i915 virtual engine in gem_wsim (-b i915) and media-bench.pl.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
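As an illustration only (not part of the patch), the context list layout
used in the diff can be sketched as follows. Each workload context owns two
slots: the even slot holds the primary context, and the odd slot holds the
extra load-balanced context that is created only when the workload both
targets specific engine instances and wants balancing. The enum and
pick_ctx_id() are simplified, hypothetical stand-ins for the gem_wsim types
and for get_ctxid().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the gem_wsim engine ids. */
enum engine { RCS, BCS, VCS, VCS1, VCS2 };

/* Reduced copy of struct ctx, keeping only the fields used here. */
struct ctx {
	uint32_t id;
	bool targets_instance;
	bool wants_balance;
};

/*
 * Pick the context id for a step. Generic VCS steps go to the odd
 * (balanced) slot when the workload context mixes both submission
 * styles, everything else goes to the even (primary) slot.
 */
static uint32_t
pick_ctx_id(const struct ctx *list, unsigned int context, enum engine engine)
{
	const struct ctx *primary = &list[context * 2];

	if (primary->targets_instance && primary->wants_balance &&
	    engine == VCS)
		return list[context * 2 + 1].id;

	return primary->id;
}
```

A context that only ever submits to generic VCS needs no second slot: its
primary context is the balanced one.
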
 benchmarks/gem_wsim.c  | 281 ++++++++++++++++++++++++++++++++++-------
 scripts/media-bench.pl |   9 +-
 2 files changed, 244 insertions(+), 46 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index afb9644dd7f0..1084e95fa8df 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -142,6 +142,14 @@ struct w_step
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
 
+struct ctx {
+	uint32_t id;
+	int priority;
+	bool targets_instance;
+	bool wants_balance;
+	unsigned int static_vcs;
+};
+
 struct workload
 {
 	unsigned int id;
@@ -163,11 +171,7 @@ struct workload
 	struct timespec repeat_start;
 
 	unsigned int nr_ctxs;
-	struct {
-		uint32_t id;
-		int priority;
-		unsigned int static_vcs;
-	} *ctx_list;
+	struct ctx *ctx_list;
 
 	int sync_timeline;
 	uint32_t sync_seqno;
@@ -224,6 +228,7 @@ static int fd;
 #define HEARTBEAT	(1<<7)
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
+#define I915		(1<<10)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -841,7 +846,11 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
 	if (engine == VCS2 && (flags & VCS2REMAP))
 		engine = BCS;
 
-	eb->flags = eb_engine_map[engine];
+	if ((flags & I915) && engine == VCS) {
+		eb->flags = 0;
+	} else {
+		eb->flags = eb_engine_map[engine];
+	}
 }
 
 static void
@@ -867,6 +876,23 @@ get_status_objects(struct workload *wrk)
 		return wrk->status_object;
 }
 
+static struct ctx *
+__get_ctx(struct workload *wrk, struct w_step *w)
+{
+	return &wrk->ctx_list[w->context * 2];
+}
+
+static uint32_t
+get_ctxid(struct workload *wrk, struct w_step *w)
+{
+	struct ctx *ctx = __get_ctx(wrk, w);
+
+	if (ctx->targets_instance && ctx->wants_balance && w->engine == VCS)
+		return wrk->ctx_list[w->context * 2 + 1].id;
+	else
+		return wrk->ctx_list[w->context * 2].id;
+}
+
 static void
 alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 {
@@ -919,7 +945,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
 	w->eb.buffer_count = j + 1;
-	w->eb.rsvd1 = wrk->ctx_list[w->context].id;
+	w->eb.rsvd1 = get_ctxid(wrk, w);
 
 	if (flags & SWAPVCS && engine == VCS1)
 		engine = VCS2;
@@ -932,17 +958,29 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		printf("%x|", w->obj[i].handle);
 	printf(" %10lu flags=%llx bb=%x[%u] ctx[%u]=%u\n",
 		w->bb_sz, w->eb.flags, w->bb_handle, j, w->context,
-		wrk->ctx_list[w->context].id);
+		get_ctxid(wrk, w));
 #endif
 }
 
+static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
+{
+	struct drm_i915_gem_context_param param = {
+		.ctx_id = ctx_id,
+		.param = I915_CONTEXT_PARAM_PRIORITY,
+		.value = prio,
+	};
+
+	if (prio)
+		gem_context_set_param(fd, &param);
+}
+
 static void
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
 	unsigned int ctx_vcs = 0;
 	int max_ctx = -1;
 	struct w_step *w;
-	int i;
+	int i, j;
 
 	wrk->id = id;
 	wrk->prng = rand();
@@ -973,44 +1011,183 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Pre-scan workload steps to allocate context list storage.
+	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
-		if ((int)w->context > max_ctx) {
-			int delta = w->context + 1 - wrk->nr_ctxs;
+		int ctx = w->context * 2 + 1; /* Odd slots are special. */
+		int delta;
+
+		if (ctx <= max_ctx)
+			continue;
+
+		delta = ctx + 1 - wrk->nr_ctxs;
 
-			wrk->nr_ctxs += delta;
-			wrk->ctx_list = realloc(wrk->ctx_list,
-						wrk->nr_ctxs *
-						sizeof(*wrk->ctx_list));
-			memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
-			       delta * sizeof(*wrk->ctx_list));
+		wrk->nr_ctxs += delta;
+		wrk->ctx_list = realloc(wrk->ctx_list,
+					wrk->nr_ctxs * sizeof(*wrk->ctx_list));
+		memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
+			delta * sizeof(*wrk->ctx_list));
+
+		max_ctx = ctx;
+	}
+
+	/*
+	 * Identify if contexts target specific engine instances and if they
+	 * want to be balanced.
+	 */
+	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		bool targets = false;
+		bool balance = false;
 
-			max_ctx = w->context;
+		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+			if (w->type != BATCH)
+				continue;
+
+			if (w->context != (j / 2))
+				continue;
+
+			if (w->engine == VCS)
+				balance = true;
+			else
+				targets = true;
 		}
 
-		if (!wrk->ctx_list[w->context].id) {
-			struct drm_i915_gem_context_create arg = {};
+		if (flags & I915) {
+			wrk->ctx_list[j].targets_instance = targets;
+			wrk->ctx_list[j].wants_balance = balance;
+		}
+	}
 
-			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &arg);
-			igt_assert(arg.ctx_id);
+	/*
+	 * Create and configure contexts.
+	 */
+	for (i = 0; i < wrk->nr_ctxs; i += 2) {
+		struct ctx *ctx = &wrk->ctx_list[i];
+		uint32_t ctx_id, share_vm = 0;
 
-			wrk->ctx_list[w->context].id = arg.ctx_id;
+		if (ctx->id)
+			continue;
 
-			if (flags & GLOBAL_BALANCE) {
-				wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
-				context_vcs_rr ^= 1;
-			} else {
-				wrk->ctx_list[w->context].static_vcs = ctx_vcs;
-				ctx_vcs ^= 1;
-			}
+		if (flags & I915) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+			};
+			struct drm_i915_gem_context_create_ext args = { };
 
-			if (wrk->prio) {
+			/* Find existing context to share ppgtt with. */
+			for (j = 0; j < wrk->nr_ctxs; j++) {
 				struct drm_i915_gem_context_param param = {
-					.ctx_id = arg.ctx_id,
-					.param = I915_CONTEXT_PARAM_PRIORITY,
-					.value = wrk->prio,
+					.param = I915_CONTEXT_PARAM_VM,
 				};
-				gem_context_set_param(fd, &param);
+
+				if (!wrk->ctx_list[j].id)
+					continue;
+
+				param.ctx_id = wrk->ctx_list[j].id;
+
+				gem_context_get_param(fd, &param);
+				igt_assert(param.value);
+
+				share_vm = param.value;
+
+				ext.param.value = share_vm;
+				args.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
+				args.extensions = to_user_pointer(&ext);
+				break;
 			}
+
+			if (!ctx->targets_instance)
+				args.flags |=
+				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			ctx_id = args.ctx_id;
+		} else {
+			struct drm_i915_gem_context_create args = {};
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
+			ctx_id = args.ctx_id;
+		}
+
+		igt_assert(ctx_id);
+		ctx->id = ctx_id;
+
+		if (flags & GLOBAL_BALANCE) {
+			ctx->static_vcs = context_vcs_rr;
+			context_vcs_rr ^= 1;
+		} else {
+			ctx->static_vcs = ctx_vcs;
+			ctx_vcs ^= 1;
+		}
+
+		__ctx_set_prio(ctx_id, wrk->prio);
+
+		/*
+		 * Do we need a separate context to satisfy workloads which
+		 * both want to target specific engines and be balanced by i915?
+		 */
+		if ((flags & I915) && ctx->wants_balance &&
+		    ctx->targets_instance) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+				.param.value = share_vm,
+			};
+			struct drm_i915_gem_context_create_ext args = {
+				.extensions = to_user_pointer(&ext),
+				.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
+				    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
+			};
+
+			igt_assert(share_vm);
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			igt_assert(args.ctx_id);
+			ctx_id = args.ctx_id;
+			wrk->ctx_list[i + 1].id = args.ctx_id;
+
+			__ctx_set_prio(ctx_id, wrk->prio);
+		}
+
+		if (ctx->wants_balance) {
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
+				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
+				.num_siblings = 2,
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
+				.extensions = to_user_pointer(&load_balance),
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_INVALID,
+					  .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+
+			struct drm_i915_gem_context_param param = {
+				.ctx_id = ctx_id,
+				.param = I915_CONTEXT_PARAM_ENGINES,
+				.size = sizeof(set_engines),
+				.value = to_user_pointer(&set_engines),
+			};
+
+			gem_context_set_param(fd, &param);
 		}
 	}
 
@@ -1027,7 +1204,6 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
 		struct w_step *w2;
-		int j;
 
 		if (w->type != PREEMPTION)
 			continue;
@@ -1385,7 +1561,7 @@ static enum intel_engine_id
 context_balance(const struct workload_balancer *balancer,
 		struct workload *wrk, struct w_step *w)
 {
-	return get_vcs_engine(wrk->ctx_list[w->context].static_vcs);
+	return get_vcs_engine(__get_ctx(wrk, w)->static_vcs);
 }
 
 static unsigned int
@@ -1579,6 +1755,12 @@ static const struct workload_balancer all_balancers[] = {
 		.get_qd = get_engine_busy,
 		.balance = busy_avg_balance,
 	},
+	{
+		.id = 11,
+		.name = "i915",
+		.desc = "i915 balancing.",
+		.flags = I915,
+	},
 };
 
 static unsigned int
@@ -1957,7 +2139,8 @@ static void *run_workload(void *data)
 			last_sync = false;
 
 			wrk->nr_bb[engine]++;
-			if (engine == VCS && wrk->balancer) {
+			if (engine == VCS && wrk->balancer &&
+			    wrk->balancer->balance) {
 				engine = wrk->balancer->balance(wrk->balancer,
 								wrk, w);
 				wrk->nr_bb[engine]++;
@@ -2384,6 +2567,12 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
+	if ((flags & VCS2REMAP) && (flags & I915)) {
+		if (verbose)
+			fprintf(stderr, "VCS remapping not supported with i915 balancing!\n");
+		return 1;
+	}
+
 	if (!nop_calibration) {
 		if (verbose > 1)
 			printf("Calibrating nop delay with %u%% tolerance...\n",
@@ -2469,11 +2658,17 @@ int main(int argc, char **argv)
 		printf("%u client%s.\n", clients, clients > 1 ? "s" : "");
 		if (flags & SWAPVCS)
 			printf("Swapping VCS rings between clients.\n");
-		if (flags & GLOBAL_BALANCE)
-			printf("Using %s balancer in global mode.\n",
-			       balancer->name);
-		else if (balancer)
+		if (flags & GLOBAL_BALANCE) {
+			if (flags & I915) {
+				printf("Ignoring global balancing with i915!\n");
+				flags &= ~GLOBAL_BALANCE;
+			} else {
+				printf("Using %s balancer in global mode.\n",
+				       balancer->name);
+			}
+		} else if (balancer) {
 			printf("Using %s balancer.\n", balancer->name);
+		}
 	}
 
 	if (master_workload >= 0 && clients == 1)
@@ -2490,7 +2685,7 @@ int main(int argc, char **argv)
 		if (flags & SWAPVCS && i & 1)
 			flags_ &= ~SWAPVCS;
 
-		if (flags & GLOBAL_BALANCE) {
+		if ((flags & GLOBAL_BALANCE) && !(flags & I915)) {
 			w[i]->balancer = &global_balancer;
 			w[i]->global_wrk = w[0];
 			w[i]->global_balancer = balancer;
diff --git a/scripts/media-bench.pl b/scripts/media-bench.pl
index 066b542f95df..ddf9c0ec05c8 100755
--- a/scripts/media-bench.pl
+++ b/scripts/media-bench.pl
@@ -49,10 +49,11 @@ my $nop;
 my %opts;
 
 my @balancers = ( 'rr', 'rand', 'qd', 'qdr', 'qdavg', 'rt', 'rtr', 'rtavg',
-		  'context', 'busy', 'busy-avg' );
+		  'context', 'busy', 'busy-avg', 'i915' );
 my %bal_skip_H = ( 'rr' => 1, 'rand' => 1, 'context' => 1, , 'busy' => 1,
-		   'busy-avg' => 1 );
-my %bal_skip_R = ( 'context' => 1 );
+		   'busy-avg' => 1, 'i915' => 1 );
+my %bal_skip_R = ( 'context' => 1, 'i915' => 1 );
+my %bal_skip_G = ( 'i915' => 1 );
 
 my @workloads = (
 	'media_load_balance_17i7.wsim',
@@ -498,6 +499,8 @@ foreach my $wrk (@saturation_workloads) {
 				my $bid;
 
 				if ($bal ne '') {
+					next GBAL if $G =~ '-G' and exists $bal_skip_G{$bal};
+
 					push @xargs, "-b $bal";
 					push @xargs, '-R' unless exists $bal_skip_R{$bal};
 					push @xargs, $G if $G ne '';
-- 
2.19.1


* [Intel-gfx] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add an "i915" balancer (-b i915) to gem_wsim and media-bench.pl which
delegates VCS load balancing to the i915 virtual engine.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 281 ++++++++++++++++++++++++++++++++++-------
 scripts/media-bench.pl |   9 +-
 2 files changed, 244 insertions(+), 46 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index afb9644dd7f0..1084e95fa8df 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -142,6 +142,14 @@ struct w_step
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
 
+struct ctx {
+	uint32_t id;
+	int priority;
+	bool targets_instance;
+	bool wants_balance;
+	unsigned int static_vcs;
+};
+
 struct workload
 {
 	unsigned int id;
@@ -163,11 +171,7 @@ struct workload
 	struct timespec repeat_start;
 
 	unsigned int nr_ctxs;
-	struct {
-		uint32_t id;
-		int priority;
-		unsigned int static_vcs;
-	} *ctx_list;
+	struct ctx *ctx_list;
 
 	int sync_timeline;
 	uint32_t sync_seqno;
@@ -224,6 +228,7 @@ static int fd;
 #define HEARTBEAT	(1<<7)
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
+#define I915		(1<<10)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -841,7 +846,11 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
 	if (engine == VCS2 && (flags & VCS2REMAP))
 		engine = BCS;
 
-	eb->flags = eb_engine_map[engine];
+	if ((flags & I915) && engine == VCS) {
+		eb->flags = 0;
+	} else {
+		eb->flags = eb_engine_map[engine];
+	}
 }
 
 static void
@@ -867,6 +876,23 @@ get_status_objects(struct workload *wrk)
 		return wrk->status_object;
 }
 
+static struct ctx *
+__get_ctx(struct workload *wrk, struct w_step *w)
+{
+	return &wrk->ctx_list[w->context * 2];
+}
+
+static uint32_t
+get_ctxid(struct workload *wrk, struct w_step *w)
+{
+	struct ctx *ctx = __get_ctx(wrk, w);
+
+	if (ctx->targets_instance && ctx->wants_balance && w->engine == VCS)
+		return wrk->ctx_list[w->context * 2 + 1].id;
+	else
+		return wrk->ctx_list[w->context * 2].id;
+}
+
 static void
 alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 {
@@ -919,7 +945,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
 	w->eb.buffer_count = j + 1;
-	w->eb.rsvd1 = wrk->ctx_list[w->context].id;
+	w->eb.rsvd1 = get_ctxid(wrk, w);
 
 	if (flags & SWAPVCS && engine == VCS1)
 		engine = VCS2;
@@ -932,17 +958,29 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		printf("%x|", w->obj[i].handle);
 	printf(" %10lu flags=%llx bb=%x[%u] ctx[%u]=%u\n",
 		w->bb_sz, w->eb.flags, w->bb_handle, j, w->context,
-		wrk->ctx_list[w->context].id);
+		get_ctxid(wrk, w));
 #endif
 }
 
+static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
+{
+	struct drm_i915_gem_context_param param = {
+		.ctx_id = ctx_id,
+		.param = I915_CONTEXT_PARAM_PRIORITY,
+		.value = prio,
+	};
+
+	if (prio)
+		gem_context_set_param(fd, &param);
+}
+
 static void
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
 	unsigned int ctx_vcs = 0;
 	int max_ctx = -1;
 	struct w_step *w;
-	int i;
+	int i, j;
 
 	wrk->id = id;
 	wrk->prng = rand();
@@ -973,44 +1011,183 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Pre-scan workload steps to allocate context list storage.
+	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
-		if ((int)w->context > max_ctx) {
-			int delta = w->context + 1 - wrk->nr_ctxs;
+		int ctx = w->context * 2 + 1; /* Odd slots are special. */
+		int delta;
+
+		if (ctx <= max_ctx)
+			continue;
+
+		delta = ctx + 1 - wrk->nr_ctxs;
 
-			wrk->nr_ctxs += delta;
-			wrk->ctx_list = realloc(wrk->ctx_list,
-						wrk->nr_ctxs *
-						sizeof(*wrk->ctx_list));
-			memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
-			       delta * sizeof(*wrk->ctx_list));
+		wrk->nr_ctxs += delta;
+		wrk->ctx_list = realloc(wrk->ctx_list,
+					wrk->nr_ctxs * sizeof(*wrk->ctx_list));
+		memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
+			delta * sizeof(*wrk->ctx_list));
+
+		max_ctx = ctx;
+	}
+
+	/*
+	 * Identify if contexts target specific engine instances and if they
+	 * want to be balanced.
+	 */
+	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		bool targets = false;
+		bool balance = false;
 
-			max_ctx = w->context;
+		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+			if (w->type != BATCH)
+				continue;
+
+			if (w->context != (j / 2))
+				continue;
+
+			if (w->engine == VCS)
+				balance = true;
+			else
+				targets = true;
 		}
 
-		if (!wrk->ctx_list[w->context].id) {
-			struct drm_i915_gem_context_create arg = {};
+		if (flags & I915) {
+			wrk->ctx_list[j].targets_instance = targets;
+			wrk->ctx_list[j].wants_balance = balance;
+		}
+	}
 
-			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &arg);
-			igt_assert(arg.ctx_id);
+	/*
+	 * Create and configure contexts.
+	 */
+	for (i = 0; i < wrk->nr_ctxs; i += 2) {
+		struct ctx *ctx = &wrk->ctx_list[i];
+		uint32_t ctx_id, share_vm = 0;
 
-			wrk->ctx_list[w->context].id = arg.ctx_id;
+		if (ctx->id)
+			continue;
 
-			if (flags & GLOBAL_BALANCE) {
-				wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
-				context_vcs_rr ^= 1;
-			} else {
-				wrk->ctx_list[w->context].static_vcs = ctx_vcs;
-				ctx_vcs ^= 1;
-			}
+		if (flags & I915) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+			};
+			struct drm_i915_gem_context_create_ext args = { };
 
-			if (wrk->prio) {
+			/* Find existing context to share ppgtt with. */
+			for (j = 0; j < wrk->nr_ctxs; j++) {
 				struct drm_i915_gem_context_param param = {
-					.ctx_id = arg.ctx_id,
-					.param = I915_CONTEXT_PARAM_PRIORITY,
-					.value = wrk->prio,
+					.param = I915_CONTEXT_PARAM_VM,
 				};
-				gem_context_set_param(fd, &param);
+
+				if (!wrk->ctx_list[j].id)
+					continue;
+
+				param.ctx_id = wrk->ctx_list[j].id;
+
+				gem_context_get_param(fd, &param);
+				igt_assert(param.value);
+
+				share_vm = param.value;
+
+				ext.param.value = share_vm;
+				args.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
+				args.extensions = to_user_pointer(&ext);
+				break;
 			}
+
+			if (!ctx->targets_instance)
+				args.flags |=
+				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			ctx_id = args.ctx_id;
+		} else {
+			struct drm_i915_gem_context_create args = {};
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
+			ctx_id = args.ctx_id;
+		}
+
+		igt_assert(ctx_id);
+		ctx->id = ctx_id;
+
+		if (flags & GLOBAL_BALANCE) {
+			ctx->static_vcs = context_vcs_rr;
+			context_vcs_rr ^= 1;
+		} else {
+			ctx->static_vcs = ctx_vcs;
+			ctx_vcs ^= 1;
+		}
+
+		__ctx_set_prio(ctx_id, wrk->prio);
+
+		/*
+		 * Do we need a separate context to satisfy workloads which
+		 * both want to target specific engines and be balanced by i915?
+		 */
+		if ((flags & I915) && ctx->wants_balance &&
+		    ctx->targets_instance) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+				.param.value = share_vm,
+			};
+			struct drm_i915_gem_context_create_ext args = {
+				.extensions = to_user_pointer(&ext),
+				.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
+				    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
+			};
+
+			igt_assert(share_vm);
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			igt_assert(args.ctx_id);
+			ctx_id = args.ctx_id;
+			wrk->ctx_list[i + 1].id = args.ctx_id;
+
+			__ctx_set_prio(ctx_id, wrk->prio);
+		}
+
+		if (ctx->wants_balance) {
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
+				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
+				.num_siblings = 2,
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
+				.extensions = to_user_pointer(&load_balance),
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_INVALID,
+					  .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+
+			struct drm_i915_gem_context_param param = {
+				.ctx_id = ctx_id,
+				.param = I915_CONTEXT_PARAM_ENGINES,
+				.size = sizeof(set_engines),
+				.value = to_user_pointer(&set_engines),
+			};
+
+			gem_context_set_param(fd, &param);
 		}
 	}
 
@@ -1027,7 +1204,6 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
 		struct w_step *w2;
-		int j;
 
 		if (w->type != PREEMPTION)
 			continue;
@@ -1385,7 +1561,7 @@ static enum intel_engine_id
 context_balance(const struct workload_balancer *balancer,
 		struct workload *wrk, struct w_step *w)
 {
-	return get_vcs_engine(wrk->ctx_list[w->context].static_vcs);
+	return get_vcs_engine(__get_ctx(wrk, w)->static_vcs);
 }
 
 static unsigned int
@@ -1579,6 +1755,12 @@ static const struct workload_balancer all_balancers[] = {
 		.get_qd = get_engine_busy,
 		.balance = busy_avg_balance,
 	},
+	{
+		.id = 11,
+		.name = "i915",
+		.desc = "i915 balancing.",
+		.flags = I915,
+	},
 };
 
 static unsigned int
@@ -1957,7 +2139,8 @@ static void *run_workload(void *data)
 			last_sync = false;
 
 			wrk->nr_bb[engine]++;
-			if (engine == VCS && wrk->balancer) {
+			if (engine == VCS && wrk->balancer &&
+			    wrk->balancer->balance) {
 				engine = wrk->balancer->balance(wrk->balancer,
 								wrk, w);
 				wrk->nr_bb[engine]++;
@@ -2384,6 +2567,12 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
+	if ((flags & VCS2REMAP) && (flags & I915)) {
+		if (verbose)
+			fprintf(stderr, "VCS remapping not supported with i915 balancing!\n");
+		return 1;
+	}
+
 	if (!nop_calibration) {
 		if (verbose > 1)
 			printf("Calibrating nop delay with %u%% tolerance...\n",
@@ -2469,11 +2658,17 @@ int main(int argc, char **argv)
 		printf("%u client%s.\n", clients, clients > 1 ? "s" : "");
 		if (flags & SWAPVCS)
 			printf("Swapping VCS rings between clients.\n");
-		if (flags & GLOBAL_BALANCE)
-			printf("Using %s balancer in global mode.\n",
-			       balancer->name);
-		else if (balancer)
+		if (flags & GLOBAL_BALANCE) {
+			if (flags & I915) {
+				printf("Ignoring global balancing with i915!\n");
+				flags &= ~GLOBAL_BALANCE;
+			} else {
+				printf("Using %s balancer in global mode.\n",
+				       balancer->name);
+			}
+		} else if (balancer) {
 			printf("Using %s balancer.\n", balancer->name);
+		}
 	}
 
 	if (master_workload >= 0 && clients == 1)
@@ -2490,7 +2685,7 @@ int main(int argc, char **argv)
 		if (flags & SWAPVCS && i & 1)
 			flags_ &= ~SWAPVCS;
 
-		if (flags & GLOBAL_BALANCE) {
+		if ((flags & GLOBAL_BALANCE) && !(flags & I915)) {
 			w[i]->balancer = &global_balancer;
 			w[i]->global_wrk = w[0];
 			w[i]->global_balancer = balancer;
diff --git a/scripts/media-bench.pl b/scripts/media-bench.pl
index 066b542f95df..ddf9c0ec05c8 100755
--- a/scripts/media-bench.pl
+++ b/scripts/media-bench.pl
@@ -49,10 +49,11 @@ my $nop;
 my %opts;
 
 my @balancers = ( 'rr', 'rand', 'qd', 'qdr', 'qdavg', 'rt', 'rtr', 'rtavg',
-		  'context', 'busy', 'busy-avg' );
+		  'context', 'busy', 'busy-avg', 'i915' );
 my %bal_skip_H = ( 'rr' => 1, 'rand' => 1, 'context' => 1, , 'busy' => 1,
-		   'busy-avg' => 1 );
-my %bal_skip_R = ( 'context' => 1 );
+		   'busy-avg' => 1, 'i915' => 1 );
+my %bal_skip_R = ( 'context' => 1, 'i915' => 1 );
+my %bal_skip_G = ( 'i915' => 1 );
 
 my @workloads = (
 	'media_load_balance_17i7.wsim',
@@ -498,6 +499,8 @@ foreach my $wrk (@saturation_workloads) {
 				my $bid;
 
 				if ($bal ne '') {
+					next GBAL if $G =~ '-G' and exists $bal_skip_G{$bal};
+
 					push @xargs, "-b $bal";
 					push @xargs, '-R' unless exists $bal_skip_R{$bal};
 					push @xargs, $G if $G ne '';
-- 
2.19.1


* [PATCH i-g-t 06/21] gem_wsim: Use IGT uapi headers
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We are moving towards bumping the uAPI headers more often instead of using
too many local struct/ioctl/param definitions, since the latter are harder
to rebase and maintain.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 1084e95fa8df..609e64f3d9c8 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -41,7 +41,6 @@
 #include <limits.h>
 #include <pthread.h>
 
-
 #include "intel_chipset.h"
 #include "intel_reg.h"
 #include "drm.h"
@@ -57,9 +56,6 @@
 
 #include "ewma.h"
 
-#define LOCAL_I915_EXEC_FENCE_IN              (1<<16)
-#define LOCAL_I915_EXEC_FENCE_OUT             (1<<17)
-
 enum intel_engine_id {
 	RCS,
 	BCS,
@@ -864,7 +860,7 @@ eb_update_flags(struct w_step *w, enum intel_engine_id engine,
 
 	igt_assert(w->emit_fence <= 0);
 	if (w->emit_fence)
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_OUT;
+		w->eb.flags |= I915_EXEC_FENCE_OUT;
 }
 
 static struct drm_i915_gem_exec_object2 *
@@ -1993,16 +1989,16 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 		igt_assert(tgt >= 0 && tgt < w->idx);
 		igt_assert(wrk->steps[tgt].emit_fence > 0);
 
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_IN;
+		w->eb.flags |= I915_EXEC_FENCE_IN;
 		w->eb.rsvd2 = wrk->steps[tgt].emit_fence;
 	}
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT)
+	if (w->eb.flags & I915_EXEC_FENCE_OUT)
 		gem_execbuf_wr(fd, &w->eb);
 	else
 		gem_execbuf(fd, &w->eb);
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT) {
+	if (w->eb.flags & I915_EXEC_FENCE_OUT) {
 		w->emit_fence = w->eb.rsvd2 >> 32;
 		igt_assert(w->emit_fence > 0);
 	}
-- 
2.19.1


* [igt-dev] [PATCH i-g-t 06/21] gem_wsim: Use IGT uapi headers
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We are moving towards bumping the uAPI headers more often instead of using
too many local struct/ioctl/param definitions, since the latter are harder
to rebase and maintain.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 1084e95fa8df..609e64f3d9c8 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -41,7 +41,6 @@
 #include <limits.h>
 #include <pthread.h>
 
-
 #include "intel_chipset.h"
 #include "intel_reg.h"
 #include "drm.h"
@@ -57,9 +56,6 @@
 
 #include "ewma.h"
 
-#define LOCAL_I915_EXEC_FENCE_IN              (1<<16)
-#define LOCAL_I915_EXEC_FENCE_OUT             (1<<17)
-
 enum intel_engine_id {
 	RCS,
 	BCS,
@@ -864,7 +860,7 @@ eb_update_flags(struct w_step *w, enum intel_engine_id engine,
 
 	igt_assert(w->emit_fence <= 0);
 	if (w->emit_fence)
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_OUT;
+		w->eb.flags |= I915_EXEC_FENCE_OUT;
 }
 
 static struct drm_i915_gem_exec_object2 *
@@ -1993,16 +1989,16 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 		igt_assert(tgt >= 0 && tgt < w->idx);
 		igt_assert(wrk->steps[tgt].emit_fence > 0);
 
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_IN;
+		w->eb.flags |= I915_EXEC_FENCE_IN;
 		w->eb.rsvd2 = wrk->steps[tgt].emit_fence;
 	}
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT)
+	if (w->eb.flags & I915_EXEC_FENCE_OUT)
 		gem_execbuf_wr(fd, &w->eb);
 	else
 		gem_execbuf(fd, &w->eb);
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT) {
+	if (w->eb.flags & I915_EXEC_FENCE_OUT) {
 		w->emit_fence = w->eb.rsvd2 >> 32;
 		igt_assert(w->emit_fence > 0);
 	}
-- 
2.19.1


* [PATCH i-g-t 07/21] gem_wsim: Factor out common error handling
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

There is a repeated error handling pattern which can be moved to a macro
for better readability in the command parsing loop.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 244 +++++++++++++++---------------------------
 1 file changed, 88 insertions(+), 156 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 609e64f3d9c8..ef97311a6879 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -289,6 +289,27 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
 	return 0;
 }
 
+static void __attribute__((format(printf, 1, 2)))
+wsim_err(const char *fmt, ...)
+{
+	va_list ap;
+
+	if (!verbose)
+		return;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+}
+
+#define check_arg(cond, fmt, ...) \
+{ \
+	if (cond) { \
+		wsim_err(fmt, __VA_ARGS__); \
+		return NULL; \
+	} \
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -319,14 +340,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid delay at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp <= 0,
+						  "Invalid delay at step %u!\n",
+						  nr_steps);
 					step.type = DELAY;
 					step.delay = tmp;
 					goto add_step;
@@ -335,14 +351,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid period at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp <= 0,
+						  "Invalid period at step %u!\n",
+						  nr_steps);
 					step.type = PERIOD;
 					step.period = tmp;
 					goto add_step;
@@ -352,25 +363,17 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0 && nr == 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid context at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
-					if (nr == 0) {
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid priority format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
 						step.context = tmp;
-					} else if (nr == 1) {
+					else
 						step.priority = tmp;
-					} else {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid priority format at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
 
 					nr++;
 				}
@@ -381,15 +384,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp >= 0 ||
-					    ((int)nr_steps + tmp) < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid sync target at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp >= 0 ||
+						  ((int)nr_steps + tmp) < 0,
+						  "Invalid sync target at step %u!\n",
+						  nr_steps);
 					step.type = SYNC;
 					step.target = tmp;
 					goto add_step;
@@ -398,14 +396,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid throttle at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp < 0,
+						  "Invalid throttle at step %u!\n",
+						  nr_steps);
 					step.type = THROTTLE;
 					step.throttle = tmp;
 					goto add_step;
@@ -414,14 +407,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid qd throttle at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp < 0,
+						  "Invalid qd throttle at step %u!\n",
+						  nr_steps);
 					step.type = QD_THROTTLE;
 					step.throttle = tmp;
 					goto add_step;
@@ -430,14 +418,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp >= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid sw fence signal at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp >= 0,
+						  "Invalid sw fence signal at step %u!\n",
+						  nr_steps);
 					step.type = SW_FENCE_SIGNAL;
 					step.target = tmp;
 					goto add_step;
@@ -450,31 +433,20 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0 && nr == 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid context at step %u!\n",
-								nr_steps);
-						return NULL;
-					} else if (tmp < 0 && nr == 1) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid preemption period at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
-					if (nr == 0) {
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr == 1 && tmp < 0,
+						  "Invalid preemption period at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid preemption format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
 						step.context = tmp;
-					} else if (nr == 1) {
+					else
 						step.period = tmp;
-					} else {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid preemption format at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
 
 					nr++;
 				}
@@ -492,13 +464,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			}
 
 			tmp = atoi(field);
-			if (tmp < 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid ctx id at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmp < 0, "Invalid ctx id at step %u!\n",
+				  nr_steps);
 			step.context = tmp;
 
 			valid++;
@@ -519,13 +486,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			}
 
-			if (old_valid == valid) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid engine id at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(old_valid == valid,
+				  "Invalid engine id at step %u!\n", nr_steps);
 		}
 
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
@@ -535,25 +497,19 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			tmpl = strtol(field, &sep, 10);
-			if (tmpl <= 0 || tmpl == LONG_MIN || tmpl == LONG_MAX) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid duration at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
+				  tmpl == LONG_MAX,
+				  "Invalid duration at step %u!\n", nr_steps);
 			step.duration.min = tmpl;
 
 			if (sep && *sep == '-') {
 				tmpl = strtol(sep + 1, NULL, 10);
-				if (tmpl <= 0 || tmpl <= step.duration.min ||
-				    tmpl == LONG_MIN || tmpl == LONG_MAX) {
-					if (verbose)
-						fprintf(stderr,
-							"Invalid duration range at step %u!\n",
-							nr_steps);
-					return NULL;
-				}
+				check_arg(tmpl <= 0 ||
+					  tmpl <= step.duration.min ||
+					  tmpl == LONG_MIN ||
+					  tmpl == LONG_MAX,
+					  "Invalid duration range at step %u!\n",
+					  nr_steps);
 				step.duration.max = tmpl;
 			} else {
 				step.duration.max = step.duration.min;
@@ -566,13 +522,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
-			if (tmp < 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid dependency at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmp < 0,
+				  "Invalid dependency at step %u!\n", nr_steps);
 
 			valid++;
 		}
@@ -580,25 +531,16 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
 			fstart = NULL;
 
-			if (strlen(field) != 1 ||
-			    (field[0] != '0' && field[0] != '1')) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid wait boolean at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(strlen(field) != 1 ||
+				  (field[0] != '0' && field[0] != '1'),
+				  "Invalid wait boolean at step %u!\n",
+				  nr_steps);
 			step.sync = field[0] - '0';
 
 			valid++;
 		}
 
-		if (valid != 5) {
-			if (verbose)
-				fprintf(stderr, "Invalid record at step %u!\n",
-					nr_steps);
-			return NULL;
-		}
+		check_arg(valid != 5, "Invalid record at step %u!\n", nr_steps);
 
 		step.type = BATCH;
 
@@ -643,15 +585,10 @@ add_step:
 	for (i = 0; i < nr_steps; i++) {
 		for (j = 0; j < steps[i].fence_deps.nr; j++) {
 			tmp = steps[i].idx + steps[i].fence_deps.list[j];
-			if (tmp < 0 || tmp >= i ||
-			    (steps[tmp].type != BATCH &&
-			     steps[tmp].type != SW_FENCE)) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid dependency target %u!\n",
-						i);
-				return NULL;
-			}
+			check_arg(tmp < 0 || tmp >= i ||
+				  (steps[tmp].type != BATCH &&
+				   steps[tmp].type != SW_FENCE),
+				  "Invalid dependency target %u!\n", i);
 			steps[tmp].emit_fence = -1;
 		}
 	}
@@ -660,14 +597,9 @@ add_step:
 	for (i = 0; i < nr_steps; i++) {
 		if (steps[i].type == SW_FENCE_SIGNAL) {
 			tmp = steps[i].idx + steps[i].target;
-			if (tmp < 0 || tmp >= i ||
-			    steps[tmp].type != SW_FENCE) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid sw fence target %u!\n",
-						i);
-				return NULL;
-			}
+			check_arg(tmp < 0 || tmp >= i ||
+				  steps[tmp].type != SW_FENCE,
+				  "Invalid sw fence target %u!\n", i);
 		}
 	}
 
-- 
2.19.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH i-g-t 08/21] gem_wsim: More wsim_err
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A few more opportunities to compact the code by using the error logging
helper.
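
For readers without the previous patch at hand, the helper's pattern can be
sketched in isolation as below. This is a minimal illustration, not the code
from gem_wsim.c: sketch_err() and check_clients() are hypothetical names, and
the sketch returns the character count (the real wsim_err returns void) only so
that the quiet path can be observed.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed global verbosity level, mirroring the 'verbose' test in the patch. */
static int verbose = 1;

/*
 * Print to stderr only when verbose. The format attribute lets the compiler
 * type-check the arguments against the format string at every call site.
 * Returns the number of characters written, or 0 when output is suppressed.
 */
static int __attribute__((format(printf, 1, 2)))
sketch_err(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!verbose)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);

	return n;
}

/* Hypothetical option check: returns 1 (an error exit code) on a bad value. */
static int check_clients(int clients)
{
	if (clients < 1) {
		sketch_err("Invalid client count %d!\n", clients);
		return 1;
	}

	return 0;
}
```

Keeping the verbosity test inside the helper is what lets each call site in
main() shrink from three or four lines to one.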

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 54 ++++++++++++-------------------------------
 1 file changed, 15 insertions(+), 39 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index ef97311a6879..f1fcef5dcfba 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -2396,9 +2396,7 @@ int main(int argc, char **argv)
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Only one master workload can be given!\n");
+				wsim_err("Only one master workload can be given!\n");
 				return 1;
 			}
 			master_workload = nr_w_args;
@@ -2411,9 +2409,7 @@ int main(int argc, char **argv)
 			break;
 		case 'a':
 			if (append_workload_arg) {
-				if (verbose)
-					fprintf(stderr,
-						"Only one append workload can be given!\n");
+				wsim_err("Only one append workload can be given!\n");
 				return 1;
 			}
 			append_workload_arg = optarg;
@@ -2474,10 +2470,8 @@ int main(int argc, char **argv)
 			}
 
 			if (!balancer) {
-				if (verbose)
-					fprintf(stderr,
-						"Unknown balancing mode '%s'!\n",
-						optarg);
+				wsim_err("Unknown balancing mode '%s'!\n",
+					 optarg);
 				return 1;
 			}
 			break;
@@ -2490,14 +2484,12 @@ int main(int argc, char **argv)
 	}
 
 	if ((flags & HEARTBEAT) && !(flags & SEQNO)) {
-		if (verbose)
-			fprintf(stderr, "Heartbeat needs a seqno based balancer!\n");
+		wsim_err("Heartbeat needs a seqno based balancer!\n");
 		return 1;
 	}
 
 	if ((flags & VCS2REMAP) && (flags & I915)) {
-		if (verbose)
-			fprintf(stderr, "VCS remapping not supported with i915 balancing!\n");
+		wsim_err("VCS remapping not supported with i915 balancing!\n");
 		return 1;
 	}
 
@@ -2514,31 +2506,24 @@ int main(int argc, char **argv)
 	}
 
 	if (!nr_w_args) {
-		if (verbose)
-			fprintf(stderr, "No workload descriptor(s)!\n");
+		wsim_err("No workload descriptor(s)!\n");
 		return 1;
 	}
 
 	if (nr_w_args > 1 && clients > 1) {
-		if (verbose)
-			fprintf(stderr,
-				"Cloned clients cannot be combined with multiple workloads!\n");
+		wsim_err("Cloned clients cannot be combined with multiple workloads!\n");
 		return 1;
 	}
 
 	if ((flags & GLOBAL_BALANCE) && !balancer) {
-		if (verbose)
-			fprintf(stderr,
-				"Balancer not specified in global balancing mode!\n");
+		wsim_err("Balancer not specified in global balancing mode!\n");
 		return 1;
 	}
 
 	if (append_workload_arg) {
 		append_workload_arg = load_workload_descriptor(append_workload_arg);
 		if (!append_workload_arg) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to load append workload descriptor!\n");
+			wsim_err("Failed to load append workload descriptor!\n");
 			return 1;
 		}
 	}
@@ -2547,9 +2532,7 @@ int main(int argc, char **argv)
 		struct w_arg arg = { NULL, append_workload_arg, 0 };
 		app_w = parse_workload(&arg, flags, NULL);
 		if (!app_w) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to parse append workload!\n");
+			wsim_err("Failed to parse append workload!\n");
 			return 1;
 		}
 	}
@@ -2561,18 +2544,13 @@ int main(int argc, char **argv)
 		w_args[i].desc = load_workload_descriptor(w_args[i].filename);
 
 		if (!w_args[i].desc) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to load workload descriptor %u!\n",
-					i);
+			wsim_err("Failed to load workload descriptor %u!\n", i);
 			return 1;
 		}
 
 		wrk[i] = parse_workload(&w_args[i], flags, app_w);
 		if (!wrk[i]) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to parse workload %u!\n", i);
+			wsim_err("Failed to parse workload %u!\n", i);
 			return 1;
 		}
 	}
@@ -2632,10 +2610,8 @@ int main(int argc, char **argv)
 		if (balancer && balancer->init) {
 			int ret = balancer->init(balancer, w[i]);
 			if (ret) {
-				if (verbose)
-					fprintf(stderr,
-						"Failed to initialize balancing! (%u=%d)\n",
-						i, ret);
+				wsim_err("Failed to initialize balancing! (%u=%d)\n",
+					 i, ret);
 				return 1;
 			}
 		}
-- 
2.19.1

* [PATCH i-g-t 09/21] gem_wsim: Submit fence support
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add support for submit fences, handled in a way similar to normal input
fences. For example:

  1.RCS.500-1000.0.0
  1.VCS1.3000.s-1.0
  1.VCS2.3000.s-2.0

Submit fences are signalled when the originating request enters the
submission backend.
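
The prefix handling in the dependency field can be sketched on its own as
below. This is a simplified illustration of the rule, not the parser from the
patch: classify_dep() and the dep_kind enum are hypothetical names, and the
sketch handles a single token rather than a '/' separated list.

```c
#include <stdlib.h>

/* Hypothetical classification of one dependency token. */
enum dep_kind {
	DEP_DATA,		/* "-1"  : plain data dependency */
	DEP_FENCE,		/* "f-1" : normal input (sync) fence */
	DEP_SUBMIT_FENCE,	/* "s-1" : submit fence */
	DEP_INVALID,		/* any other prefix */
};

/*
 * Classify a token by its first character and store the relative step
 * offset in *offset. A leading '-' or digit means a data dependency;
 * 'f' and 's' select the two fence types and are skipped before the
 * number is parsed. *offset is left untouched for invalid tokens.
 */
static enum dep_kind classify_dep(const char *token, int *offset)
{
	enum dep_kind kind;

	if (token[0] == '-' || (token[0] >= '0' && token[0] <= '9')) {
		kind = DEP_DATA;
	} else if (token[0] == 'f') {
		kind = DEP_FENCE;
		token++;
	} else if (token[0] == 's') {
		kind = DEP_SUBMIT_FENCE;
		token++;
	} else {
		return DEP_INVALID;
	}

	*offset = atoi(token);

	return kind;
}
```

In the patch itself the result selects between I915_EXEC_FENCE_SUBMIT and
I915_EXEC_FENCE_IN at execbuf time. The submit_fence flag there is stored once
per dependency list rather than per entry, so it appears that mixing 'f' and
's' tokens within one step would let the last token decide for all of them.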

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 20 ++++++++++++++++----
 benchmarks/wsim/README | 17 +++++++++++++++++
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index f1fcef5dcfba..5245692df6eb 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 struct deps
 {
 	int nr;
+	bool submit_fence;
 	int *list;
 };
 
@@ -253,17 +254,23 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
 		   w->data_deps.list == w->fence_deps.list);
 
 	while ((token = strtok_r(tstart, "/", &tctx)) != NULL) {
+		bool submit_fence = false;
 		char *str = token;
 		struct deps *deps;
 		int dep;
 
 		tstart = NULL;
 
-		if (strlen(token) > 1 && token[0] == 'f') {
+		if (str[0] == '-' || (str[0] >= '0' && str[0] <= '9')) {
+			deps = &w->data_deps;
+		} else {
+			if (str[0] == 's')
+				submit_fence = true;
+			else if (str[0] != 'f')
+				return -1;
+
 			deps = &w->fence_deps;
 			str++;
-		} else {
-			deps = &w->data_deps;
 		}
 
 		dep = atoi(str);
@@ -281,6 +288,7 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
 					     sizeof(*deps->list) * deps->nr);
 			igt_assert(deps->list);
 			deps->list[deps->nr - 1] = dep;
+			deps->submit_fence = submit_fence;
 		}
 	}
 
@@ -1921,7 +1929,11 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 		igt_assert(tgt >= 0 && tgt < w->idx);
 		igt_assert(wrk->steps[tgt].emit_fence > 0);
 
-		w->eb.flags |= I915_EXEC_FENCE_IN;
+		if (w->fence_deps.submit_fence)
+			w->eb.flags |= I915_EXEC_FENCE_SUBMIT;
+		else
+			w->eb.flags |= I915_EXEC_FENCE_IN;
+
 		w->eb.rsvd2 = wrk->steps[tgt].emit_fence;
 	}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 205cd6c93afb..4786f116b4ac 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -114,6 +114,23 @@ runnable. When the second RCS batch completes the standalone fence is signaled
 which allows the two VCS batches to be executed. Finally we wait until the both
 VCS batches have completed before starting the (optional) next iteration.
 
+Submit fences
+-------------
+
+Submit fences are a type of input fence which is signalled when the originating
+batch buffer is submitted to the GPU. (In contrast to normal sync fences, which
+are signalled when the batch completes.)
+
+Submit fences use the same syntax as sync fences, with a lower-case 's' prefix
+selecting them. E.g.:
+
+  1.RCS.500-1000.0.0
+  1.VCS1.3000.s-1.0
+  1.VCS2.3000.s-2.0
+
+Here the VCS1 and VCS2 batches will only be submitted for execution once the
+RCS batch enters the GPU.
+
 Context priority
 ----------------
 
-- 
2.19.1

* [PATCH i-g-t 10/21] gem_wsim: Extract str to engine lookup
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
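
The lookup being factored out can be sketched standalone as below. This is an
illustration only: name_to_engine() and the engine_names table are hypothetical
stand-ins for str_to_engine() and ring_str_map, and the table contents and
order are assumed rather than taken from this patch.

```c
#include <stddef.h>
#include <strings.h>

/* Assumed name table; the real ring_str_map lives in gem_wsim.c. */
static const char *engine_names[] = {
	"DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS",
};

/*
 * Case-insensitive lookup of an engine name.
 * Returns the table index on a match, or -1 when the name is unknown.
 */
static int name_to_engine(const char *str)
{
	size_t i;

	for (i = 0; i < sizeof(engine_names) / sizeof(engine_names[0]); i++) {
		if (!strcasecmp(str, engine_names[i]))
			return (int)i;
	}

	return -1;
}
```

Because the helper reports failure as -1, the caller has to keep the result in
a signed variable for an "i < 0" check to be meaningful. If the index variable
in parse_workload is unsigned, which the %u format used with it elsewhere
suggests but this excerpt does not confirm, that check could never fire.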

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 5245692df6eb..f654decb24cc 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -318,6 +318,18 @@ wsim_err(const char *fmt, ...)
 	} \
 }
 
+static int str_to_engine(const char *str)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
+		if (!strcasecmp(str, ring_str_map[i]))
+			return i;
+	}
+
+	return -1;
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -480,22 +492,18 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		}
 
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
-			unsigned int old_valid = valid;
-
 			fstart = NULL;
 
-			for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
-				if (!strcasecmp(field, ring_str_map[i])) {
-					step.engine = i;
-					if (step.engine == BCS)
-						bcs_used = true;
-					valid++;
-					break;
-				}
-			}
-
-			check_arg(old_valid == valid,
+			i = str_to_engine(field);
+			check_arg(i < 0,
 				  "Invalid engine id at step %u!\n", nr_steps);
+			if (i >= 0)
+				valid++;
+
+			step.engine = i;
+
+			if (step.engine == BCS)
+				bcs_used = true;
 		}
 
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
-- 
2.19.1

* [PATCH i-g-t 11/21] gem_wsim: Engine map support
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

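Support the new i915 uAPI for configuring contexts with engine maps.

A new 'M' workload step, for example M.1.VCS1|VCS2, sets up an engine
map on the given context. Only VCS1 and VCS2 are accepted in a map for
now, and batches submitted to a context with an engine map must name an
explicit engine rather than VCS.

Please refer to the README file for a more detailed explanation.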

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
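Note for readers: the sketch below is not part of the patch. It is a
minimal, standalone illustration of how a list such as "VCS1|VCS2" can be
split into an engine map, and of why the first listed engine ends up at
submission index 1 (slot 0 is kept free, which the patch reserves for the
virtual engine). The names engine_id, engine_map, name_to_engine, parse_map
and map_index are hypothetical and do not exist in gem_wsim.c. Unlike the
patch, the sketch copies its input before tokenizing and checks the
realloc() result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the gem_wsim engine ids. */
enum engine_id { ENG_VCS1, ENG_VCS2, ENG_COUNT };

static const char *engine_names[ENG_COUNT] = {
	[ENG_VCS1] = "VCS1",
	[ENG_VCS2] = "VCS2",
};

struct engine_map {
	unsigned int count;
	enum engine_id *engines;
};

/* Returns the engine id for a name, or -1 if the name is unknown. */
static int name_to_engine(const char *name)
{
	for (int i = 0; i < ENG_COUNT; i++) {
		if (!strcmp(name, engine_names[i]))
			return i;
	}

	return -1;
}

/*
 * Parse a '|' separated list into map. The input is copied first because
 * strtok_r() writes into the buffer it tokenizes.
 * Returns 0 on success, -1 on an unknown engine or allocation failure.
 */
static int parse_map(struct engine_map *map, const char *list)
{
	char *copy, *ctx = NULL, *tok;
	int ret = 0;

	assert(map && list);

	copy = strdup(list);
	if (!copy)
		return -1;

	for (tok = strtok_r(copy, "|", &ctx); tok;
	     tok = strtok_r(NULL, "|", &ctx)) {
		int engine = name_to_engine(tok);
		enum engine_id *tmp;

		if (engine < 0) {
			ret = -1;
			break;
		}

		tmp = realloc(map->engines, (map->count + 1) * sizeof(*tmp));
		if (!tmp) {
			ret = -1;
			break;
		}

		map->engines = tmp;
		map->engines[map->count++] = (enum engine_id)engine;
	}

	free(copy);
	return ret;
}

/*
 * Submission index for an engine. Slot 0 is reserved, so the first listed
 * engine maps to 1. Returns 0 if the engine is not in the map.
 */
static unsigned int map_index(const struct engine_map *map, enum engine_id e)
{
	for (unsigned int i = 0; i < map->count; i++) {
		if (map->engines[i] == e)
			return i + 1;
	}

	return 0;
}
```

With a map parsed from "VCS2|VCS1", map_index() gives 1 for ENG_VCS2 and 2
for ENG_VCS1, so the index follows the order in the workload descriptor
rather than the hardware instance number.
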
 benchmarks/gem_wsim.c  | 212 ++++++++++++++++++++++++++++++++++-------
 benchmarks/wsim/README |  17 +++-
 2 files changed, 192 insertions(+), 37 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index f654decb24cc..e6b7b8f5335d 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -57,6 +57,7 @@
 #include "ewma.h"
 
 enum intel_engine_id {
+	DEFAULT,
 	RCS,
 	BCS,
 	VCS,
@@ -81,7 +82,8 @@ enum w_type
 	SW_FENCE,
 	SW_FENCE_SIGNAL,
 	CTX_PRIORITY,
-	PREEMPTION
+	PREEMPTION,
+	ENGINE_MAP
 };
 
 struct deps
@@ -115,6 +117,10 @@ struct w_step
 		int throttle;
 		int fence_signal;
 		int priority;
+		struct {
+			unsigned int engine_map_count;
+			enum intel_engine_id *engine_map;
+		};
 	};
 
 	/* Implementation details */
@@ -142,6 +148,8 @@ DECLARE_EWMA(uint64_t, rt, 4, 2)
 struct ctx {
 	uint32_t id;
 	int priority;
+	unsigned int engine_map_count;
+	enum intel_engine_id *engine_map;
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
@@ -200,10 +208,10 @@ struct workload
 		int fd;
 		bool first;
 		unsigned int num_engines;
-		unsigned int engine_map[5];
+		unsigned int engine_map[NUM_ENGINES];
 		uint64_t t_prev;
-		uint64_t prev[5];
-		double busy[5];
+		uint64_t prev[NUM_ENGINES];
+		double busy[NUM_ENGINES];
 	} busy_balancer;
 };
 
@@ -234,6 +242,7 @@ static int fd;
 #define REG(x) (volatile uint32_t *)((volatile char *)igt_global_mmio + x)
 
 static const char *ring_str_map[NUM_ENGINES] = {
+	[DEFAULT] = "DEFAULT",
 	[RCS] = "RCS",
 	[BCS] = "BCS",
 	[VCS] = "VCS",
@@ -330,6 +339,37 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static int parse_engine_map(struct w_step *step, const char *_str)
+{
+	char *token, *tctx = NULL, *tstart = (char *)_str;
+
+	while ((token = strtok_r(tstart, "|", &tctx))) {
+		enum intel_engine_id engine;
+
+		tstart = NULL;
+
+		if (!strcmp(token, "DEFAULT"))
+			return -1;
+		else if (!strcmp(token, "VCS"))
+			return -1;
+
+		engine = str_to_engine(token);
+		if ((int)engine < 0)
+			return -1;
+
+		if (engine != VCS1 && engine != VCS2)
+			return -1; /* TODO */
+
+		step->engine_map_count++;
+		step->engine_map = realloc(step->engine_map,
+					   step->engine_map_count *
+					   sizeof(step->engine_map[0]));
+		step->engine_map[step->engine_map_count - 1] = engine;
+	}
+
+	return 0;
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -448,6 +488,33 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			} else if (!strcmp(field, "f")) {
 				step.type = SW_FENCE;
 				goto add_step;
+			} else if (!strcmp(field, "M")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx)) !=
+				    NULL) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid engine map format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0) {
+						step.context = tmp;
+					} else {
+						tmp = parse_engine_map(&step,
+								       field);
+						check_arg(tmp < 0,
+							  "Invalid engine map list at step %u!\n",
+							  nr_steps);
+					}
+
+					nr++;
+				}
+
+				step.type = ENGINE_MAP;
+				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
@@ -497,9 +564,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			i = str_to_engine(field);
 			check_arg(i < 0,
 				  "Invalid engine id at step %u!\n", nr_steps);
-			if (i >= 0)
-				valid++;
-
+			valid++;
 			step.engine = i;
 
 			if (step.engine == BCS)
@@ -774,6 +839,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 }
 
 static const unsigned int eb_engine_map[NUM_ENGINES] = {
+	[DEFAULT] = I915_EXEC_DEFAULT,
 	[RCS] = I915_EXEC_RENDER,
 	[BCS] = I915_EXEC_BLT,
 	[VCS] = I915_EXEC_BSD,
@@ -790,18 +856,42 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
 	if (engine == VCS2 && (flags & VCS2REMAP))
 		engine = BCS;
 
-	if ((flags & I915) && engine == VCS) {
+	if ((flags & I915) && engine == VCS)
 		eb->flags = 0;
-	} else {
+	else
 		eb->flags = eb_engine_map[engine];
+}
+
+static unsigned int
+find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
+{
+	unsigned int i;
+
+	for (i = 0; i < ctx->engine_map_count; i++) {
+		if (ctx->engine_map[i] == engine)
+			return i + 1;
 	}
+
+	igt_assert(0);
+	return 0;
+}
+
+static struct ctx *
+__get_ctx(struct workload *wrk, struct w_step *w)
+{
+	return &wrk->ctx_list[w->context * 2];
 }
 
 static void
-eb_update_flags(struct w_step *w, enum intel_engine_id engine,
-		unsigned int flags)
+eb_update_flags(struct workload *wrk, struct w_step *w,
+		enum intel_engine_id engine, unsigned int flags)
 {
-	eb_set_engine(&w->eb, engine, flags);
+	struct ctx *ctx = __get_ctx(wrk, w);
+
+	if (ctx->engine_map)
+		w->eb.flags = find_engine_in_map(ctx, engine);
+	else
+		eb_set_engine(&w->eb, engine, flags);
 
 	w->eb.flags |= I915_EXEC_HANDLE_LUT;
 	w->eb.flags |= I915_EXEC_NO_RELOC;
@@ -820,12 +910,6 @@ get_status_objects(struct workload *wrk)
 		return wrk->status_object;
 }
 
-static struct ctx *
-__get_ctx(struct workload *wrk, struct w_step *w)
-{
-	return &wrk->ctx_list[w->context * 2];
-}
-
 static uint32_t
 get_ctxid(struct workload *wrk, struct w_step *w)
 {
@@ -895,7 +979,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		engine = VCS2;
 	else if (flags & SWAPVCS && engine == VCS2)
 		engine = VCS1;
-	eb_update_flags(w, engine, flags);
+	eb_update_flags(wrk, w, engine, flags);
 #ifdef DEBUG
 	printf("%u: %u:|", w->idx, w->eb.buffer_count);
 	for (i = 0; i <= j; i++)
@@ -918,7 +1002,7 @@ static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
 		gem_context_set_param(fd, &param);
 }
 
-static void
+static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
 	unsigned int ctx_vcs = 0;
@@ -979,30 +1063,53 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	/*
 	 * Identify if contexts target specific engine instances and if they
 	 * want to be balanced.
+	 *
+	 * Transfer over engine map configuration from the workload step.
 	 */
 	for (j = 0; j < wrk->nr_ctxs; j += 2) {
 		bool targets = false;
 		bool balance = false;
 
 		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
-			if (w->type != BATCH)
-				continue;
-
 			if (w->context != (j / 2))
 				continue;
 
-			if (w->engine == VCS)
-				balance = true;
-			else
-				targets = true;
+			if (w->type == BATCH) {
+				if (w->engine == VCS)
+					balance = true;
+				else
+					targets = true;
+			} else if (w->type == ENGINE_MAP) {
+				wrk->ctx_list[j].engine_map = w->engine_map;
+				wrk->ctx_list[j].engine_map_count =
+					w->engine_map_count;
+			}
 		}
 
-		if (flags & I915) {
-			wrk->ctx_list[j].targets_instance = targets;
+		wrk->ctx_list[j].targets_instance = targets;
+		if (flags & I915)
 			wrk->ctx_list[j].wants_balance = balance;
+	}
+
+	/*
+	 * Ensure VCS is not allowed with engine map contexts.
+	 */
+	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+			if (w->context != (j / 2))
+				continue;
+
+			if (w->type != BATCH)
+				continue;
+
+			if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
+				wsim_err("Batches targeting engine maps must use explicit engines!\n");
+				return -1;
+			}
 		}
 	}
 
+
 	/*
 	 * Create and configure contexts.
 	 */
@@ -1013,7 +1120,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		if (ctx->id)
 			continue;
 
-		if (flags & I915) {
+		if ((flags & I915) || ctx->engine_map) {
 			struct drm_i915_gem_context_create_ext_setparam ext = {
 				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
 				.param.param = I915_CONTEXT_PARAM_VM,
@@ -1043,7 +1150,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				break;
 			}
 
-			if (!ctx->targets_instance)
+			if (!ctx->engine_map && !ctx->targets_instance)
 				args.flags |=
 				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
 
@@ -1076,7 +1183,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		 * both want to target specific engines and be balanced by i915?
 		 */
 		if ((flags & I915) && ctx->wants_balance &&
-		    ctx->targets_instance) {
+		    ctx->targets_instance && !ctx->engine_map) {
 			struct drm_i915_gem_context_create_ext_setparam ext = {
 				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
 				.param.param = I915_CONTEXT_PARAM_VM,
@@ -1101,7 +1208,33 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			__ctx_set_prio(ctx_id, wrk->prio);
 		}
 
-		if (ctx->wants_balance) {
+		if (ctx->engine_map) {
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
+							  ctx->engine_map_count + 1);
+			struct drm_i915_gem_context_param param = {
+				.ctx_id = ctx_id,
+				.param = I915_CONTEXT_PARAM_ENGINES,
+				.size = sizeof(set_engines),
+				.value = to_user_pointer(&set_engines),
+			};
+
+			set_engines.extensions = 0;
+
+			/* Reserve slot for virtual engine. */
+			set_engines.engines[0].engine_class =
+				I915_ENGINE_CLASS_INVALID;
+			set_engines.engines[0].engine_instance =
+				I915_ENGINE_CLASS_INVALID_NONE;
+
+			for (j = 1; j <= ctx->engine_map_count; j++) {
+				set_engines.engines[j].engine_class =
+					I915_ENGINE_CLASS_VIDEO; /* FIXME */
+				set_engines.engines[j].engine_instance =
+					ctx->engine_map[j - 1] - VCS1; /* FIXME */
+			}
+
+			gem_context_set_param(fd, &param);
+		} else if (ctx->wants_balance) {
 			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
 				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
 				.num_siblings = 2,
@@ -1181,6 +1314,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		alloc_step_batch(wrk, w, _flags);
 	}
+
+	return 0;
 }
 
 static double elapsed(const struct timespec *start, const struct timespec *end)
@@ -1918,7 +2053,7 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	uint32_t seqno = new_seqno(wrk, engine);
 	unsigned int i;
 
-	eb_update_flags(w, engine, flags);
+	eb_update_flags(wrk, w, engine, flags);
 
 	if (flags & SEQNO)
 		update_bb_seqno(w, engine, seqno);
@@ -2067,7 +2202,8 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
-			} else if (w->type == PREEMPTION) {
+			} else if (w->type == PREEMPTION ||
+				   w->type == ENGINE_MAP) {
 				continue;
 			}
 
@@ -2625,7 +2761,11 @@ int main(int argc, char **argv)
 		w[i]->print_stats = verbose > 1 ||
 				    (verbose > 0 && master_workload == i);
 
-		prepare_workload(i, w[i], flags_);
+		if (prepare_workload(i, w[i], flags_)) {
+			wsim_err("Failed to prepare workload %u!\n", i);
+			return 1;
+		}
+
 
 		if (balancer && balancer->init) {
 			int ret = balancer->init(balancer, w[i]);
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 4786f116b4ac..4b14aa28bfa7 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -3,6 +3,7 @@ Workload descriptor format
 
 ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
 f
@@ -23,10 +24,11 @@ Additional workload steps are also supported:
  'q' - Throttle to n max queue depth.
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
+ 'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
 
-Engine ids: RCS, BCS, VCS, VCS1, VCS2, VECS
+Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
 
 Example (leading spaces must not be present in the actual file):
 ----------------------------------------------------------------
@@ -161,3 +163,16 @@ The same context is then marked to have batches which can be preempted every
 
 Same as with context priority, context preemption commands are valid until
 optionally overriden by another preemption control change on the same context.
+
+Engine maps
+-----------
+
+Engine maps are a per-context feature which changes the way engine selection is
+done in the driver.
+
+Example:
+
+  M.1.VCS1|VCS2
+
+This sets up context 1 with an engine map containing the VCS1 and VCS2 engines.
+Submission to this context can now only reference these two engines.
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH i-g-t 12/21] gem_wsim: Save some lines by changing to implicit NULL checking
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can improve the parsing loop readability a bit more by avoiding some
line breaks caused by explicit NULL checks.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
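Note for readers: the sketch below is not part of the patch. It shows the
implicit NULL check idiom in isolation, on a tokenizing loop of the same
shape as the ones changed here. The helper name count_fields is
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/*
 * Count the '.' separated fields in buf. buf is modified in place, as
 * strtok_r() overwrites each delimiter it consumes with '\0'.
 */
static unsigned int count_fields(char *buf)
{
	char *field, *ctx = NULL, *start = buf;
	unsigned int nr = 0;

	assert(buf);

	/*
	 * The extra parentheses mark the assignment as intentional; the
	 * loop ends when strtok_r() returns NULL, with no "!= NULL" needed.
	 */
	while ((field = strtok_r(start, ".", &ctx))) {
		start = NULL; /* later calls continue from the saved context */
		nr++;
	}

	return nr;
}
```

For a writable buffer holding "1.VCS1.1000.0.0" the function returns 5.
The behaviour is the same as with an explicit comparison against NULL; only
the line length changes, which is what lets the conditions fit on one line.
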
 benchmarks/gem_wsim.c | 39 +++++++++++++++------------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index e6b7b8f5335d..4dbfc3e922a9 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -385,7 +385,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 	igt_assert(desc);
 
-	while ((_token = strtok_r(tstart, ",", &tctx)) != NULL) {
+	while ((_token = strtok_r(tstart, ",", &tctx))) {
 		tstart = NULL;
 		token = strdup(_token);
 		igt_assert(token);
@@ -393,12 +393,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		valid = 0;
 		memset(&step, 0, sizeof(step));
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid delay at step %u!\n",
@@ -408,8 +407,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid period at step %u!\n",
@@ -420,8 +418,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -441,8 +438,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0 ||
 						  ((int)nr_steps + tmp) < 0,
@@ -453,8 +449,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid throttle at step %u!\n",
@@ -464,8 +459,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid qd throttle at step %u!\n",
@@ -475,8 +469,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0,
 						  "Invalid sw fence signal at step %u!\n",
@@ -490,8 +483,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "M")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -517,8 +509,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -558,7 +549,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			i = str_to_engine(field);
@@ -571,7 +562,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				bcs_used = true;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			char *sep = NULL;
 			long int tmpl;
 
@@ -599,7 +590,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
@@ -609,7 +600,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			check_arg(strlen(field) != 1 ||
-- 
2.19.1


* [igt-dev] [PATCH i-g-t 12/21] gem_wsim: Save some lines by changing to implicit NULL checking
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can improve the parsing loop readability a bit more by avoiding some
line breaks caused by explicit NULL checks.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 39 +++++++++++++++------------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index e6b7b8f5335d..4dbfc3e922a9 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -385,7 +385,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 	igt_assert(desc);
 
-	while ((_token = strtok_r(tstart, ",", &tctx)) != NULL) {
+	while ((_token = strtok_r(tstart, ",", &tctx))) {
 		tstart = NULL;
 		token = strdup(_token);
 		igt_assert(token);
@@ -393,12 +393,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		valid = 0;
 		memset(&step, 0, sizeof(step));
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid delay at step %u!\n",
@@ -408,8 +407,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid period at step %u!\n",
@@ -420,8 +418,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -441,8 +438,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0 ||
 						  ((int)nr_steps + tmp) < 0,
@@ -453,8 +449,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid throttle at step %u!\n",
@@ -464,8 +459,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid qd throttle at step %u!\n",
@@ -475,8 +469,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0,
 						  "Invalid sw fence signal at step %u!\n",
@@ -490,8 +483,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "M")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -517,8 +509,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -558,7 +549,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			i = str_to_engine(field);
@@ -571,7 +562,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				bcs_used = true;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			char *sep = NULL;
 			long int tmpl;
 
@@ -599,7 +590,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
@@ -609,7 +600,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			check_arg(strlen(field) != 1 ||
-- 
2.19.1


* [PATCH i-g-t 13/21] gem_wsim: Compact int command parsing with a macro
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Parsing an integer workload descriptor field is a common pattern which we
can extract to a helper macro and by doing so further improve the
readability of the main parsing loop.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 80 ++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 55 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 4dbfc3e922a9..c2e13d9939c2 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -370,6 +370,15 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 	return 0;
 }
 
+#define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
+	if ((field = strtok_r(fstart, ".", &fctx))) { \
+		tmp = atoi(field); \
+		check_arg(_COND_, _ERR_, nr_steps); \
+		step.type = _STEP_; \
+		step._FIELD_ = tmp; \
+		goto add_step; \
+	} \
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -397,25 +406,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp <= 0,
-						  "Invalid delay at step %u!\n",
-						  nr_steps);
-					step.type = DELAY;
-					step.delay = tmp;
-					goto add_step;
-				}
+				int_field(DELAY, delay, tmp <= 0,
+					  "Invalid delay at step %u!\n");
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp <= 0,
-						  "Invalid period at step %u!\n",
-						  nr_steps);
-					step.type = PERIOD;
-					step.period = tmp;
-					goto add_step;
-				}
+				int_field(PERIOD, period, tmp <= 0,
+					  "Invalid period at step %u!\n");
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -438,46 +433,21 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp >= 0 ||
-						  ((int)nr_steps + tmp) < 0,
-						  "Invalid sync target at step %u!\n",
-						  nr_steps);
-					step.type = SYNC;
-					step.target = tmp;
-					goto add_step;
-				}
+				int_field(SYNC, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid sync target at step %u!\n");
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp < 0,
-						  "Invalid throttle at step %u!\n",
-						  nr_steps);
-					step.type = THROTTLE;
-					step.throttle = tmp;
-					goto add_step;
-				}
+				int_field(THROTTLE, throttle,
+					  tmp < 0,
+					  "Invalid throttle at step %u!\n");
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp < 0,
-						  "Invalid qd throttle at step %u!\n",
-						  nr_steps);
-					step.type = QD_THROTTLE;
-					step.throttle = tmp;
-					goto add_step;
-				}
+				int_field(QD_THROTTLE, throttle,
+					  tmp < 0,
+					  "Invalid qd throttle at step %u!\n");
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp >= 0,
-						  "Invalid sw fence signal at step %u!\n",
-						  nr_steps);
-					step.type = SW_FENCE_SIGNAL;
-					step.target = tmp;
-					goto add_step;
-				}
+				int_field(SW_FENCE_SIGNAL, target,
+					  tmp >= 0,
+					  "Invalid sw fence signal at step %u!\n");
 			} else if (!strcmp(field, "f")) {
 				step.type = SW_FENCE;
 				goto add_step;
-- 
2.19.1


* [PATCH i-g-t 14/21] gem_wsim: Engine map load balance command
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command for enabling a load balanced context map (aka
Virtual Engine). Example usage:

  B.1

This turns on load balancing for context one, assuming it has already been
configured with an engine map. Only the DEFAULT engine specifier can be used
with load balanced engine maps.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 73 ++++++++++++++++++++++++++++++++++++++----
 benchmarks/wsim/README | 18 +++++++++++
 2 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index c2e13d9939c2..b610a603f7b0 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -83,7 +83,8 @@ enum w_type
 	SW_FENCE_SIGNAL,
 	CTX_PRIORITY,
 	PREEMPTION,
-	ENGINE_MAP
+	ENGINE_MAP,
+	LOAD_BALANCE,
 };
 
 struct deps
@@ -121,6 +122,7 @@ struct w_step
 			unsigned int engine_map_count;
 			enum intel_engine_id *engine_map;
 		};
+		bool load_balance;
 	};
 
 	/* Implementation details */
@@ -501,6 +503,25 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = PREEMPTION;
 				goto add_step;
+			} else if (!strcmp(field, "B")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 0,
+						  "Invalid load balance format at step %u!\n",
+						  nr_steps);
+
+					step.context = tmp;
+					step.load_balance = true;
+
+					nr++;
+				}
+
+				step.type = LOAD_BALANCE;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -833,7 +854,7 @@ find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
 			return i + 1;
 	}
 
-	igt_assert(0);
+	igt_assert(ctx->wants_balance);
 	return 0;
 }
 
@@ -1044,12 +1065,19 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				wrk->ctx_list[j].engine_map = w->engine_map;
 				wrk->ctx_list[j].engine_map_count =
 					w->engine_map_count;
+			} else if (w->type == LOAD_BALANCE) {
+				if (!wrk->ctx_list[j].engine_map) {
+					wsim_err("Load balancing needs an engine map!\n");
+					return 1;
+				}
+				wrk->ctx_list[j].wants_balance =
+					w->load_balance;
 			}
 		}
 
 		wrk->ctx_list[j].targets_instance = targets;
 		if (flags & I915)
-			wrk->ctx_list[j].wants_balance = balance;
+			wrk->ctx_list[j].wants_balance |= balance;
 	}
 
 	/*
@@ -1063,10 +1091,19 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			if (w->type != BATCH)
 				continue;
 
-			if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
+			if (wrk->ctx_list[j].engine_map &&
+			    !wrk->ctx_list[j].wants_balance &&
+			    (w->engine == VCS || w->engine == DEFAULT)) {
 				wsim_err("Batches targetting engine maps must use explicit engines!\n");
 				return -1;
 			}
+
+			if (wrk->ctx_list[j].engine_map &&
+			    wrk->ctx_list[j].wants_balance &&
+			    w->engine != DEFAULT) {
+				wsim_err("Batches targetting load balanced maps must not use explicit engines!\n");
+				return -1;
+			}
 		}
 	}
 
@@ -1111,7 +1148,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				break;
 			}
 
-			if (!ctx->engine_map && !ctx->targets_instance)
+			if ((!ctx->engine_map && !ctx->targets_instance) ||
+			    (ctx->engine_map && ctx->wants_balance))
 				args.flags |=
 				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
 
@@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		if (ctx->engine_map) {
 			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
 							  ctx->engine_map_count + 1);
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
+								 ctx->engine_map_count);
 			struct drm_i915_gem_context_param param = {
 				.ctx_id = ctx_id,
 				.param = I915_CONTEXT_PARAM_ENGINES,
@@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				.value = to_user_pointer(&set_engines),
 			};
 
-			set_engines.extensions = 0;
+			if (ctx->wants_balance) {
+				set_engines.extensions =
+					to_user_pointer(&load_balance);
+
+				memset(&load_balance, 0, sizeof(load_balance));
+				load_balance.base.name =
+					I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+				load_balance.num_siblings =
+					ctx->engine_map_count;
+
+				for (j = 0; j < ctx->engine_map_count; j++) {
+					load_balance.engines[j].engine_class =
+						I915_ENGINE_CLASS_VIDEO; /* FIXME */
+					load_balance.engines[j].engine_instance =
+						ctx->engine_map[j] - VCS1; /* FIXME */
+				}
+			} else {
+				set_engines.extensions = 0;
+			}
 
 			/* Reserve slot for virtual engine. */
 			set_engines.engines[0].engine_class =
@@ -2164,7 +2222,8 @@ static void *run_workload(void *data)
 				}
 				continue;
 			} else if (w->type == PREEMPTION ||
-				   w->type == ENGINE_MAP) {
+				   w->type == ENGINE_MAP ||
+				   w->type == LOAD_BALANCE) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 4b14aa28bfa7..45fa1e0a1d76 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -3,6 +3,7 @@ Workload descriptor format
 
 ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
@@ -24,6 +25,7 @@ Additional workload steps are also supported:
  'q' - Throttle to n max queue depth.
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
+ 'B' - Turn on context load balancing.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -176,3 +178,19 @@ Example:
 
 This sets up context 1 with an engine map containing VCS1 and VCS2 engine.
 Submission to this context can now only reference these two engines.
+
+Context load balancing
+----------------------
+
+Context load balancing (aka Virtual Engine) is an i915 feature where the driver
+picks the best (most idle) engine to submit to from a previously configured
+engine map.
+
+Example:
+
+  B.1
+
+This enables load balancing for context number one.
+
+Submissions to load balanced contexts are only allowed to use the DEFAULT engine
+specifier.
-- 
2.19.1


* [PATCH i-g-t 15/21] gem_wsim: Engine bond command
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Engine bonds are an i915 uAPI applicable to load balanced contexts with an
engine map. They allow expressing rules of engine selection between two
contexts when submissions are also tied together with submit fences.

Please refer to the README for a more detailed description.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 107 ++++++++++++++++++++++++++++++++++++++---
 benchmarks/wsim/README |  50 +++++++++++++++++++
 2 files changed, 150 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index b610a603f7b0..cc6f4a742c12 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -85,6 +85,7 @@ enum w_type
 	PREEMPTION,
 	ENGINE_MAP,
 	LOAD_BALANCE,
+	BOND,
 };
 
 struct deps
@@ -100,6 +101,11 @@ struct w_arg {
 	int prio;
 };
 
+struct bond {
+	uint64_t mask;
+	enum intel_engine_id master;
+};
+
 struct w_step
 {
 	/* Workload step metadata */
@@ -123,6 +129,10 @@ struct w_step
 			enum intel_engine_id *engine_map;
 		};
 		bool load_balance;
+		struct {
+			uint64_t bond_mask;
+			enum intel_engine_id bond_master;
+		};
 	};
 
 	/* Implementation details */
@@ -152,6 +162,8 @@ struct ctx {
 	int priority;
 	unsigned int engine_map_count;
 	enum intel_engine_id *engine_map;
+	unsigned int bond_count;
+	struct bond *bonds;
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
@@ -522,6 +534,40 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = LOAD_BALANCE;
 				goto add_step;
+			} else if (!strcmp(field, "b")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr == 1 &&
+						  (tmp < -1 || tmp == 0),
+						  "Invalid siblings mask at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 2,
+						  "Invalid bond format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0) {
+						step.context = tmp;
+					} else if (nr == 1) {
+						step.bond_mask = tmp;
+					} else if (nr == 2) {
+						tmp = str_to_engine(field);
+						check_arg(tmp <= 0 ||
+							  tmp == VCS ||
+							  tmp == DEFAULT,
+							  "Invalid master engine at step %u!\n",
+							  nr_steps);
+						step.bond_master = tmp;
+					}
+
+					nr++;
+				}
+
+				step.type = BOND;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -1049,6 +1095,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 * Transfer over engine map configuration from the workload step.
 	 */
 	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		struct ctx *ctx = &wrk->ctx_list[j];
+
 		bool targets = false;
 		bool balance = false;
 
@@ -1062,16 +1110,28 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				else
 					targets = true;
 			} else if (w->type == ENGINE_MAP) {
-				wrk->ctx_list[j].engine_map = w->engine_map;
-				wrk->ctx_list[j].engine_map_count =
-					w->engine_map_count;
+				ctx->engine_map = w->engine_map;
+				ctx->engine_map_count = w->engine_map_count;
 			} else if (w->type == LOAD_BALANCE) {
-				if (!wrk->ctx_list[j].engine_map) {
+				if (!ctx->engine_map) {
 					wsim_err("Load balancing needs an engine map!\n");
 					return 1;
 				}
-				wrk->ctx_list[j].wants_balance =
-					w->load_balance;
+				ctx->wants_balance = w->load_balance;
+			} else if (w->type == BOND) {
+				if (!ctx->wants_balance) {
+					wsim_err("Engine bonds need load balancing engine map!\n");
+					return 1;
+				}
+				ctx->bond_count++;
+				ctx->bonds = realloc(ctx->bonds,
+						     ctx->bond_count *
+						     sizeof(struct bond));
+				igt_assert(ctx->bonds);
+				ctx->bonds[ctx->bond_count - 1].mask =
+					w->bond_mask;
+				ctx->bonds[ctx->bond_count - 1].master =
+					w->bond_master;
 			}
 		}
 
@@ -1252,6 +1312,38 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 					ctx->engine_map[j - 1] - VCS1; /* FIXME */
 			}
 
+			for (j = 0; j < ctx->bond_count; j++) {
+				unsigned long mask = ctx->bonds[j].mask;
+				I915_DEFINE_CONTEXT_ENGINES_BOND(bond,
+								 __builtin_popcount(mask));
+				struct i915_context_engines_bond *p = NULL, *prev;
+				unsigned int b, e;
+
+				prev = p;
+				p = alloca(sizeof(bond));
+				assert(p);
+				memset(p, 0, sizeof(bond));
+
+				if (j == 0)
+					load_balance.base.next_extension =
+						to_user_pointer(p);
+				else if (j < (ctx->bond_count - 1))
+					prev->base.next_extension =
+						to_user_pointer(p);
+
+				p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
+				p->virtual_index = 0;
+				p->master.engine_class =
+					I915_ENGINE_CLASS_VIDEO;
+				p->master.engine_instance =
+					ctx->bonds[j].master - VCS1;
+
+				for (b = 0, e = 1; mask; e++, mask >>= 1)
+					if (mask & 1)
+						p->engines[b++] =
+							set_engines.engines[e];
+			}
+
 			gem_context_set_param(fd, &param);
 		} else if (ctx->wants_balance) {
 			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
@@ -2223,7 +2315,8 @@ static void *run_workload(void *data)
 				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
-				   w->type == LOAD_BALANCE) {
+				   w->type == LOAD_BALANCE ||
+				   w->type == BOND) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 45fa1e0a1d76..6aec718bc812 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -7,6 +7,7 @@ B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
+b.<uint>.<uint>.<str>
 f
 
 For duration a range can be given from which a random value will be picked
@@ -26,6 +27,7 @@ Additional workload steps are also supported:
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
  'B' - Turn on context load balancing.
+ 'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -194,3 +196,51 @@ This enables load balancing for context number one.
 
 Submissions to load balanced contexts are only allowed to use the DEFAULT engine
 specifier.
+
+Engine bonds
+------------
+
+Engine bonds are extensions on load balanced contexts. They allow expressing
+rules of engine selection between two co-operating contexts tied with submit
+fences. In other words, the rule expression is telling the driver: "If you pick
+this engine for context one, then you have to pick that engine for context two".
+
+Syntax is:
+  b.<context>.<engine_mask>.<master_engine>
+
+Engine mask is a bitmask representing engines in the engine map configured for
+the same context.
+
+There can be multiple bonds tied to the same context.
+
+Example:
+
+  M.1.RCS|VECS
+  B.1
+  M.2.VCS1|VCS2
+  B.2
+  b.2.1.RCS
+  b.2.2.VECS
+
+This tells the driver that if it picked RCS for context one, it has to pick VCS1
+for context two. And if it picked VECS for context one, it has to pick VCS2 for
+context two.
+
+If we extend the above example with more workload directives:
+
+  1.DEFAULT.1000.0.0
+  2.DEFAULT.1000.s-1.0
+
+We get to a fully functional example where two batch buffers are submitted in a
+load balanced fashion, telling the driver they should run simultaneously and
+that valid engine pairs are either RCS + VCS1 (for two contexts respectively),
+or VECS + VCS2.
+
+This can also be extended using sync fences to improve chances of the first
+submission not getting on the hardware after the second one. Second block would
+then look like:
+
+  f
+  1.DEFAULT.1000.f-1.0
+  2.DEFAULT.1000.s-1.0
+  a.-3
-- 
2.19.1
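
To illustrate the 'b' step described in the README hunk above, here is a minimal standalone sketch. It is not the gem_wsim implementation: it assumes a simplified grammar (positive decimal mask only), and the names bond_step, parse_bond and mask_to_indices are hypothetical. It parses one bond descriptor and expands the mask into indices of the engine map configured for the same context.

```c
#define _POSIX_C_SOURCE 200809L /* for strtok_r */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed 'b' step; not the gem_wsim type. */
struct bond_step {
	unsigned int context; /* context the bond is attached to */
	uint64_t engine_mask; /* bit N selects entry N of that context's map */
	char master[16];      /* master engine name, e.g. "RCS" */
};

/* Parse a strictly positive decimal number; returns -1 on anything else. */
static long parse_positive(const char *tok)
{
	char *end;
	long val;

	if (!tok)
		return -1;
	val = strtol(tok, &end, 10);
	return (*end || val <= 0) ? -1 : val;
}

/*
 * Parse "b.<context>.<engine_mask>.<master_engine>".
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_bond(const char *desc, struct bond_step *out)
{
	char buf[64], *save = NULL, *tok;
	long ctx, mask;

	if (strlen(desc) >= sizeof(buf))
		return -1;
	strcpy(buf, desc);

	tok = strtok_r(buf, ".", &save);
	if (!tok || strcmp(tok, "b"))
		return -1;

	ctx = parse_positive(strtok_r(NULL, ".", &save));
	mask = parse_positive(strtok_r(NULL, ".", &save));
	if (ctx < 0 || mask < 0)
		return -1;

	tok = strtok_r(NULL, ".", &save);
	if (!tok || strlen(tok) >= sizeof(out->master))
		return -1;

	/* Any further field is a format error. */
	if (strtok_r(NULL, ".", &save))
		return -1;

	out->context = (unsigned int)ctx;
	out->engine_mask = (uint64_t)mask;
	strcpy(out->master, tok);
	return 0;
}

/*
 * Expand a bond mask into indices of the context's engine map.
 * Bits beyond map_count are ignored. Returns the number of indices written.
 */
static unsigned int mask_to_indices(uint64_t mask, unsigned int map_count,
				    unsigned int *idx)
{
	unsigned int n = 0, e;

	for (e = 0; e < map_count && mask; e++, mask >>= 1)
		if (mask & 1)
			idx[n++] = e;

	return n;
}
```

With the README example, parse_bond("b.2.2.VECS", &s) yields context 2, mask 2 and master "VECS", and mask_to_indices(2, 2, idx) then selects index 1, that is the second entry of M.2.VCS1|VCS2.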

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH i-g-t 16/21] gem_wsim: Some more example workloads
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A few additional workloads useful for experimenting with scheduling.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/wsim/frame-split-60fps.wsim      | 16 ++++++++++++++++
 benchmarks/wsim/high-composited-game.wsim   | 11 +++++++++++
 benchmarks/wsim/media-1080p-player.wsim     |  5 +++++
 benchmarks/wsim/medium-composited-game.wsim |  9 +++++++++
 4 files changed, 41 insertions(+)
 create mode 100644 benchmarks/wsim/frame-split-60fps.wsim
 create mode 100644 benchmarks/wsim/high-composited-game.wsim
 create mode 100644 benchmarks/wsim/media-1080p-player.wsim
 create mode 100644 benchmarks/wsim/medium-composited-game.wsim

diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
new file mode 100644
index 000000000000..cfbfcd39be7d
--- /dev/null
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -0,0 +1,16 @@
+X.1.0
+M.1.VCS1
+B.1
+X.2.0
+M.2.VCS2
+B.2
+b.2.1.VCS1
+f
+1.DEFAULT.4000-6000.f-1.0
+2.DEFAULT.4000-6000.s-1.0
+a.-3
+3.RCS.2000-4000.-3/-2.0
+3.VECS.2000.-1.0
+4.BCS.1000.-1.0
+s.-2
+p.16667
diff --git a/benchmarks/wsim/high-composited-game.wsim b/benchmarks/wsim/high-composited-game.wsim
new file mode 100644
index 000000000000..a90a2b2be95b
--- /dev/null
+++ b/benchmarks/wsim/high-composited-game.wsim
@@ -0,0 +1,11 @@
+1.RCS.500.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
diff --git a/benchmarks/wsim/media-1080p-player.wsim b/benchmarks/wsim/media-1080p-player.wsim
new file mode 100644
index 000000000000..bcbb0cfd2ad3
--- /dev/null
+++ b/benchmarks/wsim/media-1080p-player.wsim
@@ -0,0 +1,5 @@
+1.VCS.5000-10000.0.0
+2.RCS.1000-2000.-1.0
+P.3.1
+3.BCS.1000.-2.0
+p.16667
diff --git a/benchmarks/wsim/medium-composited-game.wsim b/benchmarks/wsim/medium-composited-game.wsim
new file mode 100644
index 000000000000..580883516168
--- /dev/null
+++ b/benchmarks/wsim/medium-composited-game.wsim
@@ -0,0 +1,9 @@
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
-- 
2.19.1
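
For readers new to the descriptor format used by the workloads above, the following is a rough sketch of how a single batch step of the form ctx.engine.duration_us.dependency.wait could be split into fields. It is an illustration under assumptions, not the gem_wsim parser: batch_step and parse_batch_step are hypothetical names, the dependency field is kept as an opaque string, and engine names are not validated.

```c
#define _POSIX_C_SOURCE 200809L /* for strtok_r */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of one batch step; not the gem_wsim w_step. */
struct batch_step {
	unsigned int ctx;    /* context id, starting at 1 */
	char engine[16];     /* engine name, e.g. "RCS" or "DEFAULT" */
	long min_us, max_us; /* duration range; equal when no range is given */
	char deps[32];       /* raw dependency field, e.g. "-3/-2" or "f-1" */
	int wait;            /* 1 to wait for completion, 0 otherwise */
};

/* Returns 0 on success and -1 on malformed input. */
static int parse_batch_step(const char *desc, struct batch_step *out)
{
	char buf[96], *save = NULL, *tok, *end;
	long val;

	if (strlen(desc) >= sizeof(buf))
		return -1;
	strcpy(buf, desc);

	/* Field 1: context id. */
	tok = strtok_r(buf, ".", &save);
	if (!tok)
		return -1;
	val = strtol(tok, &end, 10);
	if (*end || val <= 0)
		return -1;
	out->ctx = (unsigned int)val;

	/* Field 2: engine name. */
	tok = strtok_r(NULL, ".", &save);
	if (!tok || strlen(tok) >= sizeof(out->engine))
		return -1;
	strcpy(out->engine, tok);

	/* Field 3: duration in microseconds, optionally "min-max". */
	tok = strtok_r(NULL, ".", &save);
	if (!tok)
		return -1;
	val = strtol(tok, &end, 10);
	if (val <= 0)
		return -1;
	out->min_us = out->max_us = val;
	if (*end == '-') {
		val = strtol(end + 1, &end, 10);
		if (val <= out->min_us)
			return -1;
		out->max_us = val;
	}
	if (*end)
		return -1;

	/* Field 4: dependencies, left uninterpreted in this sketch. */
	tok = strtok_r(NULL, ".", &save);
	if (!tok || strlen(tok) >= sizeof(out->deps))
		return -1;
	strcpy(out->deps, tok);

	/* Field 5: wait flag, either "0" or "1". */
	tok = strtok_r(NULL, ".", &save);
	if (!tok || (strcmp(tok, "0") && strcmp(tok, "1")))
		return -1;
	out->wait = tok[0] - '0';

	return 0;
}
```

For example, the first line of media-1080p-player.wsim, "1.VCS.5000-10000.0.0", parses to context 1 on VCS with a duration picked between 5000 and 10000 us, dependency field "0" and no wait.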


* [PATCH i-g-t 17/21] gem_wsim: Infinite batch support
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

For simulating frame split workloads it is useful to express a batch which
ends at the same time as the parallel submission on the respective bonded
engine. For this we add support for infinite batch durations and the batch
terminate command ('T'). Syntax looks like this:

  1.RCS.*.0.0
  T.-1

The first step starts an infinite batch, and the second step terminates it
using the usual relative workload step addressing.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c                  | 119 +++++++++++++++++++------
 benchmarks/wsim/README                 |   9 +-
 benchmarks/wsim/frame-split-60fps.wsim |   6 +-
 3 files changed, 102 insertions(+), 32 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index cc6f4a742c12..97821b723b02 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -86,6 +86,7 @@ enum w_type
 	ENGINE_MAP,
 	LOAD_BALANCE,
 	BOND,
+	TERMINATE,
 };
 
 struct deps
@@ -113,6 +114,7 @@ struct w_step
 	unsigned int context;
 	unsigned int engine;
 	struct duration duration;
+	bool unbound_duration;
 	struct deps data_deps;
 	struct deps fence_deps;
 	int emit_fence;
@@ -143,7 +145,7 @@ struct w_step
 
 	struct drm_i915_gem_execbuffer2 eb;
 	struct drm_i915_gem_exec_object2 *obj;
-	struct drm_i915_gem_relocation_entry reloc[4];
+	struct drm_i915_gem_relocation_entry reloc[5];
 	unsigned long bb_sz;
 	uint32_t bb_handle;
 	uint32_t *seqno_value;
@@ -153,6 +155,7 @@ struct w_step
 	uint32_t *rt1_address;
 	uint32_t *latch_value;
 	uint32_t *latch_address;
+	uint32_t *recursive_bb_start;
 };
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
@@ -491,6 +494,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = ENGINE_MAP;
 				goto add_step;
+			} else if (!strcmp(field, "T")) {
+				int_field(TERMINATE, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid terminate target at step %u!\n");
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -605,23 +612,28 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 			fstart = NULL;
 
-			tmpl = strtol(field, &sep, 10);
-			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
-				  tmpl == LONG_MAX,
-				  "Invalid duration at step %u!\n", nr_steps);
-			step.duration.min = tmpl;
-
-			if (sep && *sep == '-') {
-				tmpl = strtol(sep + 1, NULL, 10);
-				check_arg(tmpl <= 0 ||
-					  tmpl <= step.duration.min ||
-					  tmpl == LONG_MIN ||
+			if (field[0] == '*') {
+				step.unbound_duration = true;
+			} else {
+				tmpl = strtol(field, &sep, 10);
+				check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
 					  tmpl == LONG_MAX,
-					  "Invalid duration range at step %u!\n",
+					  "Invalid duration at step %u!\n",
 					  nr_steps);
-				step.duration.max = tmpl;
-			} else {
-				step.duration.max = step.duration.min;
+				step.duration.min = tmpl;
+
+				if (sep && *sep == '-') {
+					tmpl = strtol(sep + 1, NULL, 10);
+					check_arg(tmpl <= 0 ||
+						tmpl <= step.duration.min ||
+						tmpl == LONG_MIN ||
+						tmpl == LONG_MAX,
+						"Invalid duration range at step %u!\n",
+						nr_steps);
+					step.duration.max = tmpl;
+				} else {
+					step.duration.max = step.duration.min;
+				}
 			}
 
 			valid++;
@@ -781,7 +793,7 @@ init_bb(struct w_step *w, unsigned int flags)
 	unsigned int i;
 	uint32_t *ptr;
 
-	if (!arb_period)
+	if (w->unbound_duration || !arb_period)
 		return;
 
 	gem_set_domain(fd, w->bb_handle,
@@ -801,6 +813,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	const uint32_t bbe = 0xa << 23;
 	unsigned long mmap_start, mmap_len;
 	unsigned long batch_start = w->bb_sz;
+	unsigned int r = 0;
 	uint32_t *ptr, *cs;
 
 	igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
@@ -811,6 +824,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	if (flags & RT)
 		batch_start -= 12 * sizeof(uint32_t);
 
+	if (w->unbound_duration)
+		batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
+
 	mmap_start = rounddown(batch_start, PAGE_SIZE);
 	mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
 
@@ -820,8 +836,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
 	cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
 
+	if (w->unbound_duration) {
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
+		batch_start += 4 * sizeof(uint32_t);
+
+		*cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
+		w->recursive_bb_start = cs;
+		*cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
+		*cs++ = 0;
+		*cs++ = 0;
+	}
+
 	if (flags & SEQNO) {
-		w->reloc[0].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -833,7 +860,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	if (flags & RT) {
-		w->reloc[1].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -843,7 +870,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		w->rt0_value = cs;
 		*cs++ = 0;
 
-		w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
@@ -852,7 +879,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		*cs++ = 0;
 		*cs++ = 0;
 
-		w->reloc[3].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -984,19 +1011,28 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		}
 	}
 
-	w->bb_sz = get_bb_sz(w->duration.max);
-	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
+	if (w->unbound_duration)
+		/* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
+		w->bb_sz = max(64, get_bb_sz(w->preempt_us)) +
+			   (1 + 3) * sizeof(uint32_t);
+	else
+		w->bb_sz = get_bb_sz(w->duration.max);
+	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
 	init_bb(w, flags);
 	terminate_bb(w, flags);
 
-	if (flags & SEQNO) {
+	if ((flags & SEQNO) || w->unbound_duration) {
 		w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
+		if (flags & SEQNO)
+			w->obj[j].relocation_count++;
 		if (flags & RT)
-			w->obj[j].relocation_count = 4;
-		else
-			w->obj[j].relocation_count = 1;
+			w->obj[j].relocation_count += 3;
+		if (w->unbound_duration)
+			w->obj[j].relocation_count++;
 		for (i = 0; i < w->obj[j].relocation_count; i++)
 			w->reloc[i].target_handle = 1;
+		if (w->unbound_duration)
+			w->reloc[0].target_handle = j;
 	}
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
@@ -2036,6 +2072,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
 	}
 }
 
+static void
+update_bb_start(struct w_step *w)
+{
+	if (!w->unbound_duration)
+		return;
+
+	gem_set_domain(fd, w->bb_handle,
+		       I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
+
+	*w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
+}
+
 static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
 {
 	if (target < 0)
@@ -2171,9 +2219,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	if (flags & RT)
 		update_bb_rt(w, engine, seqno);
 
+	update_bb_start(w);
+
 	w->eb.batch_start_offset =
+		w->unbound_duration ?
+		0 :
 		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
-			2 * sizeof(uint32_t));
+		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
 		int tgt = w->idx + w->fence_deps.list[i];
@@ -2313,6 +2365,17 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
+			} else if (w->type == TERMINATE) {
+				unsigned int t_idx = i + w->target;
+
+				igt_assert(t_idx >= 0 && t_idx < i);
+				igt_assert(wrk->steps[t_idx].type == BATCH);
+				igt_assert(wrk->steps[t_idx].unbound_duration);
+
+				*wrk->steps[t_idx].recursive_bb_start =
+					MI_BATCH_BUFFER_END;
+				__sync_synchronize();
+				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
 				   w->type == LOAD_BALANCE ||
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 6aec718bc812..c94d01018419 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -2,11 +2,11 @@ Workload descriptor format
 ==========================
 
 ctx.engine.duration_us.dependency.wait,...
-<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
-d|p|s|t|q|a.<int>,...
+d|p|s|t|q|a|T.<int>,...
 b.<uint>.<uint>.<str>
 f
 
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
 Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
@@ -77,6 +78,10 @@ Example:
 
 I this case the last step has a data dependency on both first and second steps.
 
+Batch durations can also be specified as infinite by using '*' in the
+duration field. Such batches must be ended by the terminate command ('T'),
+otherwise they will cause a GPU hang to be reported.
+
 Sync (fd) fences
 ----------------
 
diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
index cfbfcd39be7d..ea89da3add48 100644
--- a/benchmarks/wsim/frame-split-60fps.wsim
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -6,10 +6,12 @@ M.2.VCS2
 B.2
 b.2.1.VCS1
 f
-1.DEFAULT.4000-6000.f-1.0
+1.DEFAULT.*.f-1.0
 2.DEFAULT.4000-6000.s-1.0
 a.-3
-3.RCS.2000-4000.-3/-2.0
+s.-2
+T.-4
+3.RCS.2000-4000.-5/-4.0
 3.VECS.2000.-1.0
 4.BCS.1000.-1.0
 s.-2
-- 
2.19.1
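
The two rules this patch introduces, '*' as a duration and a strictly negative relative target for 'T', can be sketched in isolation as below. This is a simplified illustration, not the patch code: step_duration, parse_duration and terminate_target are hypothetical names, and unlike the patch the sketch accepts only an exact "*" rather than any field starting with '*'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical duration record; unbound mirrors the '*' syntax. */
struct step_duration {
	bool unbound; /* true for an infinite batch */
	long min_us;
	long max_us;
};

/* Parse "*", "<uint>" or "<uint>-<uint>". Returns 0 on success, else -1. */
static int parse_duration(const char *field, struct step_duration *d)
{
	char *end;
	long val;

	d->unbound = false;
	d->min_us = d->max_us = 0;

	if (field[0] == '*' && field[1] == '\0') {
		d->unbound = true;
		return 0;
	}

	val = strtol(field, &end, 10);
	if (val <= 0)
		return -1;
	d->min_us = d->max_us = val;

	if (*end == '-') {
		val = strtol(end + 1, &end, 10);
		if (val <= d->min_us)
			return -1;
		d->max_us = val;
	}

	return *end ? -1 : 0;
}

/*
 * Resolve the relative target of a 'T' step located at step_idx.
 * The target must be an earlier step and that step must be an infinite
 * batch. Returns the absolute step index, or -1 if the target is invalid.
 */
static int terminate_target(unsigned int step_idx, int rel,
			    const bool *unbound)
{
	int target = (int)step_idx + rel;

	if (rel >= 0 || target < 0)
		return -1;

	return unbound[target] ? target : -1;
}
```

Counting steps from zero in the updated frame-split-60fps.wsim, the infinite batch "1.DEFAULT.*.f-1.0" is step 8 and "T.-4" is step 12, so terminate_target(12, -4, unbound) resolves to 8.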


* [igt-dev] [PATCH i-g-t 17/21] gem_wsim: Infinite batch support
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

For simulating frame split workloads it is useful to express a batch which
ends at the same time as the parallel submission on the respective bonded
engine. For this we add support for infinite batch durations and the batch
terminate command ('T'). Syntax looks like this:

  1.RCS.*.0.0
  T.-1

The first step starts an infinite batch, and the second step terminates it
using the usual relative workload step addressing.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c                  | 119 +++++++++++++++++++------
 benchmarks/wsim/README                 |   9 +-
 benchmarks/wsim/frame-split-60fps.wsim |   6 +-
 3 files changed, 102 insertions(+), 32 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index cc6f4a742c12..97821b723b02 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -86,6 +86,7 @@ enum w_type
 	ENGINE_MAP,
 	LOAD_BALANCE,
 	BOND,
+	TERMINATE,
 };
 
 struct deps
@@ -113,6 +114,7 @@ struct w_step
 	unsigned int context;
 	unsigned int engine;
 	struct duration duration;
+	bool unbound_duration;
 	struct deps data_deps;
 	struct deps fence_deps;
 	int emit_fence;
@@ -143,7 +145,7 @@ struct w_step
 
 	struct drm_i915_gem_execbuffer2 eb;
 	struct drm_i915_gem_exec_object2 *obj;
-	struct drm_i915_gem_relocation_entry reloc[4];
+	struct drm_i915_gem_relocation_entry reloc[5];
 	unsigned long bb_sz;
 	uint32_t bb_handle;
 	uint32_t *seqno_value;
@@ -153,6 +155,7 @@ struct w_step
 	uint32_t *rt1_address;
 	uint32_t *latch_value;
 	uint32_t *latch_address;
+	uint32_t *recursive_bb_start;
 };
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
@@ -491,6 +494,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = ENGINE_MAP;
 				goto add_step;
+			} else if (!strcmp(field, "T")) {
+				int_field(TERMINATE, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid terminate target at step %u!\n");
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -605,23 +612,28 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 			fstart = NULL;
 
-			tmpl = strtol(field, &sep, 10);
-			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
-				  tmpl == LONG_MAX,
-				  "Invalid duration at step %u!\n", nr_steps);
-			step.duration.min = tmpl;
-
-			if (sep && *sep == '-') {
-				tmpl = strtol(sep + 1, NULL, 10);
-				check_arg(tmpl <= 0 ||
-					  tmpl <= step.duration.min ||
-					  tmpl == LONG_MIN ||
+			if (field[0] == '*') {
+				step.unbound_duration = true;
+			} else {
+				tmpl = strtol(field, &sep, 10);
+				check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
 					  tmpl == LONG_MAX,
-					  "Invalid duration range at step %u!\n",
+					  "Invalid duration at step %u!\n",
 					  nr_steps);
-				step.duration.max = tmpl;
-			} else {
-				step.duration.max = step.duration.min;
+				step.duration.min = tmpl;
+
+				if (sep && *sep == '-') {
+					tmpl = strtol(sep + 1, NULL, 10);
+					check_arg(tmpl <= 0 ||
+						tmpl <= step.duration.min ||
+						tmpl == LONG_MIN ||
+						tmpl == LONG_MAX,
+						"Invalid duration range at step %u!\n",
+						nr_steps);
+					step.duration.max = tmpl;
+				} else {
+					step.duration.max = step.duration.min;
+				}
 			}
 
 			valid++;
@@ -781,7 +793,7 @@ init_bb(struct w_step *w, unsigned int flags)
 	unsigned int i;
 	uint32_t *ptr;
 
-	if (!arb_period)
+	if (w->unbound_duration || !arb_period)
 		return;
 
 	gem_set_domain(fd, w->bb_handle,
@@ -801,6 +813,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	const uint32_t bbe = 0xa << 23;
 	unsigned long mmap_start, mmap_len;
 	unsigned long batch_start = w->bb_sz;
+	unsigned int r = 0;
 	uint32_t *ptr, *cs;
 
 	igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
@@ -811,6 +824,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	if (flags & RT)
 		batch_start -= 12 * sizeof(uint32_t);
 
+	if (w->unbound_duration)
+		batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
+
 	mmap_start = rounddown(batch_start, PAGE_SIZE);
 	mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
 
@@ -820,8 +836,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
 	cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
 
+	if (w->unbound_duration) {
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
+		batch_start += 4 * sizeof(uint32_t);
+
+		*cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
+		w->recursive_bb_start = cs;
+		*cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
+		*cs++ = 0;
+		*cs++ = 0;
+	}
+
 	if (flags & SEQNO) {
-		w->reloc[0].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -833,7 +860,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	if (flags & RT) {
-		w->reloc[1].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -843,7 +870,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		w->rt0_value = cs;
 		*cs++ = 0;
 
-		w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
@@ -852,7 +879,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		*cs++ = 0;
 		*cs++ = 0;
 
-		w->reloc[3].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -984,19 +1011,28 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		}
 	}
 
-	w->bb_sz = get_bb_sz(w->duration.max);
-	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
+	if (w->unbound_duration)
+		/* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
+		w->bb_sz = max(64, get_bb_sz(w->preempt_us)) +
+			   (1 + 3) * sizeof(uint32_t);
+	else
+		w->bb_sz = get_bb_sz(w->duration.max);
+	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
 	init_bb(w, flags);
 	terminate_bb(w, flags);
 
-	if (flags & SEQNO) {
+	if ((flags & SEQNO) || w->unbound_duration) {
 		w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
+		if (flags & SEQNO)
+			w->obj[j].relocation_count++;
 		if (flags & RT)
-			w->obj[j].relocation_count = 4;
-		else
-			w->obj[j].relocation_count = 1;
+			w->obj[j].relocation_count += 3;
+		if (w->unbound_duration)
+			w->obj[j].relocation_count++;
 		for (i = 0; i < w->obj[j].relocation_count; i++)
 			w->reloc[i].target_handle = 1;
+		if (w->unbound_duration)
+			w->reloc[0].target_handle = j;
 	}
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
@@ -2036,6 +2072,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
 	}
 }
 
+static void
+update_bb_start(struct w_step *w)
+{
+	if (!w->unbound_duration)
+		return;
+
+	gem_set_domain(fd, w->bb_handle,
+		       I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
+
+	*w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
+}
+
 static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
 {
 	if (target < 0)
@@ -2171,9 +2219,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	if (flags & RT)
 		update_bb_rt(w, engine, seqno);
 
+	update_bb_start(w);
+
 	w->eb.batch_start_offset =
+		w->unbound_duration ?
+		0 :
 		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
-			2 * sizeof(uint32_t));
+		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
 		int tgt = w->idx + w->fence_deps.list[i];
@@ -2313,6 +2365,17 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
+			} else if (w->type == TERMINATE) {
+				unsigned int t_idx = i + w->target;
+
+				igt_assert(t_idx >= 0 && t_idx < i);
+				igt_assert(wrk->steps[t_idx].type == BATCH);
+				igt_assert(wrk->steps[t_idx].unbound_duration);
+
+				*wrk->steps[t_idx].recursive_bb_start =
+					MI_BATCH_BUFFER_END;
+				__sync_synchronize();
+				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
 				   w->type == LOAD_BALANCE ||
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 6aec718bc812..c94d01018419 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -2,11 +2,11 @@ Workload descriptor format
 ==========================
 
 ctx.engine.duration_us.dependency.wait,...
-<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
-d|p|s|t|q|a.<int>,...
+d|p|s|t|q|a|T.<int>,...
 b.<uint>.<uint>.<str>
 f
 
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
 Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
@@ -77,6 +78,10 @@ Example:
 
 I this case the last step has a data dependency on both first and second steps.
 
+Batch durations can also be specified as infinite by using '*' in the
+duration field. Such batches must be ended by the terminate command ('T'),
+otherwise they will cause a GPU hang to be reported.
+
 Sync (fd) fences
 ----------------
 
diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
index cfbfcd39be7d..ea89da3add48 100644
--- a/benchmarks/wsim/frame-split-60fps.wsim
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -6,10 +6,12 @@ M.2.VCS2
 B.2
 b.2.1.VCS1
 f
-1.DEFAULT.4000-6000.f-1.0
+1.DEFAULT.*.f-1.0
 2.DEFAULT.4000-6000.s-1.0
 a.-3
-3.RCS.2000-4000.-3/-2.0
+s.-2
+T.-4
+3.RCS.2000-4000.-5/-4.0
 3.VECS.2000.-1.0
 4.BCS.1000.-1.0
 s.-2
-- 
2.19.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [PATCH i-g-t 18/21] gem_wsim: Command line switch for specifying low slice count workloads
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new command line switch ('-s') is added which toggles the low slice
count mode for workloads following on the command line.

This enables easy benchmarking of the effect of running the existing media
workloads in parallel against another client. For example:

  ./gem_wsim -n ... -v -r 600 -W master.wsim -s -w media_nn480.wsim

Adding or removing the '-s' switch before the second workload enables
analyzing the cost that dynamic SSEU switching imposes on the first
(master) workload.
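
The toggle semantics can be sketched as below. This is only an
illustration: it walks a plain token array instead of using getopt(), and
struct w_arg_sketch and collect_workloads() are hypothetical names. Only
the SSEU bit and the XOR toggle mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SSEU	(1u << 11)	/* same bit as in the patch */

/* Hypothetical, reduced version of struct w_arg. */
struct w_arg_sketch {
	const char *filename;
	bool sseu;
};

/*
 * Walk the tokens in order. Every "-s" flips the SSEU flag, and every
 * "-w" or "-W" records the following token as a workload together with
 * the flag state at that point. Returns the number of workloads stored.
 */
static size_t collect_workloads(size_t argc, const char *const argv[],
				struct w_arg_sketch *out, size_t max)
{
	unsigned int flags = 0;
	size_t n = 0;

	assert(argv && out);

	for (size_t i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "-s")) {
			flags ^= SSEU;
		} else if ((!strcmp(argv[i], "-w") ||
			    !strcmp(argv[i], "-W")) &&
			   i + 1 < argc && n < max) {
			out[n].filename = argv[++i];
			out[n].sseu = flags & SSEU;
			n++;
		}
	}

	return n;
}
```

For "-W master.wsim -s -w media_nn480.wsim" the master workload is
recorded with sseu false and the second workload with sseu true. Any
further workloads keep the low slice count mode until another '-s'.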

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 44 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 97821b723b02..64dd251a25eb 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -100,6 +100,7 @@ struct w_arg {
 	char *filename;
 	char *desc;
 	int prio;
+	bool sseu;
 };
 
 struct bond {
@@ -179,6 +180,7 @@ struct workload
 	unsigned int nr_steps;
 	struct w_step *steps;
 	int prio;
+	bool sseu;
 
 	pthread_t thread;
 	bool run;
@@ -251,6 +253,7 @@ static int fd;
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
 #define I915		(1<<10)
+#define SSEU		(1<<11)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -696,6 +699,7 @@ add_step:
 	wrk->nr_steps = nr_steps;
 	wrk->steps = steps;
 	wrk->prio = arg->prio;
+	wrk->sseu = arg->sseu;
 
 	free(desc);
 
@@ -741,6 +745,7 @@ clone_workload(struct workload *_wrk)
 	memset(wrk, 0, sizeof(*wrk));
 
 	wrk->prio = _wrk->prio;
+	wrk->sseu = _wrk->sseu;
 	wrk->nr_steps = _wrk->nr_steps;
 	wrk->steps = calloc(wrk->nr_steps, sizeof(struct w_step));
 	igt_assert(wrk->steps);
@@ -1066,6 +1071,26 @@ static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
 		gem_context_set_param(fd, &param);
 }
 
+static void
+set_ctx_sseu(uint32_t ctx)
+{
+	struct drm_i915_gem_context_param_sseu sseu = { };
+	struct drm_i915_gem_context_param param = { };
+
+	sseu.class = I915_ENGINE_CLASS_RENDER;
+	sseu.instance = 0;
+
+	param.ctx_id = ctx;
+	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.value = (uintptr_t)&sseu;
+
+	gem_context_get_param(fd, &param);
+
+	sseu.slice_mask = 1;
+
+	gem_context_set_param(fd, &param);
+}
+
 static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
@@ -1413,6 +1438,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 			gem_context_set_param(fd, &param);
 		}
+
+		if (wrk->sseu)
+			set_ctx_sseu(arg.ctx_id);
 	}
 
 	/* Record default preemption. */
@@ -2585,6 +2613,8 @@ static void print_help(void)
 "  -R              Round-robin initial VCS assignment per client.\n"
 "  -H              Send heartbeat on synchronisation points with seqno based\n"
 "                  balancers. Gives better engine busyness view in some cases.\n"
+"  -s              Turn on small SSEU config for the next workload on the\n"
+"                  command line. Subsequent -s switches it off.\n"
 "  -S              Synchronize the sequence of random batch durations between\n"
 "                  clients.\n"
 "  -G              Global load balancing - a single load balancer will be shared\n"
@@ -2627,11 +2657,12 @@ static char *load_workload_descriptor(char *filename)
 }
 
 static struct w_arg *
-add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg, int prio)
+add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg,
+		 int prio, bool sseu)
 {
 	w_args = realloc(w_args, sizeof(*w_args) * nr_args);
 	igt_assert(w_args);
-	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio };
+	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio, sseu };
 
 	return w_args;
 }
@@ -2724,7 +2755,8 @@ int main(int argc, char **argv)
 
 	init_clocks();
 
-	while ((c = getopt(argc, argv, "hqv2RSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
+	while ((c = getopt(argc, argv,
+			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
@@ -2734,7 +2766,8 @@ int main(int argc, char **argv)
 			master_workload = nr_w_args;
 			/* Fall through */
 		case 'w':
-			w_args = add_workload_arg(w_args, ++nr_w_args, optarg, prio);
+			w_args = add_workload_arg(w_args, ++nr_w_args, optarg,
+						  prio, flags & SSEU);
 			break;
 		case 'p':
 			prio = atoi(optarg);
@@ -2776,6 +2809,9 @@ int main(int argc, char **argv)
 		case 'S':
 			flags |= SYNCEDCLIENTS;
 			break;
+		case 's':
+			flags ^= SSEU;
+			break;
 		case 'H':
 			flags |= HEARTBEAT;
 			break;
-- 
2.19.1


* [igt-dev] [PATCH i-g-t 18/21] gem_wsim: Command line switch for specifying low slice count workloads
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new command line switch ('-s') is added which toggles the low slice
count mode for workloads following on the command line.

This enables easy benchmarking of the effect of running the existing media
workloads in parallel against another client. For example:

  ./gem_wsim -n ... -v -r 600 -W master.wsim -s -w media_nn480.wsim

Adding or removing the '-s' switch before the second workload enables
analyzing the cost that dynamic SSEU switching imposes on the first
(master) workload.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 44 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 97821b723b02..64dd251a25eb 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -100,6 +100,7 @@ struct w_arg {
 	char *filename;
 	char *desc;
 	int prio;
+	bool sseu;
 };
 
 struct bond {
@@ -179,6 +180,7 @@ struct workload
 	unsigned int nr_steps;
 	struct w_step *steps;
 	int prio;
+	bool sseu;
 
 	pthread_t thread;
 	bool run;
@@ -251,6 +253,7 @@ static int fd;
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
 #define I915		(1<<10)
+#define SSEU		(1<<11)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -696,6 +699,7 @@ add_step:
 	wrk->nr_steps = nr_steps;
 	wrk->steps = steps;
 	wrk->prio = arg->prio;
+	wrk->sseu = arg->sseu;
 
 	free(desc);
 
@@ -741,6 +745,7 @@ clone_workload(struct workload *_wrk)
 	memset(wrk, 0, sizeof(*wrk));
 
 	wrk->prio = _wrk->prio;
+	wrk->sseu = _wrk->sseu;
 	wrk->nr_steps = _wrk->nr_steps;
 	wrk->steps = calloc(wrk->nr_steps, sizeof(struct w_step));
 	igt_assert(wrk->steps);
@@ -1066,6 +1071,26 @@ static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
 		gem_context_set_param(fd, &param);
 }
 
+static void
+set_ctx_sseu(uint32_t ctx)
+{
+	struct drm_i915_gem_context_param_sseu sseu = { };
+	struct drm_i915_gem_context_param param = { };
+
+	sseu.class = I915_ENGINE_CLASS_RENDER;
+	sseu.instance = 0;
+
+	param.ctx_id = ctx;
+	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.value = (uintptr_t)&sseu;
+
+	gem_context_get_param(fd, &param);
+
+	sseu.slice_mask = 1;
+
+	gem_context_set_param(fd, &param);
+}
+
 static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
@@ -1413,6 +1438,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 			gem_context_set_param(fd, &param);
 		}
+
+		if (wrk->sseu)
+			set_ctx_sseu(arg.ctx_id);
 	}
 
 	/* Record default preemption. */
@@ -2585,6 +2613,8 @@ static void print_help(void)
 "  -R              Round-robin initial VCS assignment per client.\n"
 "  -H              Send heartbeat on synchronisation points with seqno based\n"
 "                  balancers. Gives better engine busyness view in some cases.\n"
+"  -s              Turn on small SSEU config for the next workload on the\n"
+"                  command line. Subsequent -s switches it off.\n"
 "  -S              Synchronize the sequence of random batch durations between\n"
 "                  clients.\n"
 "  -G              Global load balancing - a single load balancer will be shared\n"
@@ -2627,11 +2657,12 @@ static char *load_workload_descriptor(char *filename)
 }
 
 static struct w_arg *
-add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg, int prio)
+add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg,
+		 int prio, bool sseu)
 {
 	w_args = realloc(w_args, sizeof(*w_args) * nr_args);
 	igt_assert(w_args);
-	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio };
+	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio, sseu };
 
 	return w_args;
 }
@@ -2724,7 +2755,8 @@ int main(int argc, char **argv)
 
 	init_clocks();
 
-	while ((c = getopt(argc, argv, "hqv2RSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
+	while ((c = getopt(argc, argv,
+			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
@@ -2734,7 +2766,8 @@ int main(int argc, char **argv)
 			master_workload = nr_w_args;
 			/* Fall through */
 		case 'w':
-			w_args = add_workload_arg(w_args, ++nr_w_args, optarg, prio);
+			w_args = add_workload_arg(w_args, ++nr_w_args, optarg,
+						  prio, flags & SSEU);
 			break;
 		case 'p':
 			prio = atoi(optarg);
@@ -2776,6 +2809,9 @@ int main(int argc, char **argv)
 		case 'S':
 			flags |= SYNCEDCLIENTS;
 			break;
+		case 's':
+			flags ^= SSEU;
+			break;
 		case 'H':
 			flags |= HEARTBEAT;
 			break;
-- 
2.19.1


* [PATCH i-g-t 19/21] gem_wsim: Per context SSEU control
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command ('S') is added which allows per context slice
(re-)configuration.
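
A minimal sketch of how an 'S' step could be interpreted follows. The
names (struct sseu_step, struct ctx_sketch, parse_sseu_step(),
resolve_slice_mask(), apply_sseu()) are hypothetical and the parsing uses
sscanf() rather than the tokenizer in gem_wsim. The set_calls counter
stands in for the context parameter ioctl. Note one deliberate
difference: the patch compares the unresolved step value against the
cached mask, while this sketch resolves -1 first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parsed form of "S.<ctx>.<mask>". */
struct sseu_step {
	unsigned int context;
	int mask;
};

/* Hypothetical per-context state; set_calls counts simulated updates. */
struct ctx_sketch {
	uint64_t sseu;
	unsigned int set_calls;
};

/* Returns 0 on success, -1 if the step is not exactly "S.<ctx>.<mask>". */
static int parse_sseu_step(const char *s, struct sseu_step *out)
{
	int ctx, mask;
	char tail;

	assert(s && out);

	/* Exactly two integers must be converted, with nothing after. */
	if (sscanf(s, "S.%d.%d%c", &ctx, &mask, &tail) != 2)
		return -1;
	if (ctx <= 0)
		return -1;

	out->context = ctx;
	out->mask = mask;

	return 0;
}

/* A requested mask of -1 means "all slices the device reports". */
static uint64_t resolve_slice_mask(int requested, uint64_t device_mask)
{
	return requested == -1 ? device_mask : (uint64_t)requested;
}

/* Reconfigure only when the resolved mask differs from the cached one. */
static void apply_sseu(struct ctx_sketch *ctx, int requested,
		       uint64_t device_mask)
{
	uint64_t mask = resolve_slice_mask(requested, device_mask);

	if (mask == ctx->sseu)
		return;

	ctx->sseu = mask;
	ctx->set_calls++;
}
```

Caching the last applied mask per context avoids repeating the update
when a workload loop issues the same 'S' step on every iteration.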

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 69 +++++++++++++++++++++++++++++++++++-------
 benchmarks/wsim/README | 23 +++++++++++++-
 2 files changed, 80 insertions(+), 12 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 64dd251a25eb..ed5acee02e20 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 	LOAD_BALANCE,
 	BOND,
 	TERMINATE,
+	SSEU
 };
 
 struct deps
@@ -136,6 +137,7 @@ struct w_step
 			uint64_t bond_mask;
 			enum intel_engine_id bond_master;
 		};
+		int sseu;
 	};
 
 	/* Implementation details */
@@ -171,6 +173,7 @@ struct ctx {
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
+	uint64_t sseu;
 };
 
 struct workload
@@ -241,6 +244,7 @@ static unsigned int context_vcs_rr;
 
 static int verbose = 1;
 static int fd;
+static struct drm_i915_gem_context_param_sseu device_sseu;
 
 #define SWAPVCS		(1<<0)
 #define SEQNO		(1<<1)
@@ -456,6 +460,27 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				int_field(SYNC, target,
 					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
 					  "Invalid sync target at step %u!\n");
+			} else if (!strcmp(field, "S")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(tmp <= 0 && nr == 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid SSEU format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
+						step.context = tmp;
+					else if (nr == 1)
+						step.sseu = tmp;
+
+					nr++;
+				}
+
+				step.type = SSEU;
+				goto add_step;
 			} else if (!strcmp(field, "t")) {
 				int_field(THROTTLE, throttle,
 					  tmp < 0,
@@ -1071,24 +1096,24 @@ static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
 		gem_context_set_param(fd, &param);
 }
 
-static void
-set_ctx_sseu(uint32_t ctx)
+static uint64_t
+set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
 {
-	struct drm_i915_gem_context_param_sseu sseu = { };
+	struct drm_i915_gem_context_param_sseu sseu = device_sseu;
 	struct drm_i915_gem_context_param param = { };
 
-	sseu.class = I915_ENGINE_CLASS_RENDER;
-	sseu.instance = 0;
+	if (slice_mask == -1)
+		slice_mask = device_sseu.slice_mask;
+
+	sseu.slice_mask = slice_mask;
 
 	param.ctx_id = ctx;
 	param.param = I915_CONTEXT_PARAM_SSEU;
 	param.value = (uintptr_t)&sseu;
 
-	gem_context_get_param(fd, &param);
-
-	sseu.slice_mask = 1;
-
 	gem_context_set_param(fd, &param);
+
+	return slice_mask;
 }
 
 static int
@@ -1287,6 +1312,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		igt_assert(ctx_id);
 		ctx->id = ctx_id;
+		ctx->sseu = device_sseu.slice_mask;
 
 		if (flags & GLOBAL_BALANCE) {
 			ctx->static_vcs = context_vcs_rr;
@@ -1439,8 +1465,10 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
-		if (wrk->sseu)
-			set_ctx_sseu(arg.ctx_id);
+		if (wrk->sseu) {
+			/* Set to slice 0 only, one slice. */
+			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+		}
 	}
 
 	/* Record default preemption. */
@@ -2409,6 +2437,13 @@ static void *run_workload(void *data)
 				   w->type == LOAD_BALANCE ||
 				   w->type == BOND) {
 				continue;
+			} else if (w->type == SSEU) {
+				if (w->sseu != wrk->ctx_list[w->context].sseu) {
+					wrk->ctx_list[w->context].sseu =
+						set_ctx_sseu(wrk->ctx_list[w->context].id,
+							     w->sseu);
+				}
+				continue;
 			}
 
 			if (do_sleep || w->type == PERIOD) {
@@ -2725,6 +2760,16 @@ static void init_clocks(void)
 	       rcs_end - rcs_start, 1e6*t, 1024e6 * t / (rcs_end - rcs_start));
 }
 
+static void get_device_sseu(void)
+{
+	struct drm_i915_gem_context_param param = { };
+
+	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.value = (uintptr_t)&device_sseu;
+
+	gem_context_get_param(fd, &param);
+}
+
 int main(int argc, char **argv)
 {
 	unsigned int repeat = 1;
@@ -2753,6 +2798,8 @@ int main(int argc, char **argv)
 	fd = __drm_open_driver(DRIVER_INTEL);
 	igt_require(fd);
 
+	get_device_sseu();
+
 	init_clocks();
 
 	while ((c = getopt(argc, argv,
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index c94d01018419..d7c255b9527c 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -5,7 +5,7 @@ ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
-P|X.<uint>.<int>
+P|S|X.<uint>.<int>
 d|p|s|t|q|a|T.<int>,...
 b.<uint>.<uint>.<str>
 f
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'S' - Context SSEU configuration.
  'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
@@ -249,3 +250,23 @@ then look like:
   1.DEFAULT.1000.f-1.0
   2.DEFAULT.1000.s-1.0
   a.-3
+
+Context SSEU configuration
+--------------------------
+
+  S.1.1
+  1.RCS.1000.0.0
+  S.2.-1
+  2.RCS.1000.0.0
+
+Context 1 is configured to run with one enabled slice (slice mask 1) and a batch
+is submitted against it. Context 2 is configured to run with all slices (this is
+the default, so the command could also be omitted) and a batch is submitted
+against it.
+
+This shows the dynamic SSEU reconfiguration cost between two contexts competing
+for the render engine.
+
+Slice mask of -1 has a special meaning of "all slices". Otherwise any integer
+can be specified as the slice mask, but beware that any value apart from 1 and
+-1 can make the workload not portable between different GPUs.
-- 
2.19.1


* [igt-dev] [PATCH i-g-t 19/21] gem_wsim: Per context SSEU control
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command ('S') is added which allows per context slice
(re-)configuration.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 69 +++++++++++++++++++++++++++++++++++-------
 benchmarks/wsim/README | 23 +++++++++++++-
 2 files changed, 80 insertions(+), 12 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 64dd251a25eb..ed5acee02e20 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 	LOAD_BALANCE,
 	BOND,
 	TERMINATE,
+	SSEU
 };
 
 struct deps
@@ -136,6 +137,7 @@ struct w_step
 			uint64_t bond_mask;
 			enum intel_engine_id bond_master;
 		};
+		int sseu;
 	};
 
 	/* Implementation details */
@@ -171,6 +173,7 @@ struct ctx {
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
+	uint64_t sseu;
 };
 
 struct workload
@@ -241,6 +244,7 @@ static unsigned int context_vcs_rr;
 
 static int verbose = 1;
 static int fd;
+static struct drm_i915_gem_context_param_sseu device_sseu;
 
 #define SWAPVCS		(1<<0)
 #define SEQNO		(1<<1)
@@ -456,6 +460,27 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				int_field(SYNC, target,
 					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
 					  "Invalid sync target at step %u!\n");
+			} else if (!strcmp(field, "S")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(tmp <= 0 && nr == 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid SSEU format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
+						step.context = tmp;
+					else if (nr == 1)
+						step.sseu = tmp;
+
+					nr++;
+				}
+
+				step.type = SSEU;
+				goto add_step;
 			} else if (!strcmp(field, "t")) {
 				int_field(THROTTLE, throttle,
 					  tmp < 0,
@@ -1071,24 +1096,24 @@ static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
 		gem_context_set_param(fd, &param);
 }
 
-static void
-set_ctx_sseu(uint32_t ctx)
+static uint64_t
+set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
 {
-	struct drm_i915_gem_context_param_sseu sseu = { };
+	struct drm_i915_gem_context_param_sseu sseu = device_sseu;
 	struct drm_i915_gem_context_param param = { };
 
-	sseu.class = I915_ENGINE_CLASS_RENDER;
-	sseu.instance = 0;
+	if (slice_mask == -1)
+		slice_mask = device_sseu.slice_mask;
+
+	sseu.slice_mask = slice_mask;
 
 	param.ctx_id = ctx;
 	param.param = I915_CONTEXT_PARAM_SSEU;
 	param.value = (uintptr_t)&sseu;
 
-	gem_context_get_param(fd, &param);
-
-	sseu.slice_mask = 1;
-
 	gem_context_set_param(fd, &param);
+
+	return slice_mask;
 }
 
 static int
@@ -1287,6 +1312,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		igt_assert(ctx_id);
 		ctx->id = ctx_id;
+		ctx->sseu = device_sseu.slice_mask;
 
 		if (flags & GLOBAL_BALANCE) {
 			ctx->static_vcs = context_vcs_rr;
@@ -1439,8 +1465,10 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
-		if (wrk->sseu)
-			set_ctx_sseu(arg.ctx_id);
+		if (wrk->sseu) {
+			/* Set to slice 0 only, one slice. */
+			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+		}
 	}
 
 	/* Record default preemption. */
@@ -2409,6 +2437,13 @@ static void *run_workload(void *data)
 				   w->type == LOAD_BALANCE ||
 				   w->type == BOND) {
 				continue;
+			} else if (w->type == SSEU) {
+				if (w->sseu != wrk->ctx_list[w->context].sseu) {
+					wrk->ctx_list[w->context].sseu =
+						set_ctx_sseu(wrk->ctx_list[w->context].id,
+							     w->sseu);
+				}
+				continue;
 			}
 
 			if (do_sleep || w->type == PERIOD) {
@@ -2725,6 +2760,16 @@ static void init_clocks(void)
 	       rcs_end - rcs_start, 1e6*t, 1024e6 * t / (rcs_end - rcs_start));
 }
 
+static void get_device_sseu(void)
+{
+	struct drm_i915_gem_context_param param = { };
+
+	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.value = (uintptr_t)&device_sseu;
+
+	gem_context_get_param(fd, &param);
+}
+
 int main(int argc, char **argv)
 {
 	unsigned int repeat = 1;
@@ -2753,6 +2798,8 @@ int main(int argc, char **argv)
 	fd = __drm_open_driver(DRIVER_INTEL);
 	igt_require(fd);
 
+	get_device_sseu();
+
 	init_clocks();
 
 	while ((c = getopt(argc, argv,
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index c94d01018419..d7c255b9527c 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -5,7 +5,7 @@ ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
-P|X.<uint>.<int>
+P|S|X.<uint>.<int>
 d|p|s|t|q|a|T.<int>,...
 b.<uint>.<uint>.<str>
 f
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'S' - Context SSEU configuration.
  'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
@@ -249,3 +250,23 @@ then look like:
   1.DEFAULT.1000.f-1.0
   2.DEFAULT.1000.s-1.0
   a.-3
+
+Context SSEU configuration
+--------------------------
+
+  S.1.1
+  1.RCS.1000.0.0
+  S.2.-1
+  2.RCS.1000.0.0
+
+Context 1 is configured to run with one enabled slice (slice mask 1) and a batch
+is submitted against it. Context 2 is configured to run with all slices (this is
+the default, so the command could also be omitted) and a batch is submitted
+against it.
+
+This shows the dynamic SSEU reconfiguration cost between two contexts competing
+for the render engine.
+
+Slice mask of -1 has a special meaning of "all slices". Otherwise any integer
+can be specified as the slice mask, but beware that any value apart from 1 and
+-1 can make the workload not portable between different GPUs.
-- 
2.19.1
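
Note on the 'S' step documented in the README hunk above: the following is a
minimal sketch of how such a step could be parsed and how the special -1 mask
could be resolved. Only the "S.<ctx>.<mask>" format and the -1 convention come
from the README text; the names sseu_step, parse_sseu_step and
resolve_slice_mask are hypothetical and are not taken from gem_wsim.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parsed form of an "S.<ctx>.<mask>" workload step. */
struct sseu_step {
	unsigned int ctx;
	int mask;
};

/* Parse "S.<uint>.<int>"; returns 0 on success, -1 on malformed input. */
static int parse_sseu_step(const char *str, struct sseu_step *step)
{
	char tail;

	/* Exactly two conversions must succeed; a third means trailing junk. */
	if (sscanf(str, "S.%u.%d%c", &step->ctx, &step->mask, &tail) != 2)
		return -1;

	return 0;
}

/*
 * A mask of -1 selects every slice the device reports. Any other value is
 * passed through unchanged, which is why such workloads may not be portable.
 */
static uint64_t resolve_slice_mask(int mask, uint64_t device_slice_mask)
{
	if (mask == -1)
		return device_slice_mask;

	return (uint64_t)mask;
}
```

With a device reporting slice mask 0x7, "S.2.-1" would resolve to 0x7 while
"S.1.1" would resolve to 0x1.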

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH i-g-t 20/21] gem_wsim: Allow RCS virtual engine with SSEU control
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

To allow exercising the SSEU configuration in combination with Virtual
Engine, allow RCS to be specified in the engine map and use the appropriate
index-based addressing when applying the SSEU configuration to it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 51 ++++++++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 15 deletions(-)
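
Note for readers following the index-based addressing: as far as the hunks in
this patch show, slot 0 of the engine array holds an invalid class/instance
placeholder (the virtual engine, which the SSEU call then addresses by index
0), and engine map entry j lands in slot j + 1. The sketch below reproduces
only that layout. All names are hypothetical, the enum is a stand-in for the
gem_wsim engine ids, and the numeric class values are assumed to mirror the
i915 uapi rather than being included from it.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; values assumed to mirror the i915 uapi constants. */
#define CLASS_RENDER	0
#define CLASS_VIDEO	2
#define CLASS_INVALID	0xffff
#define INSTANCE_NONE	0xffff

/* Hypothetical engine ids; only ENG_VCS2 == ENG_VCS1 + 1 matters here. */
enum wsim_engine { ENG_RCS, ENG_VCS1, ENG_VCS2 };

struct class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Translate one engine map entry; returns 0 on success, -1 if unsupported. */
static int to_class_instance(enum wsim_engine e, struct class_instance *ci)
{
	switch (e) {
	case ENG_RCS:
		ci->engine_class = CLASS_RENDER;
		ci->engine_instance = 0;
		return 0;
	case ENG_VCS1:
	case ENG_VCS2:
		ci->engine_class = CLASS_VIDEO;
		ci->engine_instance = (uint16_t)(e - ENG_VCS1);
		return 0;
	default:
		return -1;
	}
}

/*
 * Slot 0 is the placeholder later addressed by index; map entry j lands in
 * slot j + 1. 'out' must have room for count + 1 entries. Returns the number
 * of slots filled, or -1 if the map contains an unsupported engine.
 */
static int build_engine_slots(const enum wsim_engine *map, size_t count,
			      struct class_instance *out)
{
	size_t j;

	out[0].engine_class = CLASS_INVALID;
	out[0].engine_instance = INSTANCE_NONE;

	for (j = 1; j <= count; j++) {
		if (to_class_instance(map[j - 1], &out[j]))
			return -1;
	}

	return (int)(count + 1);
}
```

Keeping the translation in one helper would also avoid repeating the RCS/VCS
branch in both loops, though that is a design suggestion and not part of this
patch.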

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index ed5acee02e20..7990ab41f6fa 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -381,7 +381,7 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 		if ((int)engine < 0)
 			return -1;
 
-		if (engine != VCS1 && engine != VCS2)
+		if (engine != VCS1 && engine != VCS2 && engine != RCS)
 			return -1; /* TODO */
 
 		step->engine_map_count++;
@@ -1097,7 +1097,7 @@ static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
 }
 
 static uint64_t
-set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
+set_ctx_sseu(struct ctx *ctx, uint64_t slice_mask)
 {
 	struct drm_i915_gem_context_param_sseu sseu = device_sseu;
 	struct drm_i915_gem_context_param param = { };
@@ -1105,10 +1105,17 @@ set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
 	if (slice_mask == -1)
 		slice_mask = device_sseu.slice_mask;
 
+	if (ctx->engine_map && ctx->wants_balance) {
+		sseu.flags = I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX;
+		sseu.engine.engine_class = I915_ENGINE_CLASS_INVALID;
+		sseu.engine.engine_instance = 0;
+	}
+
 	sseu.slice_mask = slice_mask;
 
-	param.ctx_id = ctx;
+	param.ctx_id = ctx->id;
 	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.size = sizeof(sseu);
 	param.value = (uintptr_t)&sseu;
 
 	gem_context_set_param(fd, &param);
@@ -1377,10 +1384,17 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 					ctx->engine_map_count;
 
 				for (j = 0; j < ctx->engine_map_count; j++) {
-					load_balance.engines[j].engine_class =
-						I915_ENGINE_CLASS_VIDEO; /* FIXME */
-					load_balance.engines[j].engine_instance =
-						ctx->engine_map[j] - VCS1; /* FIXME */
+					if (ctx->engine_map[j] == RCS) {
+						load_balance.engines[j].engine_class =
+							I915_ENGINE_CLASS_RENDER;
+						load_balance.engines[j].engine_instance =
+							0; /* FIXME */
+					} else {
+						load_balance.engines[j].engine_class =
+							I915_ENGINE_CLASS_VIDEO; /* FIXME */
+						load_balance.engines[j].engine_instance =
+							ctx->engine_map[j] - VCS1; /* FIXME */
+					}
 				}
 			} else {
 				set_engines.extensions = 0;
@@ -1393,10 +1407,16 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				I915_ENGINE_CLASS_INVALID_NONE;
 
 			for (j = 1; j <= ctx->engine_map_count; j++) {
-				set_engines.engines[j].engine_class =
-					I915_ENGINE_CLASS_VIDEO; /* FIXME */
-				set_engines.engines[j].engine_instance =
-					ctx->engine_map[j - 1] - VCS1; /* FIXME */
+				if (ctx->engine_map[j - 1] == RCS) {
+					set_engines.engines[j].engine_class =
+						I915_ENGINE_CLASS_RENDER;
+					set_engines.engines[j].engine_instance = 0; /* FIXME */
+				} else {
+					set_engines.engines[j].engine_class =
+						I915_ENGINE_CLASS_VIDEO; /* FIXME */
+					set_engines.engines[j].engine_instance =
+						ctx->engine_map[j - 1] - VCS1; /* FIXME */
+				}
 			}
 
 			for (j = 0; j < ctx->bond_count; j++) {
@@ -1467,7 +1487,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		if (wrk->sseu) {
 			/* Set to slice 0 only, one slice. */
-			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+			ctx->sseu = set_ctx_sseu(ctx, 1);
 		}
 	}
 
@@ -2438,9 +2458,9 @@ static void *run_workload(void *data)
 				   w->type == BOND) {
 				continue;
 			} else if (w->type == SSEU) {
-				if (w->sseu != wrk->ctx_list[w->context].sseu) {
-					wrk->ctx_list[w->context].sseu =
-						set_ctx_sseu(wrk->ctx_list[w->context].id,
+				if (w->sseu != wrk->ctx_list[w->context * 2].sseu) {
+					wrk->ctx_list[w->context * 2].sseu =
+						set_ctx_sseu(&wrk->ctx_list[w->context * 2],
 							     w->sseu);
 				}
 				continue;
@@ -2766,6 +2786,7 @@ static void get_device_sseu(void)
 
 	param.param = I915_CONTEXT_PARAM_SSEU;
 	param.value = (uintptr_t)&device_sseu;
+	param.size = sizeof(device_sseu);
 
 	gem_context_get_param(fd, &param);
 }
-- 
2.19.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 126+ messages in thread

* [PATCH i-g-t 21/21] tests/i915_query: Engine discovery tests
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:10   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 12:10 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Test the new engine discovery query.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/i915_query.c | 247 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 247 insertions(+)
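
Note on the length negotiation these tests exercise: a query item with length
zero is answered with the required buffer size, a non-zero length that is too
small yields -EINVAL, and a missing buffer yields -EFAULT. The sketch below
models that protocol with a mock so it can run without a GPU. The mock's
behaviour is inferred from the assertions in this patch, not from the kernel
implementation, and mock_item, mock_query, query_two_step and
MOCK_REQUIRED_LEN are all hypothetical names.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MOCK_REQUIRED_LEN 64

/* Hypothetical, simplified stand-in for a query item. */
struct mock_item {
	int32_t length;
	void *data;
};

/* Mock of the query: the result (or negative errno) is written to length. */
static void mock_query(struct mock_item *item)
{
	if (item->length == 0) {
		/* Length probe: report the required size only. */
		item->length = MOCK_REQUIRED_LEN;
		return;
	}

	if (item->length < MOCK_REQUIRED_LEN) {
		item->length = -EINVAL;
		return;
	}

	if (!item->data) {
		item->length = -EFAULT;
		return;
	}

	/* Oversized buffers are accepted; the used length is reported back. */
	memset(item->data, 0xab, MOCK_REQUIRED_LEN);
	item->length = MOCK_REQUIRED_LEN;
}

/*
 * Two-step query: probe the length, allocate, then query again.
 * Returns a malloc'ed buffer (caller frees) or NULL on failure.
 */
static void *query_two_step(int32_t *len_out)
{
	struct mock_item item = { 0 };
	void *buf;

	mock_query(&item);
	if (item.length <= 0)
		return NULL;
	*len_out = item.length;

	buf = malloc((size_t)item.length);
	if (!buf)
		return NULL;

	item.data = buf;
	mock_query(&item);
	if (item.length != *len_out) {
		free(buf);
		return NULL;
	}

	return buf;
}
```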

diff --git a/tests/i915/i915_query.c b/tests/i915/i915_query.c
index 7d0c0e3a061c..ecbec3ae141d 100644
--- a/tests/i915/i915_query.c
+++ b/tests/i915/i915_query.c
@@ -483,6 +483,241 @@ test_query_topology_known_pci_ids(int fd, int devid)
 	free(topo_info);
 }
 
+static bool query_engine_info_supported(int fd)
+{
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_INFO,
+	};
+
+	return __i915_query_items(fd, &item, 1) == 0 && item.length > 0;
+}
+
+static void engines_invalid(int fd)
+{
+	struct drm_i915_query_engine_info *engines;
+	struct drm_i915_query_item item;
+	unsigned int len;
+
+	/* Flags is MBZ. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.flags = 1;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	/* Length is non-zero but smaller than the required size. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = 1;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	/* Query correct length. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	i915_query_items(fd, &item, 1);
+	igt_assert(item.length >= 0);
+	len = item.length;
+
+	engines = malloc(len);
+	igt_assert(engines);
+
+	/* Invalid pointer. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EFAULT);
+
+	/* All fields in engines query are MBZ and only filled by the kernel. */
+
+	memset(engines, 0, len);
+	engines->num_engines = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	memset(engines, 0, len);
+	engines->rsvd[0] = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	memset(engines, 0, len);
+	engines->rsvd[1] = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	memset(engines, 0, len);
+	engines->rsvd[2] = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	free(engines);
+
+	igt_assert(len <= 4096);
+	engines = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON,
+		       -1, 0);
+	igt_assert(engines != MAP_FAILED);
+
+	/* PROT_NONE is similar to unmapped area. */
+	memset(engines, 0, len);
+	igt_assert_eq(mprotect(engines, len, PROT_NONE), 0);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EFAULT);
+	igt_assert_eq(mprotect(engines, len, PROT_WRITE), 0);
+
+	/* Read-only so kernel cannot fill the data back. */
+	memset(engines, 0, len);
+	igt_assert_eq(mprotect(engines, len, PROT_READ), 0);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EFAULT);
+
+	munmap(engines, 4096);
+}
+
+static bool
+has_engine(struct drm_i915_query_engine_info *engines,
+	   unsigned class, unsigned instance)
+{
+	unsigned int i;
+
+	for (i = 0; i < engines->num_engines; i++) {
+		struct drm_i915_engine_info *engine =
+			(struct drm_i915_engine_info *)&engines->engines[i];
+
+		if (engine->engine.engine_class == class &&
+		    engine->engine.engine_instance == instance)
+			return true;
+	}
+
+	return false;
+}
+
+static void engines(int fd)
+{
+	struct drm_i915_query_engine_info *engines;
+	struct drm_i915_query_item item;
+	unsigned int len, i;
+
+	engines = malloc(4096);
+	igt_assert(engines);
+
+	/* Query required buffer length. */
+	memset(engines, 0, 4096);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert(item.length >= 0);
+	igt_assert(item.length <= 4096);
+	len = item.length;
+
+	/* Check length larger than required works and reports same length. */
+	memset(engines, 0, 4096);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = 4096;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, len);
+
+	/* Actual query. */
+	memset(engines, 0, 4096);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, len);
+
+	/* Every GPU has at least one engine. */
+	igt_assert(engines->num_engines > 0);
+
+	/* MBZ fields. */
+	igt_assert_eq(engines->rsvd[0], 0);
+	igt_assert_eq(engines->rsvd[1], 0);
+	igt_assert_eq(engines->rsvd[2], 0);
+
+	/* Check results match the legacy GET_PARAM (where we can). */
+	for (i = 0; i < engines->num_engines; i++) {
+		struct drm_i915_engine_info *engine =
+			(struct drm_i915_engine_info *)&engines->engines[i];
+
+		igt_debug("%u: class=%u instance=%u flags=%llx capabilities=%llx\n",
+			  i,
+			  engine->engine.engine_class,
+			  engine->engine.engine_instance,
+			  engine->flags,
+			  engine->capabilities);
+
+		/* MBZ fields. */
+		igt_assert_eq(engine->rsvd0, 0);
+		igt_assert_eq(engine->rsvd1[0], 0);
+		igt_assert_eq(engine->rsvd1[1], 0);
+
+		switch (engine->engine.engine_class) {
+		case I915_ENGINE_CLASS_RENDER:
+			/* Will be tested later. */
+			break;
+		case I915_ENGINE_CLASS_COPY:
+			igt_assert(gem_has_blt(fd));
+			break;
+		case I915_ENGINE_CLASS_VIDEO:
+			switch (engine->engine.engine_instance) {
+			case 0:
+				igt_assert(gem_has_bsd(fd));
+				break;
+			case 1:
+				igt_assert(gem_has_bsd2(fd));
+				break;
+			}
+			break;
+		case I915_ENGINE_CLASS_VIDEO_ENHANCE:
+			igt_assert(gem_has_vebox(fd));
+			break;
+		default:
+			igt_assert(0);
+		}
+	}
+
+	/* Reverse check to the above - all GET_PARAM engines are present. */
+	igt_assert(has_engine(engines, I915_ENGINE_CLASS_RENDER, 0));
+	if (gem_has_blt(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_COPY, 0));
+	if (gem_has_bsd(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_VIDEO, 0));
+	if (gem_has_bsd2(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_VIDEO, 1));
+	if (gem_has_vebox(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_VIDEO_ENHANCE,
+				       0));
+
+	free(engines);
+}
+
 igt_main
 {
 	int fd = -1;
@@ -530,6 +765,18 @@ igt_main
 		test_query_topology_known_pci_ids(fd, devid);
 	}
 
+	igt_subtest_group {
+		igt_fixture {
+			igt_require(query_engine_info_supported(fd));
+		}
+
+		igt_subtest("engine-info-invalid")
+			engines_invalid(fd);
+
+		igt_subtest("engine-info")
+			engines(fd);
+	}
+
 	igt_fixture {
 		close(fd);
 	}
-- 
2.19.1


* Re: [igt-dev] [PATCH i-g-t 01/21] scripts/trace.pl: Fix after intel_engine_notify removal
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:17     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-08 12:17 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:38)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> After the removal of engine global seqnos and the corresponding
> intel_engine_notify tracepoints the script needs to be adjusted to cope
> with the new state of things.
> 
> To keep working it switches over using the dma_fence:dma_fence_signaled:
> tracepoint and keeps one extra internal map to connect the ctx-seqno pairs
> with engines.

Is the map suitable for the planned (by me)

	s/i915_request_wait_begin/dma_fence_wait_begin/

I guess it should be.
-Chris

* Re: [igt-dev] [PATCH i-g-t 16/21] gem_wsim: Some more example workloads
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-08 12:27     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-08 12:27 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:53)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A few additional workloads useful for experimenting with scheduling.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Are the extra modes & .wsim supported by scripts/media-bench.pl?
i.e. can I just run media-bench.pl and have it exercise all the new
features?
-Chris

* [igt-dev] ✓ Fi.CI.BAT: success for Media scalability tooling (rev2)
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
                   ` (21 preceding siblings ...)
  (?)
@ 2019-05-08 12:53 ` Patchwork
  -1 siblings, 0 replies; 126+ messages in thread
From: Patchwork @ 2019-05-08 12:53 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Media scalability tooling (rev2)
URL   : https://patchwork.freedesktop.org/series/51193/
State : success

== Summary ==

CI Bug Log - changes from IGT_4973 -> IGTPW_2952
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/51193/revisions/2/mbox/

Known issues
------------

  Here are the changes found in IGTPW_2952 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live_hangcheck:
    - fi-skl-iommu:       [PASS][1] -> [INCOMPLETE][2] ([fdo#108602] / [fdo#108744])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/fi-skl-iommu/igt@i915_selftest@live_hangcheck.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/fi-skl-iommu/igt@i915_selftest@live_hangcheck.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_hangcheck:
    - {fi-icl-y}:         [INCOMPLETE][3] ([fdo#107713] / [fdo#108569]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/fi-icl-y/igt@i915_selftest@live_hangcheck.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/fi-icl-y/igt@i915_selftest@live_hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#108602]: https://bugs.freedesktop.org/show_bug.cgi?id=108602
  [fdo#108744]: https://bugs.freedesktop.org/show_bug.cgi?id=108744


Participating hosts (51 -> 44)
------------------------------

  Additional (2): fi-icl-u2 fi-apl-guc 
  Missing    (9): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-bsw-n3050 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * IGT: IGT_4973 -> IGTPW_2952

  CI_DRM_6063: 44ae4003d35743cbc7883825c5fe777d136b5247 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_2952: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/
  IGT_4973: 3e3ff0e48989abd25fce4916e85e8fef20a3c63a @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+igt@i915_query@engine-info
+igt@i915_query@engine-info-invalid

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 16/21] gem_wsim: Some more example workloads
  2019-05-08 12:27     ` Chris Wilson
@ 2019-05-08 13:50       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 13:50 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 08/05/2019 13:27, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:53)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A few additional workloads useful for experimenting with scheduling.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Are the extra modes & .wsim supported by scripts/media-bench.pl?
> i.e. can I just run media-bench.pl and have it exercise all the new
> features?

Not sure what you mean by extra modes? If all new wsim commands then no. 
They are not in the default media-bench.pl set. The workloads from this 
patch are not in that set so are just for reference.

Virtual engine (gem_wsim -b i915) is supported by media-bench.pl even 
with the old/default set of workloads.

The catch is that old wsim workloads use VCS to mean any VCS, and in those 
cases -b i915 will set up the virtual engine 
automatically/transparently. So those old workloads can be run with either 
userspace or i915 balancing.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 16/21] gem_wsim: Some more example workloads
  2019-05-08 13:50       ` Tvrtko Ursulin
@ 2019-05-08 13:56         ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-08 13:56 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 14:50:41)
> 
> On 08/05/2019 13:27, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-08 13:10:53)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> A few additional workloads useful for experimenting with scheduling.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > Are the extra modes & .wsim supported by scripts/media-bench.pl?
> > i.e. can I just run media-bench.pl and have it exercise all the new
> > features?
> 
> Not sure what you mean by extra modes? If all new wsim commands then no. 
> They are not in the default media-bench.pl set. The workloads from this 
> patch are not in that set so are just for reference.

That's what I meant: are the new example .wsim files, with explicit engine maps
(and so, I presume, inter-mixing of load-balanced workloads with other work),
included in the default set run by ./scripts/media-bench.pl?

What's the minimum amount of effort I need to exercise all the new
features of gem_wsim? :)

> Virtual engine (gem_wsim -b i915) is supported by media-bench.pl even 
> with the old/default set of workloads.
> 
> The catch is old wsim workloads use VCS to mean any VCS and in those 
> cases -b i915 will set up the virtual engine 
> automatically/transparently. So those old workloads can be run with either 
> userspace or i915 balancing.

And seems to still be working.
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 16/21] gem_wsim: Some more example workloads
  2019-05-08 13:56         ` Chris Wilson
@ 2019-05-08 14:16           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-08 14:16 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 08/05/2019 14:56, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 14:50:41)
>>
>> On 08/05/2019 13:27, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-05-08 13:10:53)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> A few additional workloads useful for experimenting with scheduling.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Are the extra modes & .wsim supported by scripts/media-bench.pl?
>>> i.e. can I just run media-bench.pl and have it exercise all the new
>>> features?
>>
>> Not sure what you mean by extra modes? If all new wsim commands then no.
>> They are not in the default media-bench.pl set. The workloads from this
>> patch are not in that set so are just for reference.
> 
> That's what I meant, are the new example.wsim with explicit engine maps
> and so I presume inter-mixing of load-balanced workloads with other work
> included in the default set run by ./scripts/media-bench.pl

They are not in the default set, but manual workloads can be given to 
media-bench.pl using the -w switch. The string passed there is passed on to 
gem_wsim directly, so one or more workloads can be manually specified.

> What's the minimum amount of effort I need to exercise all the new
> features of gem_wsim? :)

frame-split-60fps.wsim uses almost all new features: preemption control, 
engine map, load balance, bond, submit fence and the "endless" batch.

The only thing missing is SSEU control, for which I did not add an example 
workload (there is a snippet in the README though), since access to the uapi 
is blocked outside the gen11 special case. To use that, the i915 IS_GEN11 
check in set_sseu needs to be lifted as well.

>> Virtual engine (gem_wsim -b i915) is supported by media-bench.pl even
>> with the old/default set of workloads.
>>
>> The catch is old wsim workloads use VCS to mean any VCS and in those
>> cases -b i915 will set up the virtual engine
>> automatically/transparently. So those old workloads can be run with either
>> userspace or i915 balancing.
> 
> And seems to still be working.

I'd hope so, I mostly do test things! :)

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* [igt-dev] ✓ Fi.CI.IGT: success for Media scalability tooling (rev2)
  2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
                   ` (22 preceding siblings ...)
  (?)
@ 2019-05-08 16:01 ` Patchwork
  -1 siblings, 0 replies; 126+ messages in thread
From: Patchwork @ 2019-05-08 16:01 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Media scalability tooling (rev2)
URL   : https://patchwork.freedesktop.org/series/51193/
State : success

== Summary ==

CI Bug Log - changes from IGT_4973_full -> IGTPW_2952_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/51193/revisions/2/mbox/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_2952_full:

### IGT changes ###

#### Possible regressions ####

  * {igt@i915_query@engine-info} (NEW):
    - shard-iclb:         NOTRUN -> [SKIP][1] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb7/igt@i915_query@engine-info.html

  
New tests
---------

  New tests have been introduced between IGT_4973_full and IGTPW_2952_full:

### New IGT tests (2) ###

  * igt@i915_query@engine-info:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@i915_query@engine-info-invalid:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in IGTPW_2952_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_eio@unwedge-stress:
    - shard-snb:          [PASS][2] -> [FAIL][3] ([fdo#109661])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-snb6/igt@gem_eio@unwedge-stress.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-snb1/igt@gem_eio@unwedge-stress.html

  * igt@i915_suspend@fence-restore-untiled:
    - shard-apl:          [PASS][4] -> [DMESG-WARN][5] ([fdo#108566]) +11 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-apl2/igt@i915_suspend@fence-restore-untiled.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-apl2/igt@i915_suspend@fence-restore-untiled.html

  * igt@kms_cursor_crc@cursor-128x128-dpms:
    - shard-apl:          [PASS][6] -> [FAIL][7] ([fdo#103232])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-apl2/igt@kms_cursor_crc@cursor-128x128-dpms.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-apl8/igt@kms_cursor_crc@cursor-128x128-dpms.html

  * igt@kms_dp_dsc@basic-dsc-enable-edp:
    - shard-iclb:         [PASS][8] -> [SKIP][9] ([fdo#109349])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb2/igt@kms_dp_dsc@basic-dsc-enable-edp.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb7/igt@kms_dp_dsc@basic-dsc-enable-edp.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt:
    - shard-iclb:         [PASS][10] -> [FAIL][11] ([fdo#103167]) +4 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb4/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-blt.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-iclb:         [PASS][12] -> [FAIL][13] ([fdo#103166])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb5/igt@kms_plane_lowres@pipe-a-tiling-y.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb1/igt@kms_plane_lowres@pipe-a-tiling-y.html

  * igt@kms_plane_scaling@pipe-a-scaler-with-clipping-clamping:
    - shard-glk:          [PASS][14] -> [SKIP][15] ([fdo#109271] / [fdo#109278])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-glk9/igt@kms_plane_scaling@pipe-a-scaler-with-clipping-clamping.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-glk3/igt@kms_plane_scaling@pipe-a-scaler-with-clipping-clamping.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-iclb:         [PASS][16] -> [SKIP][17] ([fdo#109441]) +1 similar issue
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb5/igt@kms_psr@psr2_primary_mmap_cpu.html

  * igt@kms_sysfs_edid_timing:
    - shard-iclb:         [PASS][18] -> [FAIL][19] ([fdo#100047])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb4/igt@kms_sysfs_edid_timing.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb3/igt@kms_sysfs_edid_timing.html

  * igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend:
    - shard-kbl:          [PASS][20] -> [INCOMPLETE][21] ([fdo#103665])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-kbl2/igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-kbl3/igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend.html

  
#### Possible fixes ####

  * igt@gem_tiled_swapping@non-threaded:
    - shard-glk:          [DMESG-WARN][22] ([fdo#108686]) -> [PASS][23]
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-glk7/igt@gem_tiled_swapping@non-threaded.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-glk6/igt@gem_tiled_swapping@non-threaded.html

  * igt@i915_suspend@debugfs-reader:
    - shard-apl:          [DMESG-WARN][24] ([fdo#108566]) -> [PASS][25] +4 similar issues
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-apl4/igt@i915_suspend@debugfs-reader.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-apl4/igt@i915_suspend@debugfs-reader.html

  * igt@kms_cursor_crc@cursor-64x21-sliding:
    - shard-apl:          [FAIL][26] ([fdo#103232]) -> [PASS][27]
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-apl3/igt@kms_cursor_crc@cursor-64x21-sliding.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-apl5/igt@kms_cursor_crc@cursor-64x21-sliding.html
    - shard-kbl:          [FAIL][28] ([fdo#103232]) -> [PASS][29]
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-kbl3/igt@kms_cursor_crc@cursor-64x21-sliding.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-kbl2/igt@kms_cursor_crc@cursor-64x21-sliding.html

  * igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
    - shard-glk:          [FAIL][30] ([fdo#105363]) -> [PASS][31]
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-glk4/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-glk2/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@2x-flip-vs-suspend:
    - shard-hsw:          [INCOMPLETE][32] ([fdo#103540]) -> [PASS][33] +1 similar issue
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-hsw7/igt@kms_flip@2x-flip-vs-suspend.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-hsw6/igt@kms_flip@2x-flip-vs-suspend.html

  * igt@kms_flip@plain-flip-fb-recreate:
    - shard-kbl:          [FAIL][34] ([fdo#100368]) -> [PASS][35]
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-kbl4/igt@kms_flip@plain-flip-fb-recreate.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-kbl1/igt@kms_flip@plain-flip-fb-recreate.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt:
    - shard-iclb:         [FAIL][36] ([fdo#103167]) -> [PASS][37] +7 similar issues
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb2/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb7/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt.html

  * igt@kms_plane@pixel-format-pipe-c-planes-source-clamping:
    - shard-glk:          [SKIP][38] ([fdo#109271]) -> [PASS][39]
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-glk1/igt@kms_plane@pixel-format-pipe-c-planes-source-clamping.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-glk9/igt@kms_plane@pixel-format-pipe-c-planes-source-clamping.html

  * igt@kms_plane_lowres@pipe-a-tiling-x:
    - shard-iclb:         [FAIL][40] ([fdo#103166]) -> [PASS][41]
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb5/igt@kms_plane_lowres@pipe-a-tiling-x.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb6/igt@kms_plane_lowres@pipe-a-tiling-x.html

  * igt@kms_plane_scaling@pipe-b-scaler-with-rotation:
    - shard-glk:          [SKIP][42] ([fdo#109271] / [fdo#109278]) -> [PASS][43]
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-glk2/igt@kms_plane_scaling@pipe-b-scaler-with-rotation.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-glk9/igt@kms_plane_scaling@pipe-b-scaler-with-rotation.html

  * igt@kms_psr@psr2_cursor_render:
    - shard-iclb:         [SKIP][44] ([fdo#109441]) -> [PASS][45] +2 similar issues
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-iclb7/igt@kms_psr@psr2_cursor_render.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-iclb2/igt@kms_psr@psr2_cursor_render.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [FAIL][46] ([fdo#99912]) -> [PASS][47]
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-kbl3/igt@kms_setmode@basic.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-kbl7/igt@kms_setmode@basic.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-kbl:          [INCOMPLETE][48] ([fdo#103665]) -> [PASS][49]
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4973/shard-kbl1/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/shard-kbl3/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#100047]: https://bugs.freedesktop.org/show_bug.cgi?id=100047
  [fdo#100368]: https://bugs.freedesktop.org/show_bug.cgi?id=100368
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103540]: https://bugs.freedesktop.org/show_bug.cgi?id=103540
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#108686]: https://bugs.freedesktop.org/show_bug.cgi?id=108686
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109349]: https://bugs.freedesktop.org/show_bug.cgi?id=109349
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109661]: https://bugs.freedesktop.org/show_bug.cgi?id=109661
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912


Participating hosts (7 -> 6)
------------------------------

  Missing    (1): shard-skl 


Build changes
-------------

  * IGT: IGT_4973 -> IGTPW_2952

  CI_DRM_6063: 44ae4003d35743cbc7883825c5fe777d136b5247 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_2952: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/
  IGT_4973: 3e3ff0e48989abd25fce4916e85e8fef20a3c63a @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2952/

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 01/21] scripts/trace.pl: Fix after intel_engine_notify removal
  2019-05-08 12:17     ` Chris Wilson
@ 2019-05-09  9:27       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-09  9:27 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 08/05/2019 13:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:38)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> After the removal of engine global seqnos and the corresponding
>> intel_engine_notify tracepoints the script needs to be adjusted to cope
>> with the new state of things.
>>
>> To keep working it switches over using the dma_fence:dma_fence_signaled:
>> tracepoint and keeps one extra internal map to connect the ctx-seqno pairs
>> with engines.
> 
> Is the map suitable for the planned (by me)
> 
> 	s/i915_request_wait_begin/dma_fence_wait_begin/
> 
> I guess it should be.

I think it would be workable. One complication would be that the engine is 
not guaranteed to be known ahead of the wait, like it is ahead of the 
signal. But since ctx.seqno is unique, it can be tracked and the engine added 
later, I think.
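
To illustrate the deferred lookup, here is a minimal sketch in Python rather
than the script's Perl. It is not code from trace.pl: the names pending_waits,
ctx_engines, resolved, on_wait_begin and on_request_in are hypothetical, and it
assumes, as the patch does, that a context maps to a single engine.

```python
# Hypothetical sketch, not trace.pl code: attach the engine to a wait event
# once it becomes known, keyed on the unique (ctx, seqno) pair.

pending_waits = {}  # (ctx, seqno) -> wait start time, engine not yet known
ctx_engines = {}    # ctx -> engine, learned from request_in events
resolved = []       # (engine, ctx, seqno, wait_start) tuples


def on_wait_begin(ctx, seqno, time):
    """Record a wait; defer it if the context's engine is still unknown."""
    engine = ctx_engines.get(ctx)
    if engine is None:
        pending_waits[(ctx, seqno)] = time
    else:
        resolved.append((engine, ctx, seqno, time))


def on_request_in(engine, ctx, seqno):
    """Learn the engine for a context and resolve any deferred wait."""
    ctx_engines[ctx] = engine
    time = pending_waits.pop((ctx, seqno), None)
    if time is not None:
        resolved.append((engine, ctx, seqno, time))
```

A wait that arrives before the first request_in for its context is parked and
only gains an engine when that request is seen executing; later waits on the
same context resolve immediately.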

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 01/21] scripts/trace.pl: Fix after intel_engine_notify removal
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 12:33     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 12:33 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:38)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> After the removal of engine global seqnos and the corresponding
> intel_engine_notify tracepoints the script needs to be adjusted to cope
> with the new state of things.
> 
> To keep working it switches over using the dma_fence:dma_fence_signaled:
> tracepoint and keeps one extra internal map to connect the ctx-seqno pairs
> with engines.
> 
> It also needs to key the completion events on the full engine/ctx/seqno
> tokens, and adjust correspondingly the timeline sorting logic.
> 
> v2:
>  * Do not use late notifications (received after context complete) when
>    splitting up coalesced requests. They are now much more likely and can
>    not be used.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  scripts/trace.pl | 82 ++++++++++++++++++++++++------------------------
>  1 file changed, 41 insertions(+), 41 deletions(-)
> 
> diff --git a/scripts/trace.pl b/scripts/trace.pl
> index 18f9f3b18396..95dc3a645e8e 100755
> --- a/scripts/trace.pl
> +++ b/scripts/trace.pl
> @@ -27,7 +27,8 @@ use warnings;
>  use 5.010;
>  
>  my $gid = 0;
> -my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
> +my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
> +    %ctxtimelines, %ctxengines);
>  my @freqs;

So what's ctxengines? Or rings for that matter?

I take it ctxengines is really the last engine which we saw this context
execute on?

>  
>  my $max_items = 3000;
> @@ -66,7 +67,7 @@ Notes:
>                                i915:i915_request_submit, \
>                                i915:i915_request_in, \
>                                i915:i915_request_out, \
> -                              i915:intel_engine_notify, \
> +                              dma_fence:dma_fence_signaled, \
>                                i915:i915_request_wait_begin, \
>                                i915:i915_request_wait_end \
>                                [command-to-be-profiled]
> @@ -161,7 +162,7 @@ sub arg_trace
>                        'i915:i915_request_submit',
>                        'i915:i915_request_in',
>                        'i915:i915_request_out',
> -                      'i915:intel_engine_notify',
> +                      'dma_fence:dma_fence_signaled',
>                        'i915:i915_request_wait_begin',
>                        'i915:i915_request_wait_end' );
>  
> @@ -312,13 +313,6 @@ sub db_key
>         return $ring . '/' . $ctx . '/' . $seqno;
>  }
>  
> -sub global_key
> -{
> -       my ($ring, $seqno) = @_;
> -
> -       return $ring . '/' . $seqno;
> -}
> -
>  sub sanitize_ctx
>  {
>         my ($ctx, $ring) = @_;
> @@ -419,6 +413,8 @@ while (<>) {
>                 $req{'ring'} = $ring;
>                 $req{'seqno'} = $seqno;
>                 $req{'ctx'} = $ctx;
> +               die if exists $ctxengines{$ctx} and $ctxengines{$ctx} ne $ring;
> +               $ctxengines{$ctx} = $ring;
>                 $ctxtimelines{$ctx . '/' . $ring} = 1;
>                 $req{'name'} = $ctx . '/' . $seqno;
>                 $req{'global'} = $tp{'global'};
> @@ -429,16 +425,29 @@ while (<>) {
>                 $ringmap{$rings{$ring}} = $ring;
>                 $db{$key} = \%req;
>         } elsif ($tp_name eq 'i915:i915_request_out:') {
> -               my $gkey = global_key($ring, $tp{'global'});
> +               my $gkey;
> +

# Must be paired with a previous i915_request_in
> +               die unless exists $ctxengines{$ctx};

I'd suggest 'next unless', because there's always a chance the capture is
started part way through someone's workload.
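
A minimal sketch of the difference, written in C rather than the script's
Perl and using made-up names (struct tracker, note_request_in,
handle_request_out): a request_out for a context with no recorded
request_in is counted and skipped instead of aborting the whole run.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CTX 64	/* arbitrary bound for the sketch */

/* Hypothetical stand-in for the %ctxengines map in trace.pl. */
struct tracker {
	unsigned int ctx[MAX_CTX];
	size_t nr_ctx;
	unsigned long skipped;	/* request_out events without a request_in */
};

static bool ctx_seen(const struct tracker *t, unsigned int ctx)
{
	for (size_t i = 0; i < t->nr_ctx; i++)
		if (t->ctx[i] == ctx)
			return true;
	return false;
}

/* Record that a request_in was observed for this context. */
static void note_request_in(struct tracker *t, unsigned int ctx)
{
	if (!ctx_seen(t, ctx) && t->nr_ctx < MAX_CTX)
		t->ctx[t->nr_ctx++] = ctx;
}

/*
 * Returns true if the request_out was processed, false if it was skipped
 * because the capture started after the matching request_in.
 */
static bool handle_request_out(struct tracker *t, unsigned int ctx)
{
	if (!ctx_seen(t, ctx)) {
		t->skipped++;	/* the 'next unless exists ...' case */
		return false;
	}
	return true;
}
```

Keeping a count of skipped events would let the script report how much of
the trace was dropped rather than dropping it silently.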

> +               $gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
> +
> +               if ($tp{'completed?'}) {
> +                       die unless exists $db{$key};
> +                       die unless exists $db{$key}->{'start'};
> +                       die if exists $db{$key}->{'end'};
> +
> +                       $db{$key}->{'end'} = $time;
> +                       $db{$key}->{'notify'} = $notify{$gkey}
> +                                               if exists $notify{$gkey};

Hmm. With preempt-to-busy, a request can complete when we are no longer
tracking it (it completes before we preempt it).

They will still get the schedule-out tracepoint, but marked as
incomplete, and there will be a signaled tp later before we try and
resubmit.

> +               } else {
> +                       delete $db{$key};
> +               }
> +       } elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
> +               my $gkey;
>  
> -               die unless exists $db{$key};
> -               die unless exists $db{$key}->{'start'};
> -               die if exists $db{$key}->{'end'};
> +               die unless exists $ctxengines{$tp{'context'}};
>  
> -               $db{$key}->{'end'} = $time;
> -               $db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey};
> -       } elsif ($tp_name eq 'i915:intel_engine_notify:') {
> -               my $gkey = global_key($ring, $seqno);
> +               $gkey = db_key($ctxengines{$tp{'context'}}, $tp{'context'}, $tp{'seqno'});
>  
>                 $notify{$gkey} = $time unless exists $notify{$gkey};
>         } elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
> @@ -452,7 +461,7 @@ while (<>) {
>  # find the largest seqno to be used for timeline sorting purposes.
>  my $max_seqno = 0;
>  foreach my $key (keys %db) {
> -       my $gkey = global_key($db{$key}->{'ring'}, $db{$key}->{'global'});
> +       my $gkey = db_key($db{$key}->{'ring'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
>  
>         die unless exists $db{$key}->{'start'};
>  
> @@ -478,14 +487,13 @@ my $key_count = scalar(keys %db);
>  
>  my %engine_timelines;
>  
> -sub sortEngine {
> -       my $as = $db{$a}->{'global'};
> -       my $bs = $db{$b}->{'global'};
> +sub sortStart {
> +       my $as = $db{$a}->{'start'};
> +       my $bs = $db{$b}->{'start'};
>         my $val;
>  
>         $val = $as <=> $bs;
> -
> -       die if $val == 0;
> +       $val = $a cmp $b if $val == 0;
>  
>         return $val;
>  }
> @@ -497,9 +505,7 @@ sub get_engine_timeline {
>         return $engine_timelines{$ring} if exists $engine_timelines{$ring};
>  
>         @timeline = grep { $db{$_}->{'ring'} eq $ring } keys %db;
> -       # FIXME seqno restart
> -       @timeline = sort sortEngine @timeline;
> -
> +       @timeline = sort sortStart @timeline;
>         $engine_timelines{$ring} = \@timeline;
>  
>         return \@timeline;
> @@ -561,20 +567,10 @@ foreach my $gid (sort keys %rings) {
>                         $db{$key}->{'no-notify'} = 1;
>                 }
>                 $db{$key}->{'end'} = $end;
> +               $db{$key}->{'notify'} = $end if $db{$key}->{'notify'} > $end;
>         }
>  }
>  
> -sub sortStart {
> -       my $as = $db{$a}->{'start'};
> -       my $bs = $db{$b}->{'start'};
> -       my $val;
> -
> -       $val = $as <=> $bs;
> -       $val = $a cmp $b if $val == 0;
> -
> -       return $val;
> -}
> -
>  my $re_sort = 1;
>  my @sorted_keys;
>  
> @@ -670,9 +666,13 @@ if ($correct_durations) {
>                         next unless exists $db{$key}->{'no-end'};
>                         last if $pos == $#{$timeline};
>  
> -                       # Shift following request to start after the current one
> +                       # Shift following request to start after the current
> +                       # one, but only if that wouldn't make it zero duration,
> +                       # which would indicate notify arrived after context
> +                       # complete.
>                         $next_key = ${$timeline}[$pos + 1];
> -                       if (exists $db{$key}->{'notify'}) {
> +                       if (exists $db{$key}->{'notify'} and
> +                           $db{$key}->{'notify'} < $db{$key}->{'end'}) {
>                                 $db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
>                                 $db{$next_key}->{'start'} = $db{$key}->{'notify'};
>                                 $re_sort = 1;
> @@ -750,9 +750,9 @@ foreach my $gid (sort keys %rings) {
>         # Extract all GPU busy intervals and sort them.
>         foreach my $key (@sorted_keys) {
>                 next unless $db{$key}->{'ring'} eq $ring;
> +               die if $db{$key}->{'start'} > $db{$key}->{'end'};

Heh, we're out of luck if we want to trace across seqno wraparound.

It makes enough sense,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH i-g-t 03/21] trace.pl: Virtual engine support
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 12:52     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 12:52 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:40)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Add virtual/queue timelines to both stdout and HTML output.
> 
> A new timeline is created for each queue/virtual engine to display
> associated requests in queued and runnable states. Once requests are
> submitted to a real engine for executing they show up on the physical
> engine timeline.

How does it cope with preemption events that shift the request onto
another engine?

Queues. So why are virtual engines treated differently? From my pov it's
just a timeline like any other, the only difference is that it may
execute on a different engine. My expectation would have been that
tracking would have been timeline centric.

However, I think I am confusing my perspective of timelines in the
kernel with the visualisation timelines.


> +sub is_veng
> +{
> +       my ($class, $instance) = split ':', shift;
> +
> +       return $instance eq '254';

Ok. I thought I might have caught you out.
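
For reference, the check keys purely off the instance half of the
'class:instance' ring name. A rough C equivalent, assuming ring names
always have that form (the VENG_INSTANCE macro name is invented for this
sketch; 254 is the value from the quoted is_veng()):

```c
#include <stdbool.h>
#include <stdio.h>

#define VENG_INSTANCE 254	/* value the quoted is_veng() compares against */

/* Parse a "class:instance" ring name and report whether it is virtual. */
static bool is_veng(const char *ring)
{
	unsigned int engine_class, instance;

	if (sscanf(ring, "%u:%u", &engine_class, &instance) != 2)
		return false;	/* malformed name: treat as not virtual */

	return instance == VENG_INSTANCE;
}
```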


> +               unless (exists $queue{$key}) {
> +                       # Virtual engine
> +                       my $vkey = db_key(VENG, $ctx, $seqno);
> +                       my %req;
> +
> +                       die unless exists $queues{$ctx};
> +                       die unless exists $queue{$vkey};
> +                       die unless exists $submit{$vkey};
> +
> +                       # Create separate request record on the queue timeline
> +                       $q = $queue{$vkey};
> +                       $s = $submit{$vkey};
> +                       $req{'queue'} = $q;
> +                       $req{'submit'} = $s;
> +                       $req{'start'} = $time;
> +                       $req{'end'} = $time;
> +                       $req{'ring'} = VENG;
> +                       $req{'seqno'} = $seqno;
> +                       $req{'ctx'} = $ctx;
> +                       $req{'name'} = $ctx . '/' . $seqno;
> +                       $req{'global'} = $tp{'global'};
> +                       $req{'port'} = $tp{'port'};

Just quietly thinking: why not adopt this for each timeline, and create an
on-engine event box for all?

> +
> +                       $vdb{$vkey} = \%req;
> +               } else {
> +                       $q = $queue{$key};
> +                       $s = $submit{$key};
> +               }
>  
>                 $req{'start'} = $time;
>                 $req{'ring'} = $ring;


>  sub stdio_stats
>  {
>         my ($stats, $group, $id) = @_;
> +       my $veng = exists $stats->{'virtual'} ? 1 : 0;
>         my $str;
>  
> -       $str = 'Ring' . $group . ': ';
> +       $str = $veng ? 'Virtual' : 'Ring';
> +       $str .= $group . ': ';
>         $str .= $stats->{'count'} . ' batches, ';
> -       $str .= sprintf('%.2f (%.2f) avg batch us, ', $stats->{'avg'}, $stats->{'total-avg'});
> -       $str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
> -       $str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
> +       unless ($veng) {
> +               $str .= sprintf('%.2f (%.2f) avg batch us, ',
> +                               $stats->{'avg'}, $stats->{'total-avg'});
> +               $str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
> +               $str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
> +       }
> +
>         $str .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable, ';
>         $str .= sprintf('%.2f', $stats->{'queued'}) . '% queued, ';
>         $str .= sprintf('%.2f', $stats->{'wait'}) . '% wait';

So I'm looking at the utilisation, trying to figure out why veng
matters? Do we not break down utilisation for the real engines, plus
utilisation on each client timeline?
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH i-g-t 04/21] trace.pl: Virtual engine preemption support
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 12:55     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 12:55 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:41)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Use the 'completed?' tracepoint field to detect more robustly when a
> request has been preempted and remove it from the engine database if so.
> 
> Otherwise the script can hit a scenario where the same global seqno will
> be mentioned multiple times (on an engine seqno) which aborts processing.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  scripts/trace.pl | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/scripts/trace.pl b/scripts/trace.pl
> index 6cc332bb6e2a..cb7cc46df22e 100755
> --- a/scripts/trace.pl
> +++ b/scripts/trace.pl
> @@ -483,17 +483,17 @@ while (<>) {
>                 $ringmap{$rings{$ring}} = $ring;
>                 $db{$key} = \%req;
>         } elsif ($tp_name eq 'i915:i915_request_out:') {
> -               my $gkey;
> -
>                 die unless exists $ctxengines{$ctx};
>  
> -               $gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
> -
>                 if ($tp{'completed?'}) {
> +                       my $gkey;
> +
>                         die unless exists $db{$key};
>                         die unless exists $db{$key}->{'start'};
>                         die if exists $db{$key}->{'end'};
>  
> +                       $gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);

I'm lost, how does this do what the commit message says? I thought
db_key() just gave the hash key and did not alter the db?
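
For what it is worth, db_key() as quoted in patch 01 is a pure string
join of ring, ctx and seqno, so moving the call cannot by itself change
what is stored in %db. A C rendering of the same helper, as an
illustration only (the buffer-based signature is an assumption of this
sketch, not something from the script):

```c
#include <stdio.h>

/*
 * C equivalent of trace.pl's db_key(): joins ring, context and seqno into
 * "ring/ctx/seqno". It only formats a string; no table is read or written.
 */
static int db_key(char *buf, size_t len, const char *ring,
		  unsigned int ctx, unsigned int seqno)
{
	return snprintf(buf, len, "%s/%u/%u", ring, ctx, seqno);
}
```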
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
  2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-10 13:14     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:14 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:42)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Support i915 virtual engine from gem_wsim (-b i915) and media-bench.pl
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> +       /*
> +        * Create and configure contexts.
> +        */
> +       for (i = 0; i < wrk->nr_ctxs; i += 2) {
> +               struct ctx *ctx = &wrk->ctx_list[i];
> +               uint32_t ctx_id, share_vm = 0;
>  
> -                       wrk->ctx_list[w->context].id = arg.ctx_id;
> +               if (ctx->id)
> +                       continue;
>  
> -                       if (flags & GLOBAL_BALANCE) {
> -                               wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
> -                               context_vcs_rr ^= 1;
> -                       } else {
> -                               wrk->ctx_list[w->context].static_vcs = ctx_vcs;
> -                               ctx_vcs ^= 1;
> -                       }
> +               if (flags & I915) {

vm sharing shouldn't be an i915-balancer-only option. For single jobs split
across multiple contexts, I would expect they will want to share vm.
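
One way to read that suggestion, as a sketch only: decide on VM sharing
from whether an earlier context of the workload exists, and keep just the
single-timeline flag tied to i915 balancing. The flag macros and
plan_context() below are hypothetical stand-ins, not the uapi definitions
or anything in gem_wsim.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real code uses the i915 uapi definitions. */
#define CREATE_USE_EXTENSIONS  (1u << 0)
#define CREATE_SINGLE_TIMELINE (1u << 1)

struct create_plan {
	uint32_t flags;		/* context create flags to pass */
	uint32_t share_vm;	/* VM id to attach, 0 for a private VM */
};

/*
 * Decide how to create a context. VM sharing depends only on whether an
 * earlier context of the same workload exists, not on the balancer mode.
 */
static struct create_plan plan_context(bool i915_balancing,
				       bool targets_instance,
				       uint32_t existing_vm)
{
	struct create_plan plan = { 0, 0 };

	if (existing_vm) {
		plan.share_vm = existing_vm;
		plan.flags |= CREATE_USE_EXTENSIONS;
	}

	/* Single timeline stays tied to i915 balancing, as in the patch. */
	if (i915_balancing && !targets_instance)
		plan.flags |= CREATE_SINGLE_TIMELINE;

	return plan;
}
```

With this split, a non-balanced run would still pick up the shared VM for
its second and later contexts.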

> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> +                               .param.param = I915_CONTEXT_PARAM_VM,
> +                       };
> +                       struct drm_i915_gem_context_create_ext args = { };
>  
> -                       if (wrk->prio) {
> +                       /* Find existing context to share ppgtt with. */
> +                       for (j = 0; j < wrk->nr_ctxs; j++) {
>                                 struct drm_i915_gem_context_param param = {
> -                                       .ctx_id = arg.ctx_id,
> -                                       .param = I915_CONTEXT_PARAM_PRIORITY,
> -                                       .value = wrk->prio,
> +                                       .param = I915_CONTEXT_PARAM_VM,
>                                 };
> -                               gem_context_set_param(fd, &param);
> +
> +                               if (!wrk->ctx_list[j].id)
> +                                       continue;
> +
> +                               param.ctx_id = wrk->ctx_list[j].id;
> +
> +                               gem_context_get_param(fd, &param);
> +                               igt_assert(param.value);
> +
> +                               share_vm = param.value;
> +
> +                               ext.param.value = share_vm;
> +                               args.flags =
> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
> +                               args.extensions = to_user_pointer(&ext);
> +                               break;
>                         }
> +
> +                       if (!ctx->targets_instance)
> +                               args.flags |=
> +                                    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
> +
> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
> +                                &args);
> +
> +                       ctx_id = args.ctx_id;
> +               } else {
> +                       struct drm_i915_gem_context_create args = {};
> +
> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
> +                       ctx_id = args.ctx_id;
> +               }
> +
> +               igt_assert(ctx_id);
> +               ctx->id = ctx_id;
> +
> +               if (flags & GLOBAL_BALANCE) {
> +                       ctx->static_vcs = context_vcs_rr;
> +                       context_vcs_rr ^= 1;
> +               } else {
> +                       ctx->static_vcs = ctx_vcs;
> +                       ctx_vcs ^= 1;
> +               }
> +
> +               __ctx_set_prio(ctx_id, wrk->prio);
> +
> +               /*
> +                * Do we need a separate context to satisfy this workloads which
> +                * both want to target specific engines and be balanced by i915?
> +                */
> +               if ((flags & I915) && ctx->wants_balance &&
> +                   ctx->targets_instance) {
> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> +                               .param.param = I915_CONTEXT_PARAM_VM,
> +                               .param.value = share_vm,
> +                       };
> +                       struct drm_i915_gem_context_create_ext args = {
> +                               .extensions = to_user_pointer(&ext),
> +                               .flags =
> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
> +                                   I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
> +                       };
> +
> +                       igt_assert(share_vm);
> +
> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
> +                                &args);
> +
> +                       igt_assert(args.ctx_id);
> +                       ctx_id = args.ctx_id;
> +                       wrk->ctx_list[i + 1].id = args.ctx_id;
> +
> +                       __ctx_set_prio(ctx_id, wrk->prio);
> +               }
> +
> +               if (ctx->wants_balance) {
> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
> +                               .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
> +                               .num_siblings = 2,
> +                               .engines = {
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 0 },
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 1 },
> +                               },
> +                       };
> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
> +                               .extensions = to_user_pointer(&load_balance),
> +                               .engines = {
> +                                       { .engine_class = I915_ENGINE_CLASS_INVALID,
> +                                         .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 0 },
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 1 },
> +                               },
> +                       };
> +
> +                       struct drm_i915_gem_context_param param = {
> +                               .ctx_id = ctx_id,
> +                               .param = I915_CONTEXT_PARAM_ENGINES,
> +                               .size = sizeof(set_engines),
> +                               .value = to_user_pointer(&set_engines),
> +                       };
> +
> +                       gem_context_set_param(fd, &param);
>                 }

if (share_vm)
	gem_vm_destroy(share_vm);

Just to drop the local handle as the context has acquired its own
reference.

Other than that, it does what it sets out to do: create a context with
choice of engines and load balancing amongst them.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
@ 2019-05-10 13:14     ` Chris Wilson
  0 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:14 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-08 13:10:42)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Support i915 virtual engine from gem_wsim (-b i915) and media-bench.pl
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> +       /*
> +        * Create and configure contexts.
> +        */
> +       for (i = 0; i < wrk->nr_ctxs; i += 2) {
> +               struct ctx *ctx = &wrk->ctx_list[i];
> +               uint32_t ctx_id, share_vm = 0;
>  
> -                       wrk->ctx_list[w->context].id = arg.ctx_id;
> +               if (ctx->id)
> +                       continue;
>  
> -                       if (flags & GLOBAL_BALANCE) {
> -                               wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
> -                               context_vcs_rr ^= 1;
> -                       } else {
> -                               wrk->ctx_list[w->context].static_vcs = ctx_vcs;
> -                               ctx_vcs ^= 1;
> -                       }
> +               if (flags & I915) {

VM sharing shouldn't be an i915-balancer-only option. For single jobs split
across multiple contexts, I would expect them to want to share a VM.

> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> +                               .param.param = I915_CONTEXT_PARAM_VM,
> +                       };
> +                       struct drm_i915_gem_context_create_ext args = { };
>  
> -                       if (wrk->prio) {
> +                       /* Find existing context to share ppgtt with. */
> +                       for (j = 0; j < wrk->nr_ctxs; j++) {
>                                 struct drm_i915_gem_context_param param = {
> -                                       .ctx_id = arg.ctx_id,
> -                                       .param = I915_CONTEXT_PARAM_PRIORITY,
> -                                       .value = wrk->prio,
> +                                       .param = I915_CONTEXT_PARAM_VM,
>                                 };
> -                               gem_context_set_param(fd, &param);
> +
> +                               if (!wrk->ctx_list[j].id)
> +                                       continue;
> +
> +                               param.ctx_id = wrk->ctx_list[j].id;
> +
> +                               gem_context_get_param(fd, &param);
> +                               igt_assert(param.value);
> +
> +                               share_vm = param.value;
> +
> +                               ext.param.value = share_vm;
> +                               args.flags =
> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
> +                               args.extensions = to_user_pointer(&ext);
> +                               break;
>                         }
> +
> +                       if (!ctx->targets_instance)
> +                               args.flags |=
> +                                    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
> +
> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
> +                                &args);
> +
> +                       ctx_id = args.ctx_id;
> +               } else {
> +                       struct drm_i915_gem_context_create args = {};
> +
> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
> +                       ctx_id = args.ctx_id;
> +               }
> +
> +               igt_assert(ctx_id);
> +               ctx->id = ctx_id;
> +
> +               if (flags & GLOBAL_BALANCE) {
> +                       ctx->static_vcs = context_vcs_rr;
> +                       context_vcs_rr ^= 1;
> +               } else {
> +                       ctx->static_vcs = ctx_vcs;
> +                       ctx_vcs ^= 1;
> +               }
> +
> +               __ctx_set_prio(ctx_id, wrk->prio);
> +
> +               /*
> +                * Do we need a separate context to satisfy workloads which
> +                * both want to target specific engines and be balanced by i915?
> +                */
> +               if ((flags & I915) && ctx->wants_balance &&
> +                   ctx->targets_instance) {
> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
> +                               .param.param = I915_CONTEXT_PARAM_VM,
> +                               .param.value = share_vm,
> +                       };
> +                       struct drm_i915_gem_context_create_ext args = {
> +                               .extensions = to_user_pointer(&ext),
> +                               .flags =
> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
> +                                   I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
> +                       };
> +
> +                       igt_assert(share_vm);
> +
> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
> +                                &args);
> +
> +                       igt_assert(args.ctx_id);
> +                       ctx_id = args.ctx_id;
> +                       wrk->ctx_list[i + 1].id = args.ctx_id;
> +
> +                       __ctx_set_prio(ctx_id, wrk->prio);
> +               }
> +
> +               if (ctx->wants_balance) {
> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
> +                               .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
> +                               .num_siblings = 2,
> +                               .engines = {
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 0 },
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 1 },
> +                               },
> +                       };
> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
> +                               .extensions = to_user_pointer(&load_balance),
> +                               .engines = {
> +                                       { .engine_class = I915_ENGINE_CLASS_INVALID,
> +                                         .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 0 },
> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
> +                                         .engine_instance = 1 },
> +                               },
> +                       };
> +
> +                       struct drm_i915_gem_context_param param = {
> +                               .ctx_id = ctx_id,
> +                               .param = I915_CONTEXT_PARAM_ENGINES,
> +                               .size = sizeof(set_engines),
> +                               .value = to_user_pointer(&set_engines),
> +                       };
> +
> +                       gem_context_set_param(fd, &param);
>                 }

if (share_vm)
	gem_vm_destroy(share_vm);

Just to drop the local handle as the context has acquired its own
reference.
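
The ownership rule behind this suggestion can be illustrated with a small
standalone model. It is only a sketch and does not use the real i915 uAPI:
struct vm, vm_create(), vm_get(), vm_put() and ctx_create_shared() are
hypothetical names invented here to mimic a reference-counted object that a
new context pins for itself, so the caller's handle can be dropped right away.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel VM object; not the i915 uAPI. */
struct vm {
	int refcount;
};

struct ctx {
	struct vm *vm;
};

static struct vm *vm_create(void)
{
	struct vm *vm = calloc(1, sizeof(*vm));

	if (vm)
		vm->refcount = 1; /* reference held by the creator's handle */
	return vm;
}

static struct vm *vm_get(struct vm *vm)
{
	vm->refcount++;
	return vm;
}

/* Drop one reference; returns the remaining count, freeing at zero. */
static int vm_put(struct vm *vm)
{
	int left;

	assert(vm->refcount > 0);
	left = --vm->refcount;
	if (!left)
		free(vm);
	return left;
}

/* The new context takes its own reference on the shared VM. */
static void ctx_create_shared(struct ctx *ctx, struct vm *share_vm)
{
	ctx->vm = vm_get(share_vm);
}

/* Returns the VM refcount left after the local handle is dropped. */
static int share_then_drop_handle(struct ctx *ctx)
{
	struct vm *share_vm = vm_create();

	if (!share_vm)
		return -1;

	ctx_create_shared(ctx, share_vm);

	/* Models "if (share_vm) gem_vm_destroy(share_vm);". */
	return vm_put(share_vm);
}
```

In this model the context keeps the VM alive after the local handle is gone,
and the VM is only freed once the context's own reference is dropped too.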

Other than that, it does what it sets out to do: create a context with
choice of engines and load balancing amongst them.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 06/21] gem_wsim: Use IGT uapi headers
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 13:15     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:15 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:43)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We are moving towards bumping the uAPI headers more often instead of using
> too many local struct/ioctl/param definitions since the latter are more
> challenging for rebase and maintenance.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 07/21] gem_wsim: Factor out common error handling
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 13:15     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:15 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:44)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> There is a repeated pattern with error handling which can be moved to a
> macro for better readability in the command parsing loop.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Bah, at the cost of including control-flow buried inside the macro.
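
To make the concern concrete, here is a minimal sketch of a check macro that
hides a return. CHECK_ARG() and parse_positive() are hypothetical names made
up for this example; they are not the definitions used in gem_wsim.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical check macro: when the condition is true it logs and
 * returns from the *calling* function. The early exit is invisible
 * at the call site, which is the readability cost being pointed out.
 */
#define CHECK_ARG(cond, ...) \
do { \
	if (cond) { \
		fprintf(stderr, __VA_ARGS__); \
		return -1; \
	} \
} while (0)

/* Parse a strictly positive decimal integer; returns -1 on any error. */
static int parse_positive(const char *str)
{
	char *end;
	long val;

	CHECK_ARG(!str || !*str, "Empty field!\n");

	val = strtol(str, &end, 10);
	CHECK_ARG(*end != '\0', "Trailing characters in '%s'!\n", str);
	CHECK_ARG(val <= 0 || val > 1000000, "Value out of range: %ld\n", val);

	/* Only reached when none of the checks above returned. */
	assert(val > 0);
	return (int)val;
}
```

A reader skimming parse_positive() sees three statements that look like plain
function calls, yet each one can leave the function.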
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH i-g-t 08/21] gem_wsim: More wsim_err
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 13:16     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:16 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:45)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A few more opportunities to compact the code by using the error logging
> helper.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

I had to double check that wsim_err() wasn't the magic cf macro.
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 09/21] gem_wsim: Submit fence support
  2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-10 13:18     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:46)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Add support for submit fences in a way similar to how normal input fences
> are handled. E.g.:
> 
>   1.RCS.500-1000.0.0
>   1.VCS1.3000.s-1.0
>   1.VCS2.3000.s-2.0

Looks like commands on a punch card. :-p

> Submit fences are signalled when the originating request enters the
> submission backend.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c  | 20 ++++++++++++++++----
>  benchmarks/wsim/README | 17 +++++++++++++++++
>  2 files changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index f1fcef5dcfba..5245692df6eb 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -87,6 +87,7 @@ enum w_type
>  struct deps
>  {
>         int nr;
> +       bool submit_fence;
>         int *list;
>  };
>  
> @@ -253,17 +254,23 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
>                    w->data_deps.list == w->fence_deps.list);
>  
>         while ((token = strtok_r(tstart, "/", &tctx)) != NULL) {
> +               bool submit_fence = false;
>                 char *str = token;
>                 struct deps *deps;
>                 int dep;
>  
>                 tstart = NULL;
>  
> -               if (strlen(token) > 1 && token[0] == 'f') {
> +               if (str[0] == '-' || (str[0] >= '0' && str[0] <= '9')) {
> +                       deps = &w->data_deps;
> +               } else {
> +                       if (str[0] == 's')
> +                               submit_fence = true;
> +                       else if (str[0] != 'f')
> +                               return -1;
> +
>                         deps = &w->fence_deps;
>                         str++;
> -               } else {
> -                       deps = &w->data_deps;
>                 }
>  
>                 dep = atoi(str);
> @@ -281,6 +288,7 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
>                                              sizeof(*deps->list) * deps->nr);
>                         igt_assert(deps->list);
>                         deps->list[deps->nr - 1] = dep;
> +                       deps->submit_fence = submit_fence;
>                 }
>         }
>  
> @@ -1921,7 +1929,11 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
>                 igt_assert(tgt >= 0 && tgt < w->idx);
>                 igt_assert(wrk->steps[tgt].emit_fence > 0);
>  
> -               w->eb.flags |= I915_EXEC_FENCE_IN;
> +               if (w->fence_deps.submit_fence)
> +                       w->eb.flags |= I915_EXEC_FENCE_SUBMIT;
> +               else
> +                       w->eb.flags |= I915_EXEC_FENCE_IN;
> +
>                 w->eb.rsvd2 = wrk->steps[tgt].emit_fence;

That looked too easy.
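
The prefix handling in the quoted parse_dependencies() hunk can be tried in
isolation with a sketch like the one below. enum dep_kind and classify_dep()
are hypothetical names invented for this example; only the prefix rules ('f'
for an input fence, 's' for a submit fence, a bare number for a data
dependency) are taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>

enum dep_kind {
	DEP_INVALID = -1,
	DEP_DATA,		/* "-1": data dependency */
	DEP_FENCE,		/* "f-1": input fence */
	DEP_SUBMIT_FENCE,	/* "s-1": submit fence */
};

/*
 * Classify one dependency token and store its relative step index
 * in *dep. Mirrors the prefix rules of the quoted hunk.
 */
static enum dep_kind classify_dep(const char *token, int *dep)
{
	enum dep_kind kind;

	assert(token && dep);

	if (token[0] == '-' || (token[0] >= '0' && token[0] <= '9')) {
		kind = DEP_DATA;
	} else if (token[0] == 's') {
		kind = DEP_SUBMIT_FENCE;
		token++;
	} else if (token[0] == 'f') {
		kind = DEP_FENCE;
		token++;
	} else {
		/* Unknown prefix, including an empty token. */
		return DEP_INVALID;
	}

	*dep = atoi(token);
	return kind;
}
```

With these rules the "s-1" field from the commit message example is a submit
fence on the previous step.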
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 10/21] gem_wsim: Extract str to engine lookup
  2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-10 13:20     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:20 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:47)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c | 34 +++++++++++++++++++++-------------
>  1 file changed, 21 insertions(+), 13 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index 5245692df6eb..f654decb24cc 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -318,6 +318,18 @@ wsim_err(const char *fmt, ...)
>         } \
>  }
>  
> +static int str_to_engine(const char *str)
> +{
> +       unsigned int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
> +               if (!strcasecmp(str, ring_str_map[i]))
> +                       return i;
> +       }
> +
> +       return -1;
> +}
> +
>  static struct workload *
>  parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>  {
> @@ -480,22 +492,18 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>                 }
>  
>                 if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
> -                       unsigned int old_valid = valid;
> -
>                         fstart = NULL;
>  
> -                       for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
> -                               if (!strcasecmp(field, ring_str_map[i])) {
> -                                       step.engine = i;
> -                                       if (step.engine == BCS)
> -                                               bcs_used = true;
> -                                       valid++;
> -                                       break;
> -                               }
> -                       }
> -
> -                       check_arg(old_valid == valid,
> +                       i = str_to_engine(field);
> +                       check_arg(i < 0,
>                                   "Invalid engine id at step %u!\n", nr_steps);
> +                       if (i >= 0)
> +                               valid++;

check_arg() returned already for all i < 0, no?
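
A standalone sketch of the pattern under discussion, to show why the second
test is dead code. The table contents and engine_field_ok() are assumptions
made for this example (the real ring_str_map lives in gem_wsim.c and may
differ), and an explicit early return stands in for check_arg().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <strings.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative table only; not copied from gem_wsim.c. */
static const char *engine_names[] = {
	"DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS",
};

/* Case-insensitive lookup; returns the table index or -1. */
static int str_to_engine(const char *str)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(engine_names); i++) {
		if (!strcasecmp(str, engine_names[i]))
			return (int)i;
	}

	return -1;
}

/* Returns 1 and bumps *valid for a known engine, -1 otherwise. */
static int engine_field_ok(const char *field, unsigned int *valid)
{
	/* Must be signed: an unsigned variable could never be < 0. */
	int engine;

	assert(field && valid);

	engine = str_to_engine(field);
	if (engine < 0)
		return -1; /* plays the role of check_arg() */

	/* engine >= 0 is guaranteed here, so no second test is needed. */
	(*valid)++;
	return 1;
}
```

Once the failure path has returned, an "if (i >= 0)" guard around the
increment cannot be false.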
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
  2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-10 13:23     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:42)
> @@ -841,7 +846,11 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
>         if (engine == VCS2 && (flags & VCS2REMAP))
>                 engine = BCS;
>  
> -       eb->flags = eb_engine_map[engine];
> +       if ((flags & I915) && engine == VCS) {
> +               eb->flags = 0;
> +       } else {
> +               eb->flags = eb_engine_map[engine];
> +       }

You drop these brackets in a later patch.
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 11/21] gem_wsim: Engine map support
  2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-10 13:26     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:48)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Support new i915 uAPI for configuring contexts with engine maps.
> 
> Please refer to the README file for more detailed explanation.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> +static int parse_engine_map(struct w_step *step, const char *_str)
> +{
> +       char *token, *tctx = NULL, *tstart = (char *)_str;
> +
> +       while ((token = strtok_r(tstart, "|", &tctx))) {
> +               enum intel_engine_id engine;
> +
> +               tstart = NULL;
> +
> +               if (!strcmp(token, "DEFAULT"))
> +                       return -1;
> +               else if (!strcmp(token, "VCS"))
> +                       return -1;
> +
> +               engine = str_to_engine(token);
> +               if ((int)engine < 0)
> +                       return -1;
> +
> +               if (engine != VCS1 && engine != VCS2)
> +                       return -1; /* TODO */
> +
> +               step->engine_map_count++;
> +               step->engine_map = realloc(step->engine_map,
> +                                          step->engine_map_count *
> +                                          sizeof(step->engine_map[0]));
> +               step->engine_map[step->engine_map_count - 1] = engine;


> +               if (ctx->engine_map) {
> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
> +                                                         ctx->engine_map_count + 1);
> +                       struct drm_i915_gem_context_param param = {
> +                               .ctx_id = ctx_id,
> +                               .param = I915_CONTEXT_PARAM_ENGINES,
> +                               .size = sizeof(set_engines),
> +                               .value = to_user_pointer(&set_engines),
> +                       };
> +
> +                       set_engines.extensions = 0;
> +
> +                       /* Reserve slot for virtual engine. */
> +                       set_engines.engines[0].engine_class =
> +                               I915_ENGINE_CLASS_INVALID;
> +                       set_engines.engines[0].engine_instance =
> +                               I915_ENGINE_CLASS_INVALID_NONE;
> +
> +                       for (j = 1; j <= ctx->engine_map_count; j++) {
> +                               set_engines.engines[j].engine_class =
> +                                       I915_ENGINE_CLASS_VIDEO; /* FIXME */
> +                               set_engines.engines[j].engine_instance =
> +                                       ctx->engine_map[j - 1] - VCS1; /* FIXME */
> +                       }

I would suggest the file format starts with class:instance specifiers.
Too much FIXME that I think will need a file format change.
-Chris
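
A minimal sketch of what parsing a class:instance specifier could look like, assuming a purely numeric token such as "2:1". The struct engine_ci type and the parse_class_instance() helper are hypothetical names introduced for illustration and are not part of gem_wsim; the actual spelling of the file format is left open in the review above.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a class:instance pair; not taken from gem_wsim. */
struct engine_ci {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/*
 * Parse a token of the assumed form "<class>:<instance>", both decimal.
 * Returns 0 on success and -1 on malformed or out-of-range input.
 */
static int parse_class_instance(const char *tok, struct engine_ci *out)
{
	char *end;
	unsigned long cls, inst;

	assert(tok && out);

	/* Reject signs and leading whitespace that strtoul() would accept. */
	if (!isdigit((unsigned char)*tok))
		return -1;

	errno = 0;
	cls = strtoul(tok, &end, 10);
	if (errno || *end != ':' || cls > UINT16_MAX)
		return -1;

	tok = end + 1;
	if (!isdigit((unsigned char)*tok))
		return -1;

	errno = 0;
	inst = strtoul(tok, &end, 10);
	if (errno || *end != '\0' || inst > UINT16_MAX)
		return -1;

	out->engine_class = (uint16_t)cls;
	out->engine_instance = (uint16_t)inst;
	return 0;
}
```

With a map stored this way, the "- VCS1" arithmetic flagged with FIXME in the quoted hunk would not be needed, since the parsed pair could be copied straight into the engines array.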

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 12/21] gem_wsim: Save some lines by changing to implicit NULL checking
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 13:28     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:28 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:49)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We can improve the parsing loop readability a bit more by avoiding some
> line breaks caused by explicit NULL checks.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 13/21] gem_wsim: Compact int command parsing with a macro
  2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-10 13:29     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:29 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:50)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Parsing an integer workload descriptor field is a common pattern which we
> can extract to a helper macro and by doing so further improve the
> readability of the main parsing loop.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c | 80 ++++++++++++++-----------------------------
>  1 file changed, 25 insertions(+), 55 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index 4dbfc3e922a9..c2e13d9939c2 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -370,6 +370,15 @@ static int parse_engine_map(struct w_step *step, const char *_str)
>         return 0;
>  }
>  
> +#define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
> +       if ((field = strtok_r(fstart, ".", &fctx))) { \
> +               tmp = atoi(field); \
> +               check_arg(_COND_, _ERR_, nr_steps); \
> +               step.type = _STEP_; \
> +               step._FIELD_ = tmp; \
> +               goto add_step; \
> +       } \

More hidden control flow :-p
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
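
For comparison, a minimal sketch of a function-style alternative that keeps the control flow at the call site. parse_int_field() is a hypothetical helper, not something in the patch; it uses strtol() rather than atoi() so that trailing garbage and overflow can be reported.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical function-style alternative to a parsing macro: convert one
 * decimal field to int and report failure through the return value, leaving
 * any jump (such as "goto add_step") visible in the caller.
 */
static int parse_int_field(const char *field, int *value)
{
	char *end;
	long tmp;

	assert(value);

	if (!field || *field == '\0')
		return -1;

	errno = 0;
	tmp = strtol(field, &end, 10);
	if (errno || *end != '\0' || tmp < INT_MIN || tmp > INT_MAX)
		return -1;

	*value = (int)tmp;
	return 0;
}
```

A caller would then test the return value, run its own range check, and issue the goto itself, at the cost of a few more lines per command than the macro.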

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 14/21] gem_wsim: Engine map load balance command
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 13:31     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:31 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A new workload command for enabling a load balanced context map (aka
> Virtual Engine). Example usage:
> 
>   B.1
> 
> This turns on load balancing for context one, assuming it has already been
> configured with an engine map. Only DEFAULT engine specifier can be used
> with load balanced engine maps.

Restriction makes sense for keeping linenoise^W file format simple.

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>                 if (ctx->engine_map) {
>                         I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>                                                           ctx->engine_map_count + 1);
> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
> +                                                                ctx->engine_map_count);
>                         struct drm_i915_gem_context_param param = {
>                                 .ctx_id = ctx_id,
>                                 .param = I915_CONTEXT_PARAM_ENGINES,
> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>                                 .value = to_user_pointer(&set_engines),
>                         };
>  
> -                       set_engines.extensions = 0;
> +                       if (ctx->wants_balance) {
> +                               set_engines.extensions =
> +                                       to_user_pointer(&load_balance);
> +
> +                               memset(&load_balance, 0, sizeof(load_balance));
> +                               load_balance.base.name =
> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
> +                               load_balance.num_siblings =
> +                                       ctx->engine_map_count;
> +
> +                               for (j = 0; j < ctx->engine_map_count; j++) {
> +                                       load_balance.engines[j].engine_class =
> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
> +                                       load_balance.engines[j].engine_instance =
> +                                               ctx->engine_map[j] - VCS1; /* FIXME */

Ok, more fallout from fixing ctx->engine_map[] first?

Otherwise looks fine.
-Chris
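
A minimal sketch of how both loops could shrink if the engine map held class:instance pairs, assuming a local stand-in type. struct engine_ci mirrors only the two 16-bit fields of the uAPI's struct i915_engine_class_instance and is defined here so the example builds without the DRM headers; fill_engines() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in with the same two fields as the uAPI class:instance pair. */
struct engine_ci {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/*
 * Copy "count" map entries into "dst", which has room for "dst_len" entries.
 * Returns the number of entries copied, or 0 if the destination is too small.
 */
static size_t fill_engines(struct engine_ci *dst, size_t dst_len,
			   const struct engine_ci *map, size_t count)
{
	size_t i;

	assert(dst && map);

	if (count > dst_len)
		return 0;

	for (i = 0; i < count; i++)
		dst[i] = map[i];

	return count;
}
```

Under that assumption the same helper could fill the engines array starting at slot 1 (slot 0 being reserved for the virtual engine) and the load balance siblings starting at slot 0, with no per-entry class or instance fixups.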

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 15/21] gem_wsim: Engine bond command
  2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-10 13:36     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:36 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:52)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Engine bonds are an i915 uAPI applicable to load balanced contexts with an
> engine map. They allow expressing rules of engine selection between two
> contexts when submissions are also tied with submit fences.
> 
> Please refer to the README for a more detailed description.

I would prefer not to have a hexadecimal mask in the file format? That's
harder than usual to read later on.

bond({master_class:master_instance}, {engine_class:engine_instance}),...
?
-Chris
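
A minimal sketch of parsing one master/engine pair without a hexadecimal mask, assuming the textual form "<master_class>:<master_instance>,<engine_class>:<engine_instance>" with decimal numbers. That exact spelling, struct bond_spec and parse_bond() are all assumptions made for illustration; the suggestion above leaves the final syntax open.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical containers for one bond; not taken from gem_wsim. */
struct engine_ci {
	unsigned int engine_class;
	unsigned int engine_instance;
};

struct bond_spec {
	struct engine_ci master;
	struct engine_ci engine;
};

/*
 * Parse "mc:mi,ec:ei". Returns 0 on success, -1 if a field is missing or
 * anything follows the last number. Values are not range-checked here.
 */
static int parse_bond(const char *str, struct bond_spec *out)
{
	int consumed = 0;

	assert(str && out);

	/* %n records how many characters were consumed; it is not counted
	 * in sscanf()'s return value. */
	if (sscanf(str, "%u:%u,%u:%u%n",
		   &out->master.engine_class, &out->master.engine_instance,
		   &out->engine.engine_class, &out->engine.engine_instance,
		   &consumed) != 4)
		return -1;

	if (str[consumed] != '\0')
		return -1;

	return 0;
}
```

A list of such pairs could then be split on a separator before calling the helper once per pair.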

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 16/21] gem_wsim: Some more example workloads
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 13:37     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:53)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A few additional workloads useful for experimenting with scheduling.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 17/21] gem_wsim: Infinite batch support
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-10 13:48     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:54)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> For simulating frame split workloads it is useful to express a batch which
> ends at the same time as the parallel submission on the respective bonded
> engine. For this we add support for infinite batch durations and the batch
> terminate command ('T'). Syntax looks like this:
> 
>   1.RCS.*.0.0
>   T.-1
> 
> First step starts an infinite batch, and second command terminates the
> infinite batch with the usual relative workload step addressing.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c                  | 119 +++++++++++++++++++------
>  benchmarks/wsim/README                 |   9 +-
>  benchmarks/wsim/frame-split-60fps.wsim |   6 +-
>  3 files changed, 102 insertions(+), 32 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index cc6f4a742c12..97821b723b02 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -86,6 +86,7 @@ enum w_type
>         ENGINE_MAP,
>         LOAD_BALANCE,
>         BOND,
> +       TERMINATE,
>  };
>  
>  struct deps
> @@ -113,6 +114,7 @@ struct w_step
>         unsigned int context;
>         unsigned int engine;
>         struct duration duration;
> +       bool unbound_duration;
>         struct deps data_deps;
>         struct deps fence_deps;
>         int emit_fence;
> @@ -143,7 +145,7 @@ struct w_step
>  
>         struct drm_i915_gem_execbuffer2 eb;
>         struct drm_i915_gem_exec_object2 *obj;
> -       struct drm_i915_gem_relocation_entry reloc[4];
> +       struct drm_i915_gem_relocation_entry reloc[5];
>         unsigned long bb_sz;
>         uint32_t bb_handle;
>         uint32_t *seqno_value;
> @@ -153,6 +155,7 @@ struct w_step
>         uint32_t *rt1_address;
>         uint32_t *latch_value;
>         uint32_t *latch_address;
> +       uint32_t *recursive_bb_start;
>  };
>  
>  DECLARE_EWMA(uint64_t, rt, 4, 2)
> @@ -491,6 +494,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>  
>                                 step.type = ENGINE_MAP;
>                                 goto add_step;
> +                       } else if (!strcmp(field, "T")) {
> +                               int_field(TERMINATE, target,
> +                                         tmp >= 0 || ((int)nr_steps + tmp) < 0,
> +                                         "Invalid terminate target at step %u!\n");
>                         } else if (!strcmp(field, "X")) {
>                                 unsigned int nr = 0;
>                                 while ((field = strtok_r(fstart, ".", &fctx))) {
> @@ -605,23 +612,28 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>  
>                         fstart = NULL;
>  
> -                       tmpl = strtol(field, &sep, 10);
> -                       check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
> -                                 tmpl == LONG_MAX,
> -                                 "Invalid duration at step %u!\n", nr_steps);
> -                       step.duration.min = tmpl;
> -
> -                       if (sep && *sep == '-') {
> -                               tmpl = strtol(sep + 1, NULL, 10);
> -                               check_arg(tmpl <= 0 ||
> -                                         tmpl <= step.duration.min ||
> -                                         tmpl == LONG_MIN ||
> +                       if (field[0] == '*') {
> +                               step.unbound_duration = true;
> +                       } else {
> +                               tmpl = strtol(field, &sep, 10);
> +                               check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
>                                           tmpl == LONG_MAX,
> -                                         "Invalid duration range at step %u!\n",
> +                                         "Invalid duration at step %u!\n",
>                                           nr_steps);
> -                               step.duration.max = tmpl;
> -                       } else {
> -                               step.duration.max = step.duration.min;
> +                               step.duration.min = tmpl;
> +
> +                               if (sep && *sep == '-') {
> +                                       tmpl = strtol(sep + 1, NULL, 10);
> +                                       check_arg(tmpl <= 0 ||
> +                                               tmpl <= step.duration.min ||
> +                                               tmpl == LONG_MIN ||
> +                                               tmpl == LONG_MAX,
> +                                               "Invalid duration range at step %u!\n",
> +                                               nr_steps);
> +                                       step.duration.max = tmpl;
> +                               } else {
> +                                       step.duration.max = step.duration.min;
> +                               }
>                         }
>  
>                         valid++;
> @@ -781,7 +793,7 @@ init_bb(struct w_step *w, unsigned int flags)
>         unsigned int i;
>         uint32_t *ptr;
>  
> -       if (!arb_period)
> +       if (w->unbound_duration || !arb_period)
>                 return;
>  
>         gem_set_domain(fd, w->bb_handle,
> @@ -801,6 +813,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         const uint32_t bbe = 0xa << 23;
>         unsigned long mmap_start, mmap_len;
>         unsigned long batch_start = w->bb_sz;
> +       unsigned int r = 0;
>         uint32_t *ptr, *cs;
>  
>         igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
> @@ -811,6 +824,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         if (flags & RT)
>                 batch_start -= 12 * sizeof(uint32_t);
>  
> +       if (w->unbound_duration)
> +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +
>         mmap_start = rounddown(batch_start, PAGE_SIZE);
>         mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
>  
> @@ -820,8 +836,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
>         cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
>  
> +       if (w->unbound_duration) {
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
> +               batch_start += 4 * sizeof(uint32_t);
> +
> +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
> +               w->recursive_bb_start = cs;
> +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
> +               *cs++ = 0;
> +               *cs++ = 0;

Hmm. Have we previously checked for gen >= 8?

So the preemption check interval is given by batch_start - mmap_start,
which is limited to a max of 64 bytes. That might be a bit excessive on
the frequency of doing MI_BB_START, certainly for gen7; gen8+ is a tad
more forgiving, i.e. it has more bw and doesn't starve the cpu as much.

> +       }
> +
>         if (flags & SEQNO) {
> -               w->reloc[0].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -833,7 +860,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         }
>  
>         if (flags & RT) {
> -               w->reloc[1].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -843,7 +870,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 w->rt0_value = cs;
>                 *cs++ = 0;
>  
> -               w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
> @@ -852,7 +879,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 *cs++ = 0;
>                 *cs++ = 0;
>  
> -               w->reloc[3].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -984,19 +1011,28 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
>                 }
>         }
>  
> -       w->bb_sz = get_bb_sz(w->duration.max);
> -       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
> +       if (w->unbound_duration)
> +               /* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +               w->bb_sz = max(64, get_bb_sz(w->preempt_us)) +
> +                          (1 + 3) * sizeof(uint32_t);
> +       else
> +               w->bb_sz = get_bb_sz(w->duration.max);
> +       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
>         init_bb(w, flags);
>         terminate_bb(w, flags);
>  
> -       if (flags & SEQNO) {
> +       if ((flags & SEQNO) || w->unbound_duration) {
>                 w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
> +               if (flags & SEQNO)
> +                       w->obj[j].relocation_count++;
>                 if (flags & RT)
> -                       w->obj[j].relocation_count = 4;
> -               else
> -                       w->obj[j].relocation_count = 1;
> +                       w->obj[j].relocation_count += 3;
> +               if (w->unbound_duration)
> +                       w->obj[j].relocation_count++;

Huh, I expected to see w->obj[j].relocation_count = r;
Already out of scope?
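
A minimal sketch of keeping the count in one place, assuming the emitting function can hand its counter back to the caller. The flag names and count_relocs() are hypothetical stand-ins; the order follows the r++ uses in the quoted terminate_bb() hunks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for gem_wsim's SEQNO and RT flags. */
#define SKETCH_SEQNO (1u << 0)
#define SKETCH_RT    (1u << 1)

/*
 * Count relocations in the order the quoted code consumes reloc[r++]:
 * one for the recursive batch start, one for the seqno write and three
 * more for the RT writes.
 */
static unsigned int count_relocs(unsigned int flags, bool unbound_duration)
{
	unsigned int r = 0;

	/* RT is only valid together with SEQNO, as the patch asserts. */
	assert(!(flags & SKETCH_RT) || (flags & SKETCH_SEQNO));

	if (unbound_duration)
		r++;
	if (flags & SKETCH_SEQNO)
		r++;
	if (flags & SKETCH_RT)
		r += 3;

	return r;
}
```

If terminate_bb() returned such a value, the caller could assign it to relocation_count directly instead of recomputing it from the flags.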

>                 for (i = 0; i < w->obj[j].relocation_count; i++)
>                         w->reloc[i].target_handle = 1;
> +               if (w->unbound_duration)
> +                       w->reloc[0].target_handle = j;

Ok, recursive BB_START.

>         }
>  
>         w->eb.buffers_ptr = to_user_pointer(w->obj);
> @@ -2036,6 +2072,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
>         }
>  }
>  
> +static void
> +update_bb_start(struct w_step *w)
> +{
> +       if (!w->unbound_duration)
> +               return;
> +
> +       gem_set_domain(fd, w->bb_handle,
> +                      I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);

Hmm. A scary sync point. Do you just want to be sure you have flushed
the previous user?

> +       *w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
> +}
> +
>  static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
>  {
>         if (target < 0)
> @@ -2171,9 +2219,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
>         if (flags & RT)
>                 update_bb_rt(w, engine, seqno);
>  
> +       update_bb_start(w);
> +
>         w->eb.batch_start_offset =
> +               w->unbound_duration ?
> +               0 :
>                 ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
> -                       2 * sizeof(uint32_t));
> +                     2 * sizeof(uint32_t));
>  
>         for (i = 0; i < w->fence_deps.nr; i++) {
>                 int tgt = w->idx + w->fence_deps.list[i];
> @@ -2313,6 +2365,17 @@ static void *run_workload(void *data)
>                                                                     w->priority;
>                                 }
>                                 continue;
> +                       } else if (w->type == TERMINATE) {
> +                               unsigned int t_idx = i + w->target;
> +
> +                               igt_assert(t_idx >= 0 && t_idx < i);
> +                               igt_assert(wrk->steps[t_idx].type == BATCH);
> +                               igt_assert(wrk->steps[t_idx].unbound_duration);
> +
> +                               *wrk->steps[t_idx].recursive_bb_start =
> +                                       MI_BATCH_BUFFER_END;
> +                               __sync_synchronize();
> +                               continue;
>                         } else if (w->type == PREEMPTION ||
>                                    w->type == ENGINE_MAP ||
>                                    w->type == LOAD_BALANCE ||
> diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
> index 6aec718bc812..c94d01018419 100644
> --- a/benchmarks/wsim/README
> +++ b/benchmarks/wsim/README
> @@ -2,11 +2,11 @@ Workload descriptor format
>  ==========================
>  
>  ctx.engine.duration_us.dependency.wait,...
> -<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
> +<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
>  B.<uint>
>  M.<uint>.<str>[|<str>]...
>  P|X.<uint>.<int>
> -d|p|s|t|q|a.<int>,...
> +d|p|s|t|q|a|T.<int>,...
>  b.<uint>.<uint>.<str>
>  f
>  
> @@ -30,6 +30,7 @@ Additional workload steps are also supported:
>   'b' - Set up engine bonds.
>   'M' - Set up engine map.
>   'P' - Context priority.
> + 'T' - Terminate an infinite batch.
>   'X' - Context preemption control.
>  
>  Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
> @@ -77,6 +78,10 @@ Example:
>  
>  I this case the last step has a data dependency on both first and second steps.
>  
> +Batch durations can also be specified as infinite by using the '*' in the
> +duration field. Such batches must be ended by the terminate command ('T')
> +otherwise they will cause a GPU hang to be reported.
> +
>  Sync (fd) fences
>  ----------------
>  
> diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
> index cfbfcd39be7d..ea89da3add48 100644
> --- a/benchmarks/wsim/frame-split-60fps.wsim
> +++ b/benchmarks/wsim/frame-split-60fps.wsim
> @@ -6,10 +6,12 @@ M.2.VCS2
>  B.2
>  b.2.1.VCS1
>  f
> -1.DEFAULT.4000-6000.f-1.0
> +1.DEFAULT.*.f-1.0
>  2.DEFAULT.4000-6000.s-1.0
>  a.-3
> -3.RCS.2000-4000.-3/-2.0
> +s.-2
> +T.-4
> +3.RCS.2000-4000.-5/-4.0
>  3.VECS.2000.-1.0
>  4.BCS.1000.-1.0
>  s.-2

Usecase looks reasonable.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 17/21] gem_wsim: Infinite batch support
@ 2019-05-10 13:48     ` Chris Wilson
  0 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-10 13:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-08 13:10:54)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> For simulating frame split workloads it is useful to express a batch which
> ends at the same time as the parallel submission on the respective bonded
> engine. For this we add support for infinite batch durations and the batch
> terminate command ('T'). Syntax looks like this:
> 
>   1.RCS.*.0.0
>   T.-1
> 
> First step starts an infinite batch, and second command terminates the
> infinite batch with the usual relative workload step addressing.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c                  | 119 +++++++++++++++++++------
>  benchmarks/wsim/README                 |   9 +-
>  benchmarks/wsim/frame-split-60fps.wsim |   6 +-
>  3 files changed, 102 insertions(+), 32 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index cc6f4a742c12..97821b723b02 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -86,6 +86,7 @@ enum w_type
>         ENGINE_MAP,
>         LOAD_BALANCE,
>         BOND,
> +       TERMINATE,
>  };
>  
>  struct deps
> @@ -113,6 +114,7 @@ struct w_step
>         unsigned int context;
>         unsigned int engine;
>         struct duration duration;
> +       bool unbound_duration;
>         struct deps data_deps;
>         struct deps fence_deps;
>         int emit_fence;
> @@ -143,7 +145,7 @@ struct w_step
>  
>         struct drm_i915_gem_execbuffer2 eb;
>         struct drm_i915_gem_exec_object2 *obj;
> -       struct drm_i915_gem_relocation_entry reloc[4];
> +       struct drm_i915_gem_relocation_entry reloc[5];
>         unsigned long bb_sz;
>         uint32_t bb_handle;
>         uint32_t *seqno_value;
> @@ -153,6 +155,7 @@ struct w_step
>         uint32_t *rt1_address;
>         uint32_t *latch_value;
>         uint32_t *latch_address;
> +       uint32_t *recursive_bb_start;
>  };
>  
>  DECLARE_EWMA(uint64_t, rt, 4, 2)
> @@ -491,6 +494,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>  
>                                 step.type = ENGINE_MAP;
>                                 goto add_step;
> +                       } else if (!strcmp(field, "T")) {
> +                               int_field(TERMINATE, target,
> +                                         tmp >= 0 || ((int)nr_steps + tmp) < 0,
> +                                         "Invalid terminate target at step %u!\n");
>                         } else if (!strcmp(field, "X")) {
>                                 unsigned int nr = 0;
>                                 while ((field = strtok_r(fstart, ".", &fctx))) {
> @@ -605,23 +612,28 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>  
>                         fstart = NULL;
>  
> -                       tmpl = strtol(field, &sep, 10);
> -                       check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
> -                                 tmpl == LONG_MAX,
> -                                 "Invalid duration at step %u!\n", nr_steps);
> -                       step.duration.min = tmpl;
> -
> -                       if (sep && *sep == '-') {
> -                               tmpl = strtol(sep + 1, NULL, 10);
> -                               check_arg(tmpl <= 0 ||
> -                                         tmpl <= step.duration.min ||
> -                                         tmpl == LONG_MIN ||
> +                       if (field[0] == '*') {
> +                               step.unbound_duration = true;
> +                       } else {
> +                               tmpl = strtol(field, &sep, 10);
> +                               check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
>                                           tmpl == LONG_MAX,
> -                                         "Invalid duration range at step %u!\n",
> +                                         "Invalid duration at step %u!\n",
>                                           nr_steps);
> -                               step.duration.max = tmpl;
> -                       } else {
> -                               step.duration.max = step.duration.min;
> +                               step.duration.min = tmpl;
> +
> +                               if (sep && *sep == '-') {
> +                                       tmpl = strtol(sep + 1, NULL, 10);
> +                                       check_arg(tmpl <= 0 ||
> +                                               tmpl <= step.duration.min ||
> +                                               tmpl == LONG_MIN ||
> +                                               tmpl == LONG_MAX,
> +                                               "Invalid duration range at step %u!\n",
> +                                               nr_steps);
> +                                       step.duration.max = tmpl;
> +                               } else {
> +                                       step.duration.max = step.duration.min;
> +                               }
>                         }
>  
>                         valid++;
> @@ -781,7 +793,7 @@ init_bb(struct w_step *w, unsigned int flags)
>         unsigned int i;
>         uint32_t *ptr;
>  
> -       if (!arb_period)
> +       if (w->unbound_duration || !arb_period)
>                 return;
>  
>         gem_set_domain(fd, w->bb_handle,
> @@ -801,6 +813,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         const uint32_t bbe = 0xa << 23;
>         unsigned long mmap_start, mmap_len;
>         unsigned long batch_start = w->bb_sz;
> +       unsigned int r = 0;
>         uint32_t *ptr, *cs;
>  
>         igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
> @@ -811,6 +824,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         if (flags & RT)
>                 batch_start -= 12 * sizeof(uint32_t);
>  
> +       if (w->unbound_duration)
> +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +
>         mmap_start = rounddown(batch_start, PAGE_SIZE);
>         mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
>  
> @@ -820,8 +836,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
>         cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
>  
> +       if (w->unbound_duration) {
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
> +               batch_start += 4 * sizeof(uint32_t);
> +
> +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
> +               w->recursive_bb_start = cs;
> +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
> +               *cs++ = 0;
> +               *cs++ = 0;

Hmm. Have we previously checked for gen >= 8?

So preemption check interval is given by batch_start - mmap_start.
Which is limited to a max of 64 bytes. That might be a bit excessive on
the frequency of doing MI_BB_START, certainly for gen7, gen8+ is a tad
more forgiving i.e. it has more bw and doesn't starve the cpu as much.
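
A rough sketch of the arithmetic, for the case without SEQNO or RT tails. The
calibration constant and sketch_get_bb_sz() are stand-ins I made up; the real
get_bb_sz() depends on the measured nop calibration. The 64 byte figure comes
from the max(64, ...) in alloc_step_batch() further down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical calibration: nops retired per microsecond. */
#define SKETCH_NOPS_PER_US 100ul

/* Stand-in for get_bb_sz(): bytes of nops for a given duration. */
static unsigned long sketch_get_bb_sz(unsigned long duration_us)
{
	return duration_us * SKETCH_NOPS_PER_US * sizeof(uint32_t);
}

struct loop_layout {
	unsigned long bb_sz;		/* nops plus the 4-dword tail */
	unsigned long tail_offset;	/* where MI_ARB_CHK is written */
	unsigned long reloc_offset;	/* address dword of the BB_START */
	unsigned long nops_per_loop;	/* nops executed between jumps */
};

static struct loop_layout infinite_batch_layout(unsigned long preempt_us)
{
	unsigned long nop_bytes = sketch_get_bb_sz(preempt_us);
	struct loop_layout l;

	if (nop_bytes < 64)
		nop_bytes = 64;		/* the max(64, ...) floor */

	l.bb_sz = nop_bytes + (1 + 3) * sizeof(uint32_t);
	l.tail_offset = l.bb_sz - 4 * sizeof(uint32_t);
	l.reloc_offset = l.tail_offset + 2 * sizeof(uint32_t);
	l.nops_per_loop = nop_bytes / sizeof(uint32_t);

	return l;
}
```

With preempt_us == 0 this gives an 80 byte batch with only 16 nops between
successive MI_BATCH_BUFFER_STARTs, which is the case I am worried about.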

> +       }
> +
>         if (flags & SEQNO) {
> -               w->reloc[0].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -833,7 +860,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         }
>  
>         if (flags & RT) {
> -               w->reloc[1].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -843,7 +870,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 w->rt0_value = cs;
>                 *cs++ = 0;
>  
> -               w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
> @@ -852,7 +879,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 *cs++ = 0;
>                 *cs++ = 0;
>  
> -               w->reloc[3].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -984,19 +1011,28 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
>                 }
>         }
>  
> -       w->bb_sz = get_bb_sz(w->duration.max);
> -       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
> +       if (w->unbound_duration)
> +               /* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +               w->bb_sz = max(64, get_bb_sz(w->preempt_us)) +
> +                          (1 + 3) * sizeof(uint32_t);
> +       else
> +               w->bb_sz = get_bb_sz(w->duration.max);
> +       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
>         init_bb(w, flags);
>         terminate_bb(w, flags);
>  
> -       if (flags & SEQNO) {
> +       if ((flags & SEQNO) || w->unbound_duration) {
>                 w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
> +               if (flags & SEQNO)
> +                       w->obj[j].relocation_count++;
>                 if (flags & RT)
> -                       w->obj[j].relocation_count = 4;
> -               else
> -                       w->obj[j].relocation_count = 1;
> +                       w->obj[j].relocation_count += 3;
> +               if (w->unbound_duration)
> +                       w->obj[j].relocation_count++;

Huh, I expected to see w->obj[j].relocation_count = r;
Already out of scope?

>                 for (i = 0; i < w->obj[j].relocation_count; i++)
>                         w->reloc[i].target_handle = 1;
> +               if (w->unbound_duration)
> +                       w->reloc[0].target_handle = j;

Ok, recursive BB_START.
>         }
>  
>         w->eb.buffers_ptr = to_user_pointer(w->obj);
> @@ -2036,6 +2072,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
>         }
>  }
>  
> +static void
> +update_bb_start(struct w_step *w)
> +{
> +       if (!w->unbound_duration)
> +               return;
> +
> +       gem_set_domain(fd, w->bb_handle,
> +                      I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);

Hmm. A scary sync point. Do you just want to be sure you have flushed
the previous user?

> +       *w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
> +}
> +
>  static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
>  {
>         if (target < 0)
> @@ -2171,9 +2219,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
>         if (flags & RT)
>                 update_bb_rt(w, engine, seqno);
>  
> +       update_bb_start(w);
> +
>         w->eb.batch_start_offset =
> +               w->unbound_duration ?
> +               0 :
>                 ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
> -                       2 * sizeof(uint32_t));
> +                     2 * sizeof(uint32_t));
>  
>         for (i = 0; i < w->fence_deps.nr; i++) {
>                 int tgt = w->idx + w->fence_deps.list[i];
> @@ -2313,6 +2365,17 @@ static void *run_workload(void *data)
>                                                                     w->priority;
>                                 }
>                                 continue;
> +                       } else if (w->type == TERMINATE) {
> +                               unsigned int t_idx = i + w->target;
> +
> +                               igt_assert(t_idx >= 0 && t_idx < i);
> +                               igt_assert(wrk->steps[t_idx].type == BATCH);
> +                               igt_assert(wrk->steps[t_idx].unbound_duration);
> +
> +                               *wrk->steps[t_idx].recursive_bb_start =
> +                                       MI_BATCH_BUFFER_END;
> +                               __sync_synchronize();
> +                               continue;
>                         } else if (w->type == PREEMPTION ||
>                                    w->type == ENGINE_MAP ||
>                                    w->type == LOAD_BALANCE ||
> diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
> index 6aec718bc812..c94d01018419 100644
> --- a/benchmarks/wsim/README
> +++ b/benchmarks/wsim/README
> @@ -2,11 +2,11 @@ Workload descriptor format
>  ==========================
>  
>  ctx.engine.duration_us.dependency.wait,...
> -<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
> +<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
>  B.<uint>
>  M.<uint>.<str>[|<str>]...
>  P|X.<uint>.<int>
> -d|p|s|t|q|a.<int>,...
> +d|p|s|t|q|a|T.<int>,...
>  b.<uint>.<uint>.<str>
>  f
>  
> @@ -30,6 +30,7 @@ Additional workload steps are also supported:
>   'b' - Set up engine bonds.
>   'M' - Set up engine map.
>   'P' - Context priority.
> + 'T' - Terminate an infinite batch.
>   'X' - Context preemption control.
>  
>  Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
> @@ -77,6 +78,10 @@ Example:
>  
>  I this case the last step has a data dependency on both first and second steps.
>  
> +Batch durations can also be specified as infinite by using the '*' in the
> +duration field. Such batches must be ended by the terminate command ('T')
> +otherwise they will cause a GPU hang to be reported.
> +
>  Sync (fd) fences
>  ----------------
>  
> diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
> index cfbfcd39be7d..ea89da3add48 100644
> --- a/benchmarks/wsim/frame-split-60fps.wsim
> +++ b/benchmarks/wsim/frame-split-60fps.wsim
> @@ -6,10 +6,12 @@ M.2.VCS2
>  B.2
>  b.2.1.VCS1
>  f
> -1.DEFAULT.4000-6000.f-1.0
> +1.DEFAULT.*.f-1.0
>  2.DEFAULT.4000-6000.s-1.0
>  a.-3
> -3.RCS.2000-4000.-3/-2.0
> +s.-2
> +T.-4
> +3.RCS.2000-4000.-5/-4.0
>  3.VECS.2000.-1.0
>  4.BCS.1000.-1.0
>  s.-2

Usecase looks reasonable.
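
As a sanity check of the relative addressing, a small sketch that resolves
'T.-4' in the updated frame-split workload. The struct and enum are simplified
stand-ins for struct w_step, and the index is kept signed so the lower bound
check is meaningful (in the patch t_idx is unsigned, so 't_idx >= 0' is always
true and only 't_idx < i' does any work).

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_type { SK_BATCH, SK_SYNC, SK_TERMINATE, SK_OTHER };

struct sketch_step {
	enum sketch_type type;
	bool unbound_duration;
	int target;		/* relative (negative) step offset */
};

/*
 * Resolve the step a 'T' command at index i refers to. Returns the absolute
 * index, or -1 if the offset does not land on an earlier infinite batch.
 */
static int resolve_terminate(const struct sketch_step *steps, int i)
{
	int t_idx = i + steps[i].target;

	if (steps[i].type != SK_TERMINATE || steps[i].target >= 0)
		return -1;
	if (t_idx < 0)
		return -1;
	if (steps[t_idx].type != SK_BATCH || !steps[t_idx].unbound_duration)
		return -1;

	return t_idx;
}

/* The steps following the 'f' line in frame-split-60fps.wsim. */
static int frame_split_example(void)
{
	const struct sketch_step steps[] = {
		{ SK_BATCH, true, 0 },		/* 1.DEFAULT.*.f-1.0 */
		{ SK_BATCH, false, 0 },		/* 2.DEFAULT.4000-6000.s-1.0 */
		{ SK_OTHER, false, -3 },	/* a.-3 */
		{ SK_SYNC, false, -2 },		/* s.-2 */
		{ SK_TERMINATE, false, -4 },	/* T.-4 */
	};

	return resolve_terminate(steps, 4);
}
```

So the terminate lands on the infinite batch, after the sync on the bonded
submission, which matches the intent described in the commit message.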

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 01/21] scripts/trace.pl: Fix after intel_engine_notify removal
  2019-05-10 12:33     ` Chris Wilson
@ 2019-05-13 12:16       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 12:16 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 13:33, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:38)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> After the removal of engine global seqnos and the corresponding
>> intel_engine_notify tracepoints the script needs to be adjusted to cope
>> with the new state of things.
>>
>> To keep working it switches over using the dma_fence:dma_fence_signaled:
>> tracepoint and keeps one extra internal map to connect the ctx-seqno pairs
>> with engines.
>>
>> It also needs to key the completion events on the full engine/ctx/seqno
>> tokens, and adjust correspondingly the timeline sorting logic.
>>
>> v2:
>>   * Do not use late notifications (received after context complete) when
>>     splitting up coalesced requests. They are now much more likely and can
>>     not be used.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   scripts/trace.pl | 82 ++++++++++++++++++++++++------------------------
>>   1 file changed, 41 insertions(+), 41 deletions(-)
>>
>> diff --git a/scripts/trace.pl b/scripts/trace.pl
>> index 18f9f3b18396..95dc3a645e8e 100755
>> --- a/scripts/trace.pl
>> +++ b/scripts/trace.pl
>> @@ -27,7 +27,8 @@ use warnings;
>>   use 5.010;
>>   
>>   my $gid = 0;
>> -my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
>> +my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
>> +    %ctxtimelines, %ctxengines);
>>   my @freqs;
> 
> So what's ctxengines? Or rings for that matter?

rings go back to the beginnings of the tool when I think the
visualization library needed a unique integer value for every timeline
(so engine). And there is a ringmap from this id back to our engine
name. Perhaps this would be clearer if reversed, but I am not sure how
much churn that would be without actually doing it. Renaming rings to
engines would also make sense.

> I take it ctxengines is really the last engine which we saw this context
> execute on?

Correct.

I guess there is a problem if dma_fence_signaled is delayed past another 
request_in. Hm but I also have a die if engine is different.. that 
cannot be right, but why it didn't fail.. I need to double check this.
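
To illustrate the problem case, a small C sketch of the ctx to engine lookup
(the script itself is Perl; this only models the logic). The table size, the
engine labels and the function names are all made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_CTX 16

/* Last engine each context was seen on; "" means never seen. */
static char ctx_engine[SKETCH_MAX_CTX][8];

/* Called for request_in: remember where the context is running. */
static void note_request_in(unsigned int ctx, const char *ring)
{
	assert(ctx < SKETCH_MAX_CTX);
	snprintf(ctx_engine[ctx], sizeof(ctx_engine[ctx]), "%s", ring);
}

/*
 * Called for dma_fence_signaled, which carries only context and seqno.
 * Builds the same "ring/ctx/seqno" token as db_key(), using the last
 * recorded engine. Returns 0 on success, or -1 if the context has not been
 * seen yet (for example when the capture started mid-workload).
 */
static int notify_key(unsigned int ctx, unsigned int seqno,
		      char *buf, size_t len)
{
	if (ctx >= SKETCH_MAX_CTX || !ctx_engine[ctx][0])
		return -1;

	snprintf(buf, len, "%s/%u/%u", ctx_engine[ctx], ctx, seqno);
	return 0;
}
```

If request_in for seqno 2 arrives on another engine before the signal for
seqno 1 is seen, the notify for seqno 1 gets keyed on the wrong engine and
never matches its request.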

> 
>>   
>>   my $max_items = 3000;
>> @@ -66,7 +67,7 @@ Notes:
>>                                 i915:i915_request_submit, \
>>                                 i915:i915_request_in, \
>>                                 i915:i915_request_out, \
>> -                              i915:intel_engine_notify, \
>> +                              dma_fence:dma_fence_signaled, \
>>                                 i915:i915_request_wait_begin, \
>>                                 i915:i915_request_wait_end \
>>                                 [command-to-be-profiled]
>> @@ -161,7 +162,7 @@ sub arg_trace
>>                         'i915:i915_request_submit',
>>                         'i915:i915_request_in',
>>                         'i915:i915_request_out',
>> -                      'i915:intel_engine_notify',
>> +                      'dma_fence:dma_fence_signaled',
>>                         'i915:i915_request_wait_begin',
>>                         'i915:i915_request_wait_end' );
>>   
>> @@ -312,13 +313,6 @@ sub db_key
>>          return $ring . '/' . $ctx . '/' . $seqno;
>>   }
>>   
>> -sub global_key
>> -{
>> -       my ($ring, $seqno) = @_;
>> -
>> -       return $ring . '/' . $seqno;
>> -}
>> -
>>   sub sanitize_ctx
>>   {
>>          my ($ctx, $ring) = @_;
>> @@ -419,6 +413,8 @@ while (<>) {
>>                  $req{'ring'} = $ring;
>>                  $req{'seqno'} = $seqno;
>>                  $req{'ctx'} = $ctx;
>> +               die if exists $ctxengines{$ctx} and $ctxengines{$ctx} ne $ring;
>> +               $ctxengines{$ctx} = $ring;
>>                  $ctxtimelines{$ctx . '/' . $ring} = 1;
>>                  $req{'name'} = $ctx . '/' . $seqno;
>>                  $req{'global'} = $tp{'global'};
>> @@ -429,16 +425,29 @@ while (<>) {
>>                  $ringmap{$rings{$ring}} = $ring;
>>                  $db{$key} = \%req;
>>          } elsif ($tp_name eq 'i915:i915_request_out:') {
>> -               my $gkey = global_key($ring, $tp{'global'});
>> +               my $gkey;
>> +
> 
> # Must be paired with a previous i915_request_in
>> +               die unless exists $ctxengines{$ctx};
> 
> I'd suggest next unless, because there's always a chance the capture is
> started part way through someone's workload.

That would need much more work.

> 
>> +               $gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
>> +
>> +               if ($tp{'completed?'}) {
>> +                       die unless exists $db{$key};
>> +                       die unless exists $db{$key}->{'start'};
>> +                       die if exists $db{$key}->{'end'};
>> +
>> +                       $db{$key}->{'end'} = $time;
>> +                       $db{$key}->{'notify'} = $notify{$gkey}
>> +                                               if exists $notify{$gkey};
> 
> Hmm. With preempt-to-busy, a request can complete when we are no longer
> tracking it (it completes before we preempt it).
> 
> They will still get the schedule-out tracepoint, but marked as
> incomplete, and there will be a signaled tp later before we try and
> resubmit.

This sounds like the request would disappear from the script's view.

> 
>> +               } else {
>> +                       delete $db{$key};
>> +               }
>> +       } elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
>> +               my $gkey;
>>   
>> -               die unless exists $db{$key};
>> -               die unless exists $db{$key}->{'start'};
>> -               die if exists $db{$key}->{'end'};
>> +               die unless exists $ctxengines{$tp{'context'}};
>>   
>> -               $db{$key}->{'end'} = $time;
>> -               $db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey};
>> -       } elsif ($tp_name eq 'i915:intel_engine_notify:') {
>> -               my $gkey = global_key($ring, $seqno);
>> +               $gkey = db_key($ctxengines{$tp{'context'}}, $tp{'context'}, $tp{'seqno'});
>>   
>>                  $notify{$gkey} = $time unless exists $notify{$gkey};
>>          } elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
>> @@ -452,7 +461,7 @@ while (<>) {
>>   # find the largest seqno to be used for timeline sorting purposes.
>>   my $max_seqno = 0;
>>   foreach my $key (keys %db) {
>> -       my $gkey = global_key($db{$key}->{'ring'}, $db{$key}->{'global'});
>> +       my $gkey = db_key($db{$key}->{'ring'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
>>   
>>          die unless exists $db{$key}->{'start'};
>>   
>> @@ -478,14 +487,13 @@ my $key_count = scalar(keys %db);
>>   
>>   my %engine_timelines;
>>   
>> -sub sortEngine {
>> -       my $as = $db{$a}->{'global'};
>> -       my $bs = $db{$b}->{'global'};
>> +sub sortStart {
>> +       my $as = $db{$a}->{'start'};
>> +       my $bs = $db{$b}->{'start'};
>>          my $val;
>>   
>>          $val = $as <=> $bs;
>> -
>> -       die if $val == 0;
>> +       $val = $a cmp $b if $val == 0;
>>   
>>          return $val;
>>   }
>> @@ -497,9 +505,7 @@ sub get_engine_timeline {
>>          return $engine_timelines{$ring} if exists $engine_timelines{$ring};
>>   
>>          @timeline = grep { $db{$_}->{'ring'} eq $ring } keys %db;
>> -       # FIXME seqno restart
>> -       @timeline = sort sortEngine @timeline;
>> -
>> +       @timeline = sort sortStart @timeline;
>>          $engine_timelines{$ring} = \@timeline;
>>   
>>          return \@timeline;
>> @@ -561,20 +567,10 @@ foreach my $gid (sort keys %rings) {
>>                          $db{$key}->{'no-notify'} = 1;
>>                  }
>>                  $db{$key}->{'end'} = $end;
>> +               $db{$key}->{'notify'} = $end if $db{$key}->{'notify'} > $end;
>>          }
>>   }
>>   
>> -sub sortStart {
>> -       my $as = $db{$a}->{'start'};
>> -       my $bs = $db{$b}->{'start'};
>> -       my $val;
>> -
>> -       $val = $as <=> $bs;
>> -       $val = $a cmp $b if $val == 0;
>> -
>> -       return $val;
>> -}
>> -
>>   my $re_sort = 1;
>>   my @sorted_keys;
>>   
>> @@ -670,9 +666,13 @@ if ($correct_durations) {
>>                          next unless exists $db{$key}->{'no-end'};
>>                          last if $pos == $#{$timeline};
>>   
>> -                       # Shift following request to start after the current one
>> +                       # Shift following request to start after the current
>> +                       # one, but only if that wouldn't make it zero duration,
>> +                       # which would indicate notify arrived after context
>> +                       # complete.
>>                          $next_key = ${$timeline}[$pos + 1];
>> -                       if (exists $db{$key}->{'notify'}) {
>> +                       if (exists $db{$key}->{'notify'} and
>> +                           $db{$key}->{'notify'} < $db{$key}->{'end'}) {
>>                                  $db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
>>                                  $db{$next_key}->{'start'} = $db{$key}->{'notify'};
>>                                  $re_sort = 1;
>> @@ -750,9 +750,9 @@ foreach my $gid (sort keys %rings) {
>>          # Extract all GPU busy intervals and sort them.
>>          foreach my $key (@sorted_keys) {
>>                  next unless $db{$key}->{'ring'} eq $ring;
>> +               die if $db{$key}->{'start'} > $db{$key}->{'end'};
> 
> Heh, we're out of luck if we want to trace across seqno wraparound.

Yeah, that's another missing thing.
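
For what it is worth, one common way to make seqno ordering survive a wrap is
the signed-difference comparison. This is not something trace.pl does today,
just a sketch that assumes 32-bit seqnos and uses a helper name of my own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if seqno a is at or after seqno b, tolerating 32-bit wraparound. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}
```

It only holds while the two values are less than 2^31 apart, which should be
fine within a single trace.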

> 
> It makes enough sense,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks. Overall the script could use a cleanup, so I'll try to find some
time for it when this settles.

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 01/21] scripts/trace.pl: Fix after intel_engine_notify removal
@ 2019-05-13 12:16       ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 12:16 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin


On 10/05/2019 13:33, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:38)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> After the removal of engine global seqnos and the corresponding
>> intel_engine_notify tracepoints the script needs to be adjusted to cope
>> with the new state of things.
>>
>> To keep working it switches over using the dma_fence:dma_fence_signaled:
>> tracepoint and keeps one extra internal map to connect the ctx-seqno pairs
>> with engines.
>>
>> It also needs to key the completion events on the full engine/ctx/seqno
>> tokens, and adjust correspondingly the timeline sorting logic.
>>
>> v2:
>>   * Do not use late notifications (received after context complete) when
>>     splitting up coalesced requests. They are now much more likely and can
>>     not be used.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   scripts/trace.pl | 82 ++++++++++++++++++++++++------------------------
>>   1 file changed, 41 insertions(+), 41 deletions(-)
>>
>> diff --git a/scripts/trace.pl b/scripts/trace.pl
>> index 18f9f3b18396..95dc3a645e8e 100755
>> --- a/scripts/trace.pl
>> +++ b/scripts/trace.pl
>> @@ -27,7 +27,8 @@ use warnings;
>>   use 5.010;
>>   
>>   my $gid = 0;
>> -my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
>> +my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
>> +    %ctxtimelines, %ctxengines);
>>   my @freqs;
> 
> So what's ctxengines? Or rings for that matter?

rings go back to the beginnings of the tool when I think the
visualization library needed a unique integer value for every timeline
(so engine). And there is a ringmap from this id back to our engine
name. Perhaps this would be clearer if reversed, but I am not sure how
much churn that would be without actually doing it. Renaming rings to
engines would also make sense.

> I take it ctxengines is really the last engine which we saw this context
> execute on?

Correct.

I guess there is a problem if dma_fence_signaled is delayed past another 
request_in. Hm but I also have a die if engine is different.. that 
cannot be right, but why it didn't fail.. I need to double check this.

> 
>>   
>>   my $max_items = 3000;
>> @@ -66,7 +67,7 @@ Notes:
>>                                 i915:i915_request_submit, \
>>                                 i915:i915_request_in, \
>>                                 i915:i915_request_out, \
>> -                              i915:intel_engine_notify, \
>> +                              dma_fence:dma_fence_signaled, \
>>                                 i915:i915_request_wait_begin, \
>>                                 i915:i915_request_wait_end \
>>                                 [command-to-be-profiled]
>> @@ -161,7 +162,7 @@ sub arg_trace
>>                         'i915:i915_request_submit',
>>                         'i915:i915_request_in',
>>                         'i915:i915_request_out',
>> -                      'i915:intel_engine_notify',
>> +                      'dma_fence:dma_fence_signaled',
>>                         'i915:i915_request_wait_begin',
>>                         'i915:i915_request_wait_end' );
>>   
>> @@ -312,13 +313,6 @@ sub db_key
>>          return $ring . '/' . $ctx . '/' . $seqno;
>>   }
>>   
>> -sub global_key
>> -{
>> -       my ($ring, $seqno) = @_;
>> -
>> -       return $ring . '/' . $seqno;
>> -}
>> -
>>   sub sanitize_ctx
>>   {
>>          my ($ctx, $ring) = @_;
>> @@ -419,6 +413,8 @@ while (<>) {
>>                  $req{'ring'} = $ring;
>>                  $req{'seqno'} = $seqno;
>>                  $req{'ctx'} = $ctx;
>> +               die if exists $ctxengines{$ctx} and $ctxengines{$ctx} ne $ring;
>> +               $ctxengines{$ctx} = $ring;
>>                  $ctxtimelines{$ctx . '/' . $ring} = 1;
>>                  $req{'name'} = $ctx . '/' . $seqno;
>>                  $req{'global'} = $tp{'global'};
>> @@ -429,16 +425,29 @@ while (<>) {
>>                  $ringmap{$rings{$ring}} = $ring;
>>                  $db{$key} = \%req;
>>          } elsif ($tp_name eq 'i915:i915_request_out:') {
>> -               my $gkey = global_key($ring, $tp{'global'});
>> +               my $gkey;
>> +
> 
> # Must be paired with a previous i915_request_in
>> +               die unless exists $ctxengines{$ctx};
> 
> I'd suggest next unless, because there's always a chance the capture is
> started part way through someone's workload.

That would need much more work.

> 
>> +               $gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
>> +
>> +               if ($tp{'completed?'}) {
>> +                       die unless exists $db{$key};
>> +                       die unless exists $db{$key}->{'start'};
>> +                       die if exists $db{$key}->{'end'};
>> +
>> +                       $db{$key}->{'end'} = $time;
>> +                       $db{$key}->{'notify'} = $notify{$gkey}
>> +                                               if exists $notify{$gkey};
> 
> Hmm. With preempt-to-busy, a request can complete when we are no longer
> tracking it (it completes before we preempt it).
> 
> They will still get the schedule-out tracepoint, but marked as
> incomplete, and there will be a signaled tp later before we try and
> resubmit.

This sounds like the request would disappear from the script's view.

> 
>> +               } else {
>> +                       delete $db{$key};
>> +               }
>> +       } elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
>> +               my $gkey;
>>   
>> -               die unless exists $db{$key};
>> -               die unless exists $db{$key}->{'start'};
>> -               die if exists $db{$key}->{'end'};
>> +               die unless exists $ctxengines{$tp{'context'}};
>>   
>> -               $db{$key}->{'end'} = $time;
>> -               $db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey};
>> -       } elsif ($tp_name eq 'i915:intel_engine_notify:') {
>> -               my $gkey = global_key($ring, $seqno);
>> +               $gkey = db_key($ctxengines{$tp{'context'}}, $tp{'context'}, $tp{'seqno'});
>>   
>>                  $notify{$gkey} = $time unless exists $notify{$gkey};
>>          } elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
>> @@ -452,7 +461,7 @@ while (<>) {
>>   # find the largest seqno to be used for timeline sorting purposes.
>>   my $max_seqno = 0;
>>   foreach my $key (keys %db) {
>> -       my $gkey = global_key($db{$key}->{'ring'}, $db{$key}->{'global'});
>> +       my $gkey = db_key($db{$key}->{'ring'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
>>   
>>          die unless exists $db{$key}->{'start'};
>>   
>> @@ -478,14 +487,13 @@ my $key_count = scalar(keys %db);
>>   
>>   my %engine_timelines;
>>   
>> -sub sortEngine {
>> -       my $as = $db{$a}->{'global'};
>> -       my $bs = $db{$b}->{'global'};
>> +sub sortStart {
>> +       my $as = $db{$a}->{'start'};
>> +       my $bs = $db{$b}->{'start'};
>>          my $val;
>>   
>>          $val = $as <=> $bs;
>> -
>> -       die if $val == 0;
>> +       $val = $a cmp $b if $val == 0;
>>   
>>          return $val;
>>   }
>> @@ -497,9 +505,7 @@ sub get_engine_timeline {
>>          return $engine_timelines{$ring} if exists $engine_timelines{$ring};
>>   
>>          @timeline = grep { $db{$_}->{'ring'} eq $ring } keys %db;
>> -       # FIXME seqno restart
>> -       @timeline = sort sortEngine @timeline;
>> -
>> +       @timeline = sort sortStart @timeline;
>>          $engine_timelines{$ring} = \@timeline;
>>   
>>          return \@timeline;
>> @@ -561,20 +567,10 @@ foreach my $gid (sort keys %rings) {
>>                          $db{$key}->{'no-notify'} = 1;
>>                  }
>>                  $db{$key}->{'end'} = $end;
>> +               $db{$key}->{'notify'} = $end if $db{$key}->{'notify'} > $end;
>>          }
>>   }
>>   
>> -sub sortStart {
>> -       my $as = $db{$a}->{'start'};
>> -       my $bs = $db{$b}->{'start'};
>> -       my $val;
>> -
>> -       $val = $as <=> $bs;
>> -       $val = $a cmp $b if $val == 0;
>> -
>> -       return $val;
>> -}
>> -
>>   my $re_sort = 1;
>>   my @sorted_keys;
>>   
>> @@ -670,9 +666,13 @@ if ($correct_durations) {
>>                          next unless exists $db{$key}->{'no-end'};
>>                          last if $pos == $#{$timeline};
>>   
>> -                       # Shift following request to start after the current one
>> +                       # Shift following request to start after the current
>> +                       # one, but only if that wouldn't make it zero duration,
>> +                       # which would indicate notify arrived after context
>> +                       # complete.
>>                          $next_key = ${$timeline}[$pos + 1];
>> -                       if (exists $db{$key}->{'notify'}) {
>> +                       if (exists $db{$key}->{'notify'} and
>> +                           $db{$key}->{'notify'} < $db{$key}->{'end'}) {
>>                                  $db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
>>                                  $db{$next_key}->{'start'} = $db{$key}->{'notify'};
>>                                  $re_sort = 1;
>> @@ -750,9 +750,9 @@ foreach my $gid (sort keys %rings) {
>>          # Extract all GPU busy intervals and sort them.
>>          foreach my $key (@sorted_keys) {
>>                  next unless $db{$key}->{'ring'} eq $ring;
>> +               die if $db{$key}->{'start'} > $db{$key}->{'end'};
> 
> Heh, we're out of luck if we want to trace across seqno wraparound.

Yeah, that's another missing thing.

> 
> It makes enough sense,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks. Overall the script could use a cleanup, so I'll try to find some 
time for it when this settles.

Regards,

Tvrtko


_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH i-g-t 03/21] trace.pl: Virtual engine support
  2019-05-10 12:52     ` [igt-dev] " Chris Wilson
@ 2019-05-13 12:30       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 12:30 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 13:52, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:40)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Add virtual/queue timelines to both stdout and HTML output.
>>
>> A new timeline is created for each queue/virtual engine to display
>> associated requests in queued and runnable states. Once requests are
>> submitted to a real engine for executing they show up on the physical
>> engine timeline.
> 
> How does it cope with preemption events that shift the request onto
> another engine?

Preemption handling works in a way that it is supposed to forget the 
request ever existed before the final submit and complete. (It will 
expand the timeline by showing the runnable state all the way until the 
final request_in and shrink the execution boxes to the time between the 
final request_in and request_out.) It's yet another weakness, yeah. 
Perhaps I was harbouring hopes that a more capable tool would replace 
trace.pl. Adding support to show preemption properly certainly sounds 
like a lot of work.
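
As a rough illustration of the "forget everything before the final 
submit" rule, the following C sketch (again not the script's Perl) 
reduces the in/out events of one request to a single execution box. 
The struct and function names are hypothetical; the only thing taken 
from the tracepoints is the idea of a 'completed?' flag on request_out.

```c
#include <assert.h>
#include <stdbool.h>

enum event_type { REQUEST_IN, REQUEST_OUT };

struct event {
	enum event_type type;
	long time;		/* timestamp in arbitrary units */
	bool completed;		/* 'completed?' field, REQUEST_OUT only */
};

struct exec_box {
	bool valid;		/* false if no completed run was seen */
	long start, end;
};

/* Keep only the run that ends with a completed request_out. */
struct exec_box final_exec_box(const struct event *ev, int count)
{
	struct exec_box box = { false, 0, 0 };
	bool running = false;
	long start = 0;
	int i;

	for (i = 0; i < count; i++) {
		if (ev[i].type == REQUEST_IN) {
			/* A new submission replaces any earlier record. */
			start = ev[i].time;
			running = true;
		} else if (running) {
			if (ev[i].completed) {
				box.valid = true;
				box.start = start;
				box.end = ev[i].time;
			}
			/* Incomplete request_out: preempted, forget this run. */
			running = false;
		}
	}

	return box;
}
```

A request that goes in at 10, is preempted at 12, goes in again at 20 
and completes at 25 ends up with a box covering only 20 to 25; the 
time before 20 is what gets shown as runnable.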

> Queues. So why are virtual engines treated differently, from my pov it's
> just a timeline like any other, the only difference is that it may
> execute on a different engine? My expectation would have been that
> tracking would have been timeline centric.
> 
> However, I think I am confusing my perspective of timelines in the
> kernel with the visualisation timelines.

Could be. The tool has stayed with the first approach of showing 
physical and virtual timelines separately, with the execution boxes 
belonging to a virtual timeline shown on the physical timelines.

Perhaps another useful approach would be to shadow the execution boxes 
on the virtual timelines.

> 
>> +sub is_veng
>> +{
>> +       my ($class, $instance) = split ':', shift;
>> +
>> +       return $instance eq '254';
> 
> Ok. I thought I might have caught you out.

You have, but I've worked around it. :)

> 
>> +               unless (exists $queue{$key}) {
>> +                       # Virtual engine
>> +                       my $vkey = db_key(VENG, $ctx, $seqno);
>> +                       my %req;
>> +
>> +                       die unless exists $queues{$ctx};
>> +                       die unless exists $queue{$vkey};
>> +                       die unless exists $submit{$vkey};
>> +
>> +                       # Create separate request record on the queue timeline
>> +                       $q = $queue{$vkey};
>> +                       $s = $submit{$vkey};
>> +                       $req{'queue'} = $q;
>> +                       $req{'submit'} = $s;
>> +                       $req{'start'} = $time;
>> +                       $req{'end'} = $time;
>> +                       $req{'ring'} = VENG;
>> +                       $req{'seqno'} = $seqno;
>> +                       $req{'ctx'} = $ctx;
>> +                       $req{'name'} = $ctx . '/' . $seqno;
>> +                       $req{'global'} = $tp{'global'};
>> +                       $req{'port'} = $tp{'port'};
> 
> Just quietly thinking why not adopt this for each timeline; create an
> on-engine event box for all.

Oh yeah, like I said above. Could do. But perhaps some 
cleanup/refactoring should come first.

>> +
>> +                       $vdb{$vkey} = \%req;
>> +               } else {
>> +                       $q = $queue{$key};
>> +                       $s = $submit{$key};
>> +               }
>>   
>>                  $req{'start'} = $time;
>>                  $req{'ring'} = $ring;
> 
> 
>>   sub stdio_stats
>>   {
>>          my ($stats, $group, $id) = @_;
>> +       my $veng = exists $stats->{'virtual'} ? 1 : 0;
>>          my $str;
>>   
>> -       $str = 'Ring' . $group . ': ';
>> +       $str = $veng ? 'Virtual' : 'Ring';
>> +       $str .= $group . ': ';
>>          $str .= $stats->{'count'} . ' batches, ';
>> -       $str .= sprintf('%.2f (%.2f) avg batch us, ', $stats->{'avg'}, $stats->{'total-avg'});
>> -       $str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
>> -       $str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
>> +       unless ($veng) {
>> +               $str .= sprintf('%.2f (%.2f) avg batch us, ',
>> +                               $stats->{'avg'}, $stats->{'total-avg'});
>> +               $str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
>> +               $str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
>> +       }
>> +
>>          $str .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable, ';
>>          $str .= sprintf('%.2f', $stats->{'queued'}) . '% queued, ';
>>          $str .= sprintf('%.2f', $stats->{'wait'}) . '% wait';
> 
> So I'm looking at the utilisation, trying to figure out why veng
> matters? Do we not break down utilisation for the real engines, plus
> utilisation on each client timeline?

It does that, both in stdout and html.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [PATCH i-g-t 04/21] trace.pl: Virtual engine preemption support
  2019-05-10 12:55     ` [igt-dev] [Intel-gfx] " Chris Wilson
@ 2019-05-13 12:38       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 12:38 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 13:55, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:41)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Use the 'completed?' tracepoint field to detect more robustly when a
>> request has been preempted and remove it from the engine database if so.
>>
>> Otherwise the script can hit a scenario where the same global seqno will
>> be mentioned multiple times (on an engine seqno) which aborts processing.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   scripts/trace.pl | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/scripts/trace.pl b/scripts/trace.pl
>> index 6cc332bb6e2a..cb7cc46df22e 100755
>> --- a/scripts/trace.pl
>> +++ b/scripts/trace.pl
>> @@ -483,17 +483,17 @@ while (<>) {
>>                  $ringmap{$rings{$ring}} = $ring;
>>                  $db{$key} = \%req;
>>          } elsif ($tp_name eq 'i915:i915_request_out:') {
>> -               my $gkey;
>> -
>>                  die unless exists $ctxengines{$ctx};
>>   
>> -               $gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
>> -
>>                  if ($tp{'completed?'}) {
>> +                       my $gkey;
>> +
>>                          die unless exists $db{$key};
>>                          die unless exists $db{$key}->{'start'};
>>                          die if exists $db{$key}->{'end'};
>>   
>> +                       $gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
> 
> I'm lost, how does this match the commit message? I thought db_key() just
> gave the hash value and did not alter the db?

This seems to be a rebase failure. I need to squash this with 
"scripts/trace.pl: Fix after intel_engine_notify removal". Or, maybe 
better, move this hunk back from that patch to this one.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
  2019-05-10 13:14     ` Chris Wilson
@ 2019-05-13 12:41       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 12:41 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 14:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:42)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Support i915 virtual engine from gem_wsim (-b i915) and media-bench.pl
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> +       /*
>> +        * Create and configure contexts.
>> +        */
>> +       for (i = 0; i < wrk->nr_ctxs; i += 2) {
>> +               struct ctx *ctx = &wrk->ctx_list[i];
>> +               uint32_t ctx_id, share_vm = 0;
>>   
>> -                       wrk->ctx_list[w->context].id = arg.ctx_id;
>> +               if (ctx->id)
>> +                       continue;
>>   
>> -                       if (flags & GLOBAL_BALANCE) {
>> -                               wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
>> -                               context_vcs_rr ^= 1;
>> -                       } else {
>> -                               wrk->ctx_list[w->context].static_vcs = ctx_vcs;
>> -                               ctx_vcs ^= 1;
>> -                       }
>> +               if (flags & I915) {
> 
> vm sharing shouldn't be an i915-balancer-only option. For single jobs split
> across multiple contexts, I would expect they will want to share vm.

Could do, but I wanted to limit the new features to new features. :) 
Pencil it in for later, okay?

>> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
>> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
>> +                               .param.param = I915_CONTEXT_PARAM_VM,
>> +                       };
>> +                       struct drm_i915_gem_context_create_ext args = { };
>>   
>> -                       if (wrk->prio) {
>> +                       /* Find existing context to share ppgtt with. */
>> +                       for (j = 0; j < wrk->nr_ctxs; j++) {
>>                                  struct drm_i915_gem_context_param param = {
>> -                                       .ctx_id = arg.ctx_id,
>> -                                       .param = I915_CONTEXT_PARAM_PRIORITY,
>> -                                       .value = wrk->prio,
>> +                                       .param = I915_CONTEXT_PARAM_VM,
>>                                  };
>> -                               gem_context_set_param(fd, &param);
>> +
>> +                               if (!wrk->ctx_list[j].id)
>> +                                       continue;
>> +
>> +                               param.ctx_id = wrk->ctx_list[j].id;
>> +
>> +                               gem_context_get_param(fd, &param);
>> +                               igt_assert(param.value);
>> +
>> +                               share_vm = param.value;
>> +
>> +                               ext.param.value = share_vm;
>> +                               args.flags =
>> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
>> +                               args.extensions = to_user_pointer(&ext);
>> +                               break;
>>                          }
>> +
>> +                       if (!ctx->targets_instance)
>> +                               args.flags |=
>> +                                    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
>> +
>> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
>> +                                &args);
>> +
>> +                       ctx_id = args.ctx_id;
>> +               } else {
>> +                       struct drm_i915_gem_context_create args = {};
>> +
>> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
>> +                       ctx_id = args.ctx_id;
>> +               }
>> +
>> +               igt_assert(ctx_id);
>> +               ctx->id = ctx_id;
>> +
>> +               if (flags & GLOBAL_BALANCE) {
>> +                       ctx->static_vcs = context_vcs_rr;
>> +                       context_vcs_rr ^= 1;
>> +               } else {
>> +                       ctx->static_vcs = ctx_vcs;
>> +                       ctx_vcs ^= 1;
>> +               }
>> +
>> +               __ctx_set_prio(ctx_id, wrk->prio);
>> +
>> +               /*
>> +                * Do we need a separate context to satisfy this workloads which
>> +                * both want to target specific engines and be balanced by i915?
>> +                */
>> +               if ((flags & I915) && ctx->wants_balance &&
>> +                   ctx->targets_instance) {
>> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
>> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
>> +                               .param.param = I915_CONTEXT_PARAM_VM,
>> +                               .param.value = share_vm,
>> +                       };
>> +                       struct drm_i915_gem_context_create_ext args = {
>> +                               .extensions = to_user_pointer(&ext),
>> +                               .flags =
>> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
>> +                                   I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
>> +                       };
>> +
>> +                       igt_assert(share_vm);
>> +
>> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
>> +                                &args);
>> +
>> +                       igt_assert(args.ctx_id);
>> +                       ctx_id = args.ctx_id;
>> +                       wrk->ctx_list[i + 1].id = args.ctx_id;
>> +
>> +                       __ctx_set_prio(ctx_id, wrk->prio);
>> +               }
>> +
>> +               if (ctx->wants_balance) {
>> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
>> +                               .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
>> +                               .num_siblings = 2,
>> +                               .engines = {
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 0 },
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 1 },
>> +                               },
>> +                       };
>> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
>> +                               .extensions = to_user_pointer(&load_balance),
>> +                               .engines = {
>> +                                       { .engine_class = I915_ENGINE_CLASS_INVALID,
>> +                                         .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 0 },
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 1 },
>> +                               },
>> +                       };
>> +
>> +                       struct drm_i915_gem_context_param param = {
>> +                               .ctx_id = ctx_id,
>> +                               .param = I915_CONTEXT_PARAM_ENGINES,
>> +                               .size = sizeof(set_engines),
>> +                               .value = to_user_pointer(&set_engines),
>> +                       };
>> +
>> +                       gem_context_set_param(fd, &param);
>>                  }
> 
> if (share_vm)
> 	gem_vm_destroy(share_vm);
> 
> Just to drop the local handle as the context has acquired its own
> reference.

Well spotted!
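
For my own notes, a toy reference-count model of why the local handle 
needs dropping. This is only a sketch under the assumption stated in 
the review comment, namely that the handle obtained via 
I915_CONTEXT_PARAM_VM holds a reference of its own and that the new 
context takes another one. The struct and function names are invented 
and this is not the i915 uAPI.

```c
#include <assert.h>

/* Toy model of a ppgtt/VM object with a reference count. */
struct vm {
	int refs;
};

/*
 * Models reading I915_CONTEXT_PARAM_VM: the caller receives a handle
 * which holds its own reference.
 */
struct vm *vm_handle_get(struct vm *vm)
{
	vm->refs++;
	return vm;
}

/* Models creating a context with the VM set via SETPARAM. */
void context_attach_vm(struct vm *vm)
{
	vm->refs++;
}

/* Models gem_vm_destroy() on the local handle. */
void vm_handle_put(struct vm *vm)
{
	assert(vm->refs > 0);
	vm->refs--;
}
```

Starting from one reference held by the original context, get, attach 
and put leave exactly two references, one per context. Without the put 
the count stays at three and the local handle leaks.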

> Other than that, it does what it sets out to do: create a context with
> choice of engines and load balancing amongst them.
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks,

Tvrtko


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
@ 2019-05-13 12:41       ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 12:41 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin


On 10/05/2019 14:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:42)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Support i915 virtual engine from gem_wsim (-b i915) and media-bench.pl
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> +       /*
>> +        * Create and configure contexts.
>> +        */
>> +       for (i = 0; i < wrk->nr_ctxs; i += 2) {
>> +               struct ctx *ctx = &wrk->ctx_list[i];
>> +               uint32_t ctx_id, share_vm = 0;
>>   
>> -                       wrk->ctx_list[w->context].id = arg.ctx_id;
>> +               if (ctx->id)
>> +                       continue;
>>   
>> -                       if (flags & GLOBAL_BALANCE) {
>> -                               wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
>> -                               context_vcs_rr ^= 1;
>> -                       } else {
>> -                               wrk->ctx_list[w->context].static_vcs = ctx_vcs;
>> -                               ctx_vcs ^= 1;
>> -                       }
>> +               if (flags & I915) {
> 
> vm sharing shouldn't be a i915-balancer only option. For single jobs split
> across multiple contexts, I would expect they will want to share vm.

Could do but I wanted to limit the new features to new features. :) 
Pencil in for later okay?

>> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
>> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
>> +                               .param.param = I915_CONTEXT_PARAM_VM,
>> +                       };
>> +                       struct drm_i915_gem_context_create_ext args = { };
>>   
>> -                       if (wrk->prio) {
>> +                       /* Find existing context to share ppgtt with. */
>> +                       for (j = 0; j < wrk->nr_ctxs; j++) {
>>                                  struct drm_i915_gem_context_param param = {
>> -                                       .ctx_id = arg.ctx_id,
>> -                                       .param = I915_CONTEXT_PARAM_PRIORITY,
>> -                                       .value = wrk->prio,
>> +                                       .param = I915_CONTEXT_PARAM_VM,
>>                                  };
>> -                               gem_context_set_param(fd, &param);
>> +
>> +                               if (!wrk->ctx_list[j].id)
>> +                                       continue;
>> +
>> +                               param.ctx_id = wrk->ctx_list[j].id;
>> +
>> +                               gem_context_get_param(fd, &param);
>> +                               igt_assert(param.value);
>> +
>> +                               share_vm = param.value;
>> +
>> +                               ext.param.value = share_vm;
>> +                               args.flags =
>> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
>> +                               args.extensions = to_user_pointer(&ext);
>> +                               break;
>>                          }
>> +
>> +                       if (!ctx->targets_instance)
>> +                               args.flags |=
>> +                                    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
>> +
>> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
>> +                                &args);
>> +
>> +                       ctx_id = args.ctx_id;
>> +               } else {
>> +                       struct drm_i915_gem_context_create args = {};
>> +
>> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
>> +                       ctx_id = args.ctx_id;
>> +               }
>> +
>> +               igt_assert(ctx_id);
>> +               ctx->id = ctx_id;
>> +
>> +               if (flags & GLOBAL_BALANCE) {
>> +                       ctx->static_vcs = context_vcs_rr;
>> +                       context_vcs_rr ^= 1;
>> +               } else {
>> +                       ctx->static_vcs = ctx_vcs;
>> +                       ctx_vcs ^= 1;
>> +               }
>> +
>> +               __ctx_set_prio(ctx_id, wrk->prio);
>> +
>> +               /*
>> +                * Do we need a separate context to satisfy this workloads which
>> +                * both want to target specific engines and be balanced by i915?
>> +                */
>> +               if ((flags & I915) && ctx->wants_balance &&
>> +                   ctx->targets_instance) {
>> +                       struct drm_i915_gem_context_create_ext_setparam ext = {
>> +                               .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
>> +                               .param.param = I915_CONTEXT_PARAM_VM,
>> +                               .param.value = share_vm,
>> +                       };
>> +                       struct drm_i915_gem_context_create_ext args = {
>> +                               .extensions = to_user_pointer(&ext),
>> +                               .flags =
>> +                                   I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
>> +                                   I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
>> +                       };
>> +
>> +                       igt_assert(share_vm);
>> +
>> +                       drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
>> +                                &args);
>> +
>> +                       igt_assert(args.ctx_id);
>> +                       ctx_id = args.ctx_id;
>> +                       wrk->ctx_list[i + 1].id = args.ctx_id;
>> +
>> +                       __ctx_set_prio(ctx_id, wrk->prio);
>> +               }
>> +
>> +               if (ctx->wants_balance) {
>> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
>> +                               .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
>> +                               .num_siblings = 2,
>> +                               .engines = {
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 0 },
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 1 },
>> +                               },
>> +                       };
>> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
>> +                               .extensions = to_user_pointer(&load_balance),
>> +                               .engines = {
>> +                                       { .engine_class = I915_ENGINE_CLASS_INVALID,
>> +                                         .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 0 },
>> +                                       { .engine_class = I915_ENGINE_CLASS_VIDEO,
>> +                                         .engine_instance = 1 },
>> +                               },
>> +                       };
>> +
>> +                       struct drm_i915_gem_context_param param = {
>> +                               .ctx_id = ctx_id,
>> +                               .param = I915_CONTEXT_PARAM_ENGINES,
>> +                               .size = sizeof(set_engines),
>> +                               .value = to_user_pointer(&set_engines),
>> +                       };
>> +
>> +                       gem_context_set_param(fd, &param);
>>                  }
> 
> if (share_vm)
> 	gem_vm_destroy(share_vm);
> 
> Just to drop the local handle as the context has acquired its own
> reference.

Well spotted!

> Other than that, it does what it sets out to do: create a context with
> choice of engines and load balancing amongst them.
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks,

Tvrtko


^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
  2019-05-13 12:41       ` Tvrtko Ursulin
@ 2019-05-13 12:54         ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-13 12:54 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-13 13:41:47)
> 
> On 10/05/2019 14:14, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-08 13:10:42)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Support i915 virtual engine from gem_wsim (-b i915) and media-bench.pl
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> +       /*
> >> +        * Create and configure contexts.
> >> +        */
> >> +       for (i = 0; i < wrk->nr_ctxs; i += 2) {
> >> +               struct ctx *ctx = &wrk->ctx_list[i];
> >> +               uint32_t ctx_id, share_vm = 0;
> >>   
> >> -                       wrk->ctx_list[w->context].id = arg.ctx_id;
> >> +               if (ctx->id)
> >> +                       continue;
> >>   
> >> -                       if (flags & GLOBAL_BALANCE) {
> >> -                               wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
> >> -                               context_vcs_rr ^= 1;
> >> -                       } else {
> >> -                               wrk->ctx_list[w->context].static_vcs = ctx_vcs;
> >> -                               ctx_vcs ^= 1;
> >> -                       }
> >> +               if (flags & I915) {
> > 
> > vm sharing shouldn't be a i915-balancer only option. For single jobs split
> > across multiple contexts, I would expect they will want to share vm.
> 
> Could do but I wanted to limit the new features to new features. :) 
> Pencil in for later okay?

Sure. Just checking I'm in the same ballpark with my understanding. I
did hope to enable vm sharing here by default -- in reality, I doubt
these wsim are impacted by vm switches as they are tiny. However, I
don't have any measurements for shared vm, and had better start
somewhere.
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing
@ 2019-05-13 12:54         ` Chris Wilson
  0 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-13 12:54 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-13 13:41:47)
> 
> On 10/05/2019 14:14, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-08 13:10:42)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Support i915 virtual engine from gem_wsim (-b i915) and media-bench.pl
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> +       /*
> >> +        * Create and configure contexts.
> >> +        */
> >> +       for (i = 0; i < wrk->nr_ctxs; i += 2) {
> >> +               struct ctx *ctx = &wrk->ctx_list[i];
> >> +               uint32_t ctx_id, share_vm = 0;
> >>   
> >> -                       wrk->ctx_list[w->context].id = arg.ctx_id;
> >> +               if (ctx->id)
> >> +                       continue;
> >>   
> >> -                       if (flags & GLOBAL_BALANCE) {
> >> -                               wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
> >> -                               context_vcs_rr ^= 1;
> >> -                       } else {
> >> -                               wrk->ctx_list[w->context].static_vcs = ctx_vcs;
> >> -                               ctx_vcs ^= 1;
> >> -                       }
> >> +               if (flags & I915) {
> > 
> > vm sharing shouldn't be a i915-balancer only option. For single jobs split
> > across multiple contexts, I would expect they will want to share vm.
> 
> Could do but I wanted to limit the new features to new features. :) 
> Pencil in for later okay?

Sure. Just checking I'm in the same ballpark with my understanding. I
did hope to enable vm sharing here by default -- in reality, I doubt
these wsim are impacted by vm switches as they are tiny. However, I
don't have any measurements for shared vm, and had better start
somewhere.
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 10/21] gem_wsim: Extract str to engine lookup
  2019-05-10 13:20     ` Chris Wilson
@ 2019-05-13 13:08       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:08 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 14:20, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:47)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c | 34 +++++++++++++++++++++-------------
>>   1 file changed, 21 insertions(+), 13 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index 5245692df6eb..f654decb24cc 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -318,6 +318,18 @@ wsim_err(const char *fmt, ...)
>>          } \
>>   }
>>   
>> +static int str_to_engine(const char *str)
>> +{
>> +       unsigned int i;
>> +
>> +       for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
>> +               if (!strcasecmp(str, ring_str_map[i]))
>> +                       return i;
>> +       }
>> +
>> +       return -1;
>> +}
>> +
>>   static struct workload *
>>   parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>   {
>> @@ -480,22 +492,18 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>                  }
>>   
>>                  if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
>> -                       unsigned int old_valid = valid;
>> -
>>                          fstart = NULL;
>>   
>> -                       for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
>> -                               if (!strcasecmp(field, ring_str_map[i])) {
>> -                                       step.engine = i;
>> -                                       if (step.engine == BCS)
>> -                                               bcs_used = true;
>> -                                       valid++;
>> -                                       break;
>> -                               }
>> -                       }
>> -
>> -                       check_arg(old_valid == valid,
>> +                       i = str_to_engine(field);
>> +                       check_arg(i < 0,
>>                                    "Invalid engine id at step %u!\n", nr_steps);
>> +                       if (i >= 0)
>> +                               valid++;
> 
> check_arg() returned already for all i < 0, no?

Yes, and it looks like the very next patch removes the if. I'll pull it here.
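
For illustration, here is a minimal standalone sketch of the lookup with the redundant check dropped. The contents of ring_str_map and the count_valid_engine() helper are assumptions made up for this example and are not the actual gem_wsim definitions. Note the result has to be held in a signed variable for the "< 0" test to be meaningful.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <strings.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical engine name table; the real one lives in gem_wsim.c. */
static const char *ring_str_map[] = {
	"DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS",
};

/* Case-insensitive name lookup; returns the table index or -1. */
static int str_to_engine(const char *str)
{
	unsigned int i;

	assert(str);

	for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
		if (!strcasecmp(str, ring_str_map[i]))
			return i;
	}

	return -1;
}

/*
 * Hypothetical caller: once the negative case has returned, no second
 * "engine >= 0" test is needed before bumping the valid counter.
 */
static int count_valid_engine(const char *field, unsigned int *valid)
{
	int engine = str_to_engine(field); /* signed on purpose */

	assert(valid);

	if (engine < 0)
		return -1;

	(*valid)++;
	return engine;
}
```

In the real parser the early return is hidden inside check_arg(), which is why the extra if is dead code.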

> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks!

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 10/21] gem_wsim: Extract str to engine lookup
@ 2019-05-13 13:08       ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:08 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin


On 10/05/2019 14:20, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:47)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c | 34 +++++++++++++++++++++-------------
>>   1 file changed, 21 insertions(+), 13 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index 5245692df6eb..f654decb24cc 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -318,6 +318,18 @@ wsim_err(const char *fmt, ...)
>>          } \
>>   }
>>   
>> +static int str_to_engine(const char *str)
>> +{
>> +       unsigned int i;
>> +
>> +       for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
>> +               if (!strcasecmp(str, ring_str_map[i]))
>> +                       return i;
>> +       }
>> +
>> +       return -1;
>> +}
>> +
>>   static struct workload *
>>   parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>   {
>> @@ -480,22 +492,18 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>                  }
>>   
>>                  if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
>> -                       unsigned int old_valid = valid;
>> -
>>                          fstart = NULL;
>>   
>> -                       for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
>> -                               if (!strcasecmp(field, ring_str_map[i])) {
>> -                                       step.engine = i;
>> -                                       if (step.engine == BCS)
>> -                                               bcs_used = true;
>> -                                       valid++;
>> -                                       break;
>> -                               }
>> -                       }
>> -
>> -                       check_arg(old_valid == valid,
>> +                       i = str_to_engine(field);
>> +                       check_arg(i < 0,
>>                                    "Invalid engine id at step %u!\n", nr_steps);
>> +                       if (i >= 0)
>> +                               valid++;
> 
> check_arg() returned already for all i < 0, no?

Yes, and it looks like the very next patch removes the if. I'll pull it here.

> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks!

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 11/21] gem_wsim: Engine map support
  2019-05-10 13:26     ` Chris Wilson
@ 2019-05-13 13:18       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:18 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 14:26, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:48)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Support new i915 uAPI for configuring contexts with engine maps.
>>
>> Please refer to the README file for more detailed explanation.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> +static int parse_engine_map(struct w_step *step, const char *_str)
>> +{
>> +       char *token, *tctx = NULL, *tstart = (char *)_str;
>> +
>> +       while ((token = strtok_r(tstart, "|", &tctx))) {
>> +               enum intel_engine_id engine;
>> +
>> +               tstart = NULL;
>> +
>> +               if (!strcmp(token, "DEFAULT"))
>> +                       return -1;
>> +               else if (!strcmp(token, "VCS"))
>> +                       return -1;
>> +
>> +               engine = str_to_engine(token);
>> +               if ((int)engine < 0)
>> +                       return -1;
>> +
>> +               if (engine != VCS1 && engine != VCS2)
>> +                       return -1; /* TODO */
>> +
>> +               step->engine_map_count++;
>> +               step->engine_map = realloc(step->engine_map,
>> +                                          step->engine_map_count *
>> +                                          sizeof(step->engine_map[0]));
>> +               step->engine_map[step->engine_map_count - 1] = engine;
> 
> 
>> +               if (ctx->engine_map) {
>> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>> +                                                         ctx->engine_map_count + 1);
>> +                       struct drm_i915_gem_context_param param = {
>> +                               .ctx_id = ctx_id,
>> +                               .param = I915_CONTEXT_PARAM_ENGINES,
>> +                               .size = sizeof(set_engines),
>> +                               .value = to_user_pointer(&set_engines),
>> +                       };
>> +
>> +                       set_engines.extensions = 0;
>> +
>> +                       /* Reserve slot for virtual engine. */
>> +                       set_engines.engines[0].engine_class =
>> +                               I915_ENGINE_CLASS_INVALID;
>> +                       set_engines.engines[0].engine_instance =
>> +                               I915_ENGINE_CLASS_INVALID_NONE;
>> +
>> +                       for (j = 1; j <= ctx->engine_map_count; j++) {
>> +                               set_engines.engines[j].engine_class =
>> +                                       I915_ENGINE_CLASS_VIDEO; /* FIXME */
>> +                               set_engines.engines[j].engine_instance =
>> +                                       ctx->engine_map[j - 1] - VCS1; /* FIXME */
>> +                       }
> 
> I would suggest the file format starts with class:instance specifiers.
> Too much FIXME that I think will need a file format change.

Where do you see the need for a file format change?

These FIXMEs can be addressed by either adding engine discovery or 
fixing the code to not assume class and engines to be balanced.

A larger rework might be needed to deal with the internal engine 
representation after adding engine discovery. Or at least an audit and 
checking of legacy paths. It might be that the refactor would be limited 
to the engine string to internal engine id lookup.

But I don't see an immediate need to change the file format. VCS is 
already defined as any VCS and there are explicit VCS1 and VCS2.

But even if the need to change it arises, I think it wouldn't be a 
problem if left for later.
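
For reference, a class:instance token of the kind suggested above could be parsed roughly as in the sketch below. The "<class>:<instance>" numeric syntax, struct engine_spec and parse_class_instance() are all hypothetical; nothing like this is in the series or the README.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for one parsed specifier. */
struct engine_spec {
	unsigned int engine_class;
	unsigned int engine_instance;
};

/*
 * Parse a "<class>:<instance>" token made of two decimal numbers.
 * Returns 0 on success, -1 on malformed input or trailing characters.
 */
static int parse_class_instance(const char *token, struct engine_spec *spec)
{
	unsigned int cls, instance;
	char trailing;

	assert(token && spec);

	/* Exactly two conversions means nothing followed the instance. */
	if (sscanf(token, "%u:%u%c", &cls, &instance, &trailing) != 2)
		return -1;

	spec->engine_class = cls;
	spec->engine_instance = instance;

	return 0;
}
```

Existing names such as VCS1 would fail this parse and could keep going through the string lookup, so both notations could in principle coexist.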

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 11/21] gem_wsim: Engine map support
@ 2019-05-13 13:18       ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:18 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin


On 10/05/2019 14:26, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:48)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Support new i915 uAPI for configuring contexts with engine maps.
>>
>> Please refer to the README file for more detailed explanation.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> +static int parse_engine_map(struct w_step *step, const char *_str)
>> +{
>> +       char *token, *tctx = NULL, *tstart = (char *)_str;
>> +
>> +       while ((token = strtok_r(tstart, "|", &tctx))) {
>> +               enum intel_engine_id engine;
>> +
>> +               tstart = NULL;
>> +
>> +               if (!strcmp(token, "DEFAULT"))
>> +                       return -1;
>> +               else if (!strcmp(token, "VCS"))
>> +                       return -1;
>> +
>> +               engine = str_to_engine(token);
>> +               if ((int)engine < 0)
>> +                       return -1;
>> +
>> +               if (engine != VCS1 && engine != VCS2)
>> +                       return -1; /* TODO */
>> +
>> +               step->engine_map_count++;
>> +               step->engine_map = realloc(step->engine_map,
>> +                                          step->engine_map_count *
>> +                                          sizeof(step->engine_map[0]));
>> +               step->engine_map[step->engine_map_count - 1] = engine;
> 
> 
>> +               if (ctx->engine_map) {
>> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>> +                                                         ctx->engine_map_count + 1);
>> +                       struct drm_i915_gem_context_param param = {
>> +                               .ctx_id = ctx_id,
>> +                               .param = I915_CONTEXT_PARAM_ENGINES,
>> +                               .size = sizeof(set_engines),
>> +                               .value = to_user_pointer(&set_engines),
>> +                       };
>> +
>> +                       set_engines.extensions = 0;
>> +
>> +                       /* Reserve slot for virtual engine. */
>> +                       set_engines.engines[0].engine_class =
>> +                               I915_ENGINE_CLASS_INVALID;
>> +                       set_engines.engines[0].engine_instance =
>> +                               I915_ENGINE_CLASS_INVALID_NONE;
>> +
>> +                       for (j = 1; j <= ctx->engine_map_count; j++) {
>> +                               set_engines.engines[j].engine_class =
>> +                                       I915_ENGINE_CLASS_VIDEO; /* FIXME */
>> +                               set_engines.engines[j].engine_instance =
>> +                                       ctx->engine_map[j - 1] - VCS1; /* FIXME */
>> +                       }
> 
> I would suggest the file format starts with class:instance specifiers.
> Too much FIXME that I think will need a file format change.

Where do you see the need for a file format change?

These FIXMEs can be addressed by either adding engine discovery or 
fixing the code to not assume class and engines to be balanced.

A larger rework might be needed to deal with the internal engine 
representation after adding engine discovery. Or at least an audit and 
checking of legacy paths. It might be that the refactor would be limited 
to the engine string to internal engine id lookup.

But I don't see an immediate need to change the file format. VCS is 
already defined as any VCS and there are explicit VCS1 and VCS2.

But even if the need to change it arises, I think it wouldn't be a 
problem if left for later.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 13/21] gem_wsim: Compact int command parsing with a macro
  2019-05-10 13:29     ` Chris Wilson
@ 2019-05-13 13:24       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:24 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 14:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:50)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Parsing an integer workload descriptor field is a common pattern which we
>> can extract to a helper macro and by doing so further improve the
>> readability of the main parsing loop.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c | 80 ++++++++++++++-----------------------------
>>   1 file changed, 25 insertions(+), 55 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index 4dbfc3e922a9..c2e13d9939c2 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -370,6 +370,15 @@ static int parse_engine_map(struct w_step *step, const char *_str)
>>          return 0;
>>   }
>>   
>> +#define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
>> +       if ((field = strtok_r(fstart, ".", &fctx))) { \
>> +               tmp = atoi(field); \
>> +               check_arg(_COND_, _ERR_, nr_steps); \
>> +               step.type = _STEP_; \
>> +               step._FIELD_ = tmp; \
>> +               goto add_step; \
>> +       } \
> 
> More hidden control flow :-p

It's not the prettiest, I admit. It started as a quick project to test 
the feasibility of userspace balancing, and when it showed itself to be 
somewhat useful I added more and more features to it. It's at the point 
where splitting it into separate files and refactoring the data 
structures could be beneficial.
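
As a rough idea of what a refactor without the hidden goto could look like, the sketch below moves the tokenise-and-convert part into a plain function and leaves the control flow to the caller. next_int_field() is a hypothetical helper, not part of gem_wsim, and it uses strtol() instead of atoi() so that garbage input is reported rather than read as 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Fetch the next '.'-separated field and convert it to an int.
 * *fstart is the string to tokenise on the first call and is reset to
 * NULL afterwards, mirroring the strtok_r() usage in the parsing loop.
 * Returns 0 on success, -1 if there is no field or it is not a number.
 */
static int next_int_field(char **fstart, char **fctx, int *out)
{
	char *field, *end;
	long val;

	assert(fstart && fctx && out);

	field = strtok_r(*fstart, ".", fctx);
	*fstart = NULL;
	if (!field)
		return -1;

	errno = 0;
	val = strtol(field, &end, 10);
	if (errno || end == field || *end || val < INT_MIN || val > INT_MAX)
		return -1;

	*out = (int)val;

	return 0;
}
```

The caller would then do its own range check and jump to add_step explicitly, which costs a few more lines per command but keeps the flow visible at the call site.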

> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 13/21] gem_wsim: Compact int command parsing with a macro
@ 2019-05-13 13:24       ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:24 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin


On 10/05/2019 14:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:50)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Parsing an integer workload descriptor field is a common pattern which we
>> can extract to a helper macro and by doing so further improve the
>> readability of the main parsing loop.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c | 80 ++++++++++++++-----------------------------
>>   1 file changed, 25 insertions(+), 55 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index 4dbfc3e922a9..c2e13d9939c2 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -370,6 +370,15 @@ static int parse_engine_map(struct w_step *step, const char *_str)
>>          return 0;
>>   }
>>   
>> +#define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
>> +       if ((field = strtok_r(fstart, ".", &fctx))) { \
>> +               tmp = atoi(field); \
>> +               check_arg(_COND_, _ERR_, nr_steps); \
>> +               step.type = _STEP_; \
>> +               step._FIELD_ = tmp; \
>> +               goto add_step; \
>> +       } \
> 
> More hidden control flow :-p

It's not the prettiest, I admit. It started as a quick project to test 
the feasibility of userspace balancing, and when it showed itself to be 
somewhat useful I added more and more features to it. It's at the point 
where splitting it into separate files and refactoring the data 
structures could be beneficial.

> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 15/21] gem_wsim: Engine bond command
  2019-05-10 13:36     ` Chris Wilson
@ 2019-05-13 13:28       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:28 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 14:36, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:52)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Engine bonds are an i915 uAPI applicable to load balanced contexts with
>> engine map. They allow expression rules of engine selection between two
>> contexts when submissions are also tied with submit fences.
>>
>> Please refer to the README for a more detailed description.
> 
> I would prefer not to have a hexadecimal mask in the file format? That's
> harder than usual to read later on.
> 
> bond({master_class:master_instance}, {engine_class:engine_instance}),...
> ?

I agree a hexadecimal mask is bad. I'll try to see quickly if 
piggybacking on some existing engine list processing could be done 
cheaply (quickly). And without adding more horror to the file parsing loop.
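
One possible shape for that, as a sketch only: accept a readable '|'-separated engine list in the file and build the mask internally. The name table, its bit positions and engine_list_to_mask() are assumptions for illustration; the bond syntax actually documented in the README is not shown in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical engine names; the index doubles as the mask bit. */
static const char *bond_engine_names[] = {
	"RCS", "BCS", "VCS1", "VCS2", "VECS",
};

static int bond_engine_index(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(bond_engine_names) / sizeof(bond_engine_names[0]); i++) {
		if (!strcasecmp(name, bond_engine_names[i]))
			return (int)i;
	}

	return -1;
}

/*
 * Convert a list such as "VCS1|VCS2" into a bitmask.
 * Returns 0 and stores the mask on success, -1 on an unknown name,
 * an empty list or allocation failure (*mask is then left untouched).
 */
static int engine_list_to_mask(const char *list, unsigned int *mask)
{
	char *copy, *start, *token, *ctx = NULL;
	unsigned int m = 0;
	int ret = 0;

	assert(list && mask);

	/* strtok_r() modifies its input, so work on a private copy. */
	copy = strdup(list);
	if (!copy)
		return -1;

	start = copy;
	while ((token = strtok_r(start, "|", &ctx))) {
		int idx = bond_engine_index(token);

		start = NULL;
		if (idx < 0) {
			ret = -1;
			break;
		}
		m |= 1u << idx;
	}

	free(copy);

	if (!ret && !m)
		ret = -1;
	if (!ret)
		*mask = m;

	return ret;
}
```

This keeps the mask as an internal detail while the workload file stays readable.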

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 11/21] gem_wsim: Engine map support
  2019-05-13 13:18       ` Tvrtko Ursulin
@ 2019-05-13 13:29         ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-13 13:29 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-13 14:18:59)
> 
> On 10/05/2019 14:26, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-08 13:10:48)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Support new i915 uAPI for configuring contexts with engine maps.
> >>
> >> Please refer to the README file for more detailed explanation.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> +static int parse_engine_map(struct w_step *step, const char *_str)
> >> +{
> >> +       char *token, *tctx = NULL, *tstart = (char *)_str;
> >> +
> >> +       while ((token = strtok_r(tstart, "|", &tctx))) {
> >> +               enum intel_engine_id engine;
> >> +
> >> +               tstart = NULL;
> >> +
> >> +               if (!strcmp(token, "DEFAULT"))
> >> +                       return -1;
> >> +               else if (!strcmp(token, "VCS"))
> >> +                       return -1;
> >> +
> >> +               engine = str_to_engine(token);
> >> +               if ((int)engine < 0)
> >> +                       return -1;
> >> +
> >> +               if (engine != VCS1 && engine != VCS2)
> >> +                       return -1; /* TODO */
> >> +
> >> +               step->engine_map_count++;
> >> +               step->engine_map = realloc(step->engine_map,
> >> +                                          step->engine_map_count *
> >> +                                          sizeof(step->engine_map[0]));
> >> +               step->engine_map[step->engine_map_count - 1] = engine;
> > 
> > 
> >> +               if (ctx->engine_map) {
> >> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
> >> +                                                         ctx->engine_map_count + 1);
> >> +                       struct drm_i915_gem_context_param param = {
> >> +                               .ctx_id = ctx_id,
> >> +                               .param = I915_CONTEXT_PARAM_ENGINES,
> >> +                               .size = sizeof(set_engines),
> >> +                               .value = to_user_pointer(&set_engines),
> >> +                       };
> >> +
> >> +                       set_engines.extensions = 0;
> >> +
> >> +                       /* Reserve slot for virtual engine. */
> >> +                       set_engines.engines[0].engine_class =
> >> +                               I915_ENGINE_CLASS_INVALID;
> >> +                       set_engines.engines[0].engine_instance =
> >> +                               I915_ENGINE_CLASS_INVALID_NONE;
> >> +
> >> +                       for (j = 1; j <= ctx->engine_map_count; j++) {
> >> +                               set_engines.engines[j].engine_class =
> >> +                                       I915_ENGINE_CLASS_VIDEO; /* FIXME */
> >> +                               set_engines.engines[j].engine_instance =
> >> +                                       ctx->engine_map[j - 1] - VCS1; /* FIXME */
> >> +                       }
> > 
> > I would suggest the file format starts with class:instance specifiers.
> > Too much FIXME that I think will need a file format change.
> 
> Where do you see the need for a file format change?

Nah, I made the assumption the FIXMEs were because the implementation
was dictated by the file format.
 
> These FIXMEs can be addressed by either adding engine discovery or 
> fixing the code to not assume class and engines to be balanced.

The code is just obeying the .wsim; the question is how to handle a
mismatch between the file and hw -- whether to do a transparent fixup to
use bcs instead of a secondary vcs?
 
> Larger rework might be needed to deal with the internal engine 
> representation after adding engine discovery. Or at least an audit and 
> checking legacy paths. Might be that refactor would be limited to engine 
> string to internal engine id lookup.
> 
> But to change file format I don't see an immediate need. VCS is already 
> defined as any VCS and there are explicit VCS1 and VCS2.

I was more concerned in case vcs was implicit since it was heavily
assumed by the code.
-Chris

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 11/21] gem_wsim: Engine map support
  2019-05-13 13:29         ` Chris Wilson
@ 2019-05-13 13:40           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:40 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 13/05/2019 14:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-13 14:18:59)
>>
>> On 10/05/2019 14:26, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-05-08 13:10:48)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Support new i915 uAPI for configuring contexts with engine maps.
>>>>
>>>> Please refer to the README file for more detailed explanation.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>> +static int parse_engine_map(struct w_step *step, const char *_str)
>>>> +{
>>>> +       char *token, *tctx = NULL, *tstart = (char *)_str;
>>>> +
>>>> +       while ((token = strtok_r(tstart, "|", &tctx))) {
>>>> +               enum intel_engine_id engine;
>>>> +
>>>> +               tstart = NULL;
>>>> +
>>>> +               if (!strcmp(token, "DEFAULT"))
>>>> +                       return -1;
>>>> +               else if (!strcmp(token, "VCS"))
>>>> +                       return -1;
>>>> +
>>>> +               engine = str_to_engine(token);
>>>> +               if ((int)engine < 0)
>>>> +                       return -1;
>>>> +
>>>> +               if (engine != VCS1 && engine != VCS2)
>>>> +                       return -1; /* TODO */
>>>> +
>>>> +               step->engine_map_count++;
>>>> +               step->engine_map = realloc(step->engine_map,
>>>> +                                          step->engine_map_count *
>>>> +                                          sizeof(step->engine_map[0]));
>>>> +               step->engine_map[step->engine_map_count - 1] = engine;
>>>
>>>
>>>> +               if (ctx->engine_map) {
>>>> +                       I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>>>> +                                                         ctx->engine_map_count + 1);
>>>> +                       struct drm_i915_gem_context_param param = {
>>>> +                               .ctx_id = ctx_id,
>>>> +                               .param = I915_CONTEXT_PARAM_ENGINES,
>>>> +                               .size = sizeof(set_engines),
>>>> +                               .value = to_user_pointer(&set_engines),
>>>> +                       };
>>>> +
>>>> +                       set_engines.extensions = 0;
>>>> +
>>>> +                       /* Reserve slot for virtual engine. */
>>>> +                       set_engines.engines[0].engine_class =
>>>> +                               I915_ENGINE_CLASS_INVALID;
>>>> +                       set_engines.engines[0].engine_instance =
>>>> +                               I915_ENGINE_CLASS_INVALID_NONE;
>>>> +
>>>> +                       for (j = 1; j <= ctx->engine_map_count; j++) {
>>>> +                               set_engines.engines[j].engine_class =
>>>> +                                       I915_ENGINE_CLASS_VIDEO; /* FIXME */
>>>> +                               set_engines.engines[j].engine_instance =
>>>> +                                       ctx->engine_map[j - 1] - VCS1; /* FIXME */
>>>> +                       }
>>>
>>> I would suggest the file format starts with class:instance specifiers.
>>> Too much FIXME that I think will need a file format change.
>>
>> Where do you see the need for a file format change?
> 
> Nah, I made the assumption the FIXMEs were because the implementation
> was dictated by the file format.
>   
>> These FIXMEs can be addressed by either adding engine discovery or
>> fixing the code to not assume class and engines to be balanced.
> 
> The code is just obeying the .wsim; the question is how to handle a
> mismatch between the file and hw -- whether to do a transparent fixup to
> use bcs instead of a secondary vcs?

It would be of little use since we cannot load balance the two...

>> Larger rework might be needed to deal with the internal engine
>> representation after adding engine discovery. Or at least an audit and
>> checking legacy paths. Might be that refactor would be limited to engine
>> string to internal engine id lookup.
>>
>> But to change file format I don't see an immediate need. VCS is already
>> defined as any VCS and there are explicit VCS1 and VCS2.
> 
> I was more concerned in case vcs was implicit since it was heavily
> assumed by the code.

...and the thing I just realised I am not actually happy with is the 
lack of ability to write portable .wsim's when using engine maps.

Legacy files can configure implicit engine maps based on class (VCS), so 
I think the engine map command needs the same capability. Otherwise the 
.wsim's won't be portable. I want to be able to do:

M.1.VCS
B.1

And have that mean: create an engine map with all VCS class engines and 
enable load balancing. The same can be achieved with legacy (implicit) 
load balancing, but that cannot be combined with frame split.
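
As a rough sketch of what the class-based form could expand to, assuming
some list of discovered engines is available: struct engine_info and
build_class_map() below are hypothetical names, and the class numbers
used with them are arbitrary placeholders rather than i915 values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one physical engine; not a gem_wsim type. */
struct engine_info {
	unsigned int engine_class;
	unsigned int engine_instance;
};

/*
 * Copy every engine of the requested class from avail[] into map[],
 * preserving discovery order. Returns the number of entries written,
 * which never exceeds max.
 */
static size_t build_class_map(const struct engine_info *avail, size_t nr_avail,
			      unsigned int wanted_class,
			      struct engine_info *map, size_t max)
{
	size_t i, count = 0;

	assert(avail && map);

	for (i = 0; i < nr_avail && count < max; i++) {
		if (avail[i].engine_class == wanted_class)
			map[count++] = avail[i];
	}

	return count;
}
```

With something like this the same .wsim would produce a two entry map on
a part with two engines of the class and a one entry map on a part with
one, which is the portability I am after.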

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 17/21] gem_wsim: Infinite batch support
  2019-05-10 13:48     ` Chris Wilson
@ 2019-05-13 13:59       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-13 13:59 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 14:48, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:54)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> For simulating frame split workloads it is useful to express a batch which
>> ends at the same time as the parallel submission on the respective bonded
>> engine. For this we add support for infinite batch durations and the batch
>> terminate command ('T'). Syntax looks like this:
>>
>>    1.RCS.*.0.0
>>    T.-1
>>
>> First step starts an infinite batch, and second command terminates the
>> infinite batch with the usual relative workload step addressing.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c                  | 119 +++++++++++++++++++------
>>   benchmarks/wsim/README                 |   9 +-
>>   benchmarks/wsim/frame-split-60fps.wsim |   6 +-
>>   3 files changed, 102 insertions(+), 32 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index cc6f4a742c12..97821b723b02 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -86,6 +86,7 @@ enum w_type
>>          ENGINE_MAP,
>>          LOAD_BALANCE,
>>          BOND,
>> +       TERMINATE,
>>   };
>>   
>>   struct deps
>> @@ -113,6 +114,7 @@ struct w_step
>>          unsigned int context;
>>          unsigned int engine;
>>          struct duration duration;
>> +       bool unbound_duration;
>>          struct deps data_deps;
>>          struct deps fence_deps;
>>          int emit_fence;
>> @@ -143,7 +145,7 @@ struct w_step
>>   
>>          struct drm_i915_gem_execbuffer2 eb;
>>          struct drm_i915_gem_exec_object2 *obj;
>> -       struct drm_i915_gem_relocation_entry reloc[4];
>> +       struct drm_i915_gem_relocation_entry reloc[5];
>>          unsigned long bb_sz;
>>          uint32_t bb_handle;
>>          uint32_t *seqno_value;
>> @@ -153,6 +155,7 @@ struct w_step
>>          uint32_t *rt1_address;
>>          uint32_t *latch_value;
>>          uint32_t *latch_address;
>> +       uint32_t *recursive_bb_start;
>>   };
>>   
>>   DECLARE_EWMA(uint64_t, rt, 4, 2)
>> @@ -491,6 +494,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>   
>>                                  step.type = ENGINE_MAP;
>>                                  goto add_step;
>> +                       } else if (!strcmp(field, "T")) {
>> +                               int_field(TERMINATE, target,
>> +                                         tmp >= 0 || ((int)nr_steps + tmp) < 0,
>> +                                         "Invalid terminate target at step %u!\n");
>>                          } else if (!strcmp(field, "X")) {
>>                                  unsigned int nr = 0;
>>                                  while ((field = strtok_r(fstart, ".", &fctx))) {
>> @@ -605,23 +612,28 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>   
>>                          fstart = NULL;
>>   
>> -                       tmpl = strtol(field, &sep, 10);
>> -                       check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
>> -                                 tmpl == LONG_MAX,
>> -                                 "Invalid duration at step %u!\n", nr_steps);
>> -                       step.duration.min = tmpl;
>> -
>> -                       if (sep && *sep == '-') {
>> -                               tmpl = strtol(sep + 1, NULL, 10);
>> -                               check_arg(tmpl <= 0 ||
>> -                                         tmpl <= step.duration.min ||
>> -                                         tmpl == LONG_MIN ||
>> +                       if (field[0] == '*') {
>> +                               step.unbound_duration = true;
>> +                       } else {
>> +                               tmpl = strtol(field, &sep, 10);
>> +                               check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
>>                                            tmpl == LONG_MAX,
>> -                                         "Invalid duration range at step %u!\n",
>> +                                         "Invalid duration at step %u!\n",
>>                                            nr_steps);
>> -                               step.duration.max = tmpl;
>> -                       } else {
>> -                               step.duration.max = step.duration.min;
>> +                               step.duration.min = tmpl;
>> +
>> +                               if (sep && *sep == '-') {
>> +                                       tmpl = strtol(sep + 1, NULL, 10);
>> +                                       check_arg(tmpl <= 0 ||
>> +                                               tmpl <= step.duration.min ||
>> +                                               tmpl == LONG_MIN ||
>> +                                               tmpl == LONG_MAX,
>> +                                               "Invalid duration range at step %u!\n",
>> +                                               nr_steps);
>> +                                       step.duration.max = tmpl;
>> +                               } else {
>> +                                       step.duration.max = step.duration.min;
>> +                               }
>>                          }
>>   
>>                          valid++;
>> @@ -781,7 +793,7 @@ init_bb(struct w_step *w, unsigned int flags)
>>          unsigned int i;
>>          uint32_t *ptr;
>>   
>> -       if (!arb_period)
>> +       if (w->unbound_duration || !arb_period)
>>                  return;
>>   
>>          gem_set_domain(fd, w->bb_handle,
>> @@ -801,6 +813,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>          const uint32_t bbe = 0xa << 23;
>>          unsigned long mmap_start, mmap_len;
>>          unsigned long batch_start = w->bb_sz;
>> +       unsigned int r = 0;
>>          uint32_t *ptr, *cs;
>>   
>>          igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
>> @@ -811,6 +824,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>          if (flags & RT)
>>                  batch_start -= 12 * sizeof(uint32_t);
>>   
>> +       if (w->unbound_duration)
>> +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
>> +
>>          mmap_start = rounddown(batch_start, PAGE_SIZE);
>>          mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
>>   
>> @@ -820,8 +836,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>          ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
>>          cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
>>   
>> +       if (w->unbound_duration) {
>> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
>> +               batch_start += 4 * sizeof(uint32_t);
>> +
>> +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
>> +               w->recursive_bb_start = cs;
>> +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
>> +               *cs++ = 0;
>> +               *cs++ = 0;
> 
> Hmm. Have we previously checked for gen >= 8?

No, will add.

> So preemption check interval is given by batch_start - mmap_start.
> Which is limited to a max of 64 bytes. That might be a bit excessive on
> the frequency of doing MI_BB_START, certainly for gen7, gen8+ is a tad
> more forgiving i.e. it has more bw and doesn't starve the cpu as much.

Nope, mmap_start does not control the batch buffer at all. It only 
serves to locate the calculated batch_start, given that mmap() was 
passed a start address rounded down to PAGE_SIZE. The actual preemption 
check interval is one MI_NOOP. /o\ How much would you recommend to be safe?
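
To spell out the arithmetic, here is a minimal standalone sketch. The
helper names and the fixed 4096 byte page size are assumptions made for
the illustration only; the patch itself uses rounddown() and PAGE_SIZE.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size for this illustration */

/* Round a byte offset down to the start of the page containing it. */
static unsigned long page_rounddown(unsigned long offset)
{
	return offset & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Byte offset of batch_start inside a mapping which begins at the page
 * boundary at or below it. This is only where the CPU pointer lands in
 * the mapping; it says nothing about how many commands the GPU executes
 * between preemption checks.
 */
static unsigned long offset_in_mapping(unsigned long batch_start)
{
	unsigned long mmap_start = page_rounddown(batch_start);

	assert(mmap_start <= batch_start);

	return batch_start - mmap_start;
}
```

So batch_start - mmap_start is simply the displacement applied to the
mapped pointer when computing cs.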

>> +       }
>> +
>>          if (flags & SEQNO) {
>> -               w->reloc[0].offset = batch_start + sizeof(uint32_t);
>> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>>                  batch_start += 4 * sizeof(uint32_t);
>>   
>>                  *cs++ = MI_STORE_DWORD_IMM;
>> @@ -833,7 +860,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>          }
>>   
>>          if (flags & RT) {
>> -               w->reloc[1].offset = batch_start + sizeof(uint32_t);
>> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>>                  batch_start += 4 * sizeof(uint32_t);
>>   
>>                  *cs++ = MI_STORE_DWORD_IMM;
>> @@ -843,7 +870,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>                  w->rt0_value = cs;
>>                  *cs++ = 0;
>>   
>> -               w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
>> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
>>                  batch_start += 4 * sizeof(uint32_t);
>>   
>>                  *cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
>> @@ -852,7 +879,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>                  *cs++ = 0;
>>                  *cs++ = 0;
>>   
>> -               w->reloc[3].offset = batch_start + sizeof(uint32_t);
>> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>>                  batch_start += 4 * sizeof(uint32_t);
>>   
>>                  *cs++ = MI_STORE_DWORD_IMM;
>> @@ -984,19 +1011,28 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
>>                  }
>>          }
>>   
>> -       w->bb_sz = get_bb_sz(w->duration.max);
>> -       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
>> +       if (w->unbound_duration)
>> +               /* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
>> +               w->bb_sz = max(64, get_bb_sz(w->preempt_us)) +
>> +                          (1 + 3) * sizeof(uint32_t);
>> +       else
>> +               w->bb_sz = get_bb_sz(w->duration.max);
>> +       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
>>          init_bb(w, flags);
>>          terminate_bb(w, flags);
>>   
>> -       if (flags & SEQNO) {
>> +       if ((flags & SEQNO) || w->unbound_duration) {
>>                  w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
>> +               if (flags & SEQNO)
>> +                       w->obj[j].relocation_count++;
>>                  if (flags & RT)
>> -                       w->obj[j].relocation_count = 4;
>> -               else
>> -                       w->obj[j].relocation_count = 1;
>> +                       w->obj[j].relocation_count += 3;
>> +               if (w->unbound_duration)
>> +                       w->obj[j].relocation_count++;
> 
> Huh, I expected to see w->obj[j].relocation_count = r;
> Already out of scope?

In a helper, yes. At the risk of having confused myself about what's 
what, I think I could make the helper return the count.
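
Something along these lines, as a sketch only: count_relocs() and the
SKETCH_* flag bits are hypothetical stand-ins, and the per-feature counts
are the ones the hunk above adds up by hand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; gem_wsim defines its own values. */
#define SKETCH_SEQNO (1u << 0)
#define SKETCH_RT    (1u << 1)

/* Number of relocation entries a batch needs for the given features. */
static unsigned int count_relocs(unsigned int flags, bool unbound_duration)
{
	unsigned int r = 0;

	/* RT is only valid together with SEQNO, as asserted in terminate_bb(). */
	assert(!(flags & SKETCH_RT) || (flags & SKETCH_SEQNO));

	if (unbound_duration)
		r++;	/* recursive MI_BATCH_BUFFER_START */
	if (flags & SKETCH_SEQNO)
		r++;	/* seqno store */
	if (flags & SKETCH_RT)
		r += 3;	/* three RT related writes */

	return r;
}
```

The maximum is five, which matches the reloc[5] array size in the patch.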

> 
>>                  for (i = 0; i < w->obj[j].relocation_count; i++)
>>                          w->reloc[i].target_handle = 1;
>> +               if (w->unbound_duration)
>> +                       w->reloc[0].target_handle = j;
> 
> Ok, recursive BB_START.
>>          }
>>   
>>          w->eb.buffers_ptr = to_user_pointer(w->obj);
>> @@ -2036,6 +2072,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
>>          }
>>   }
>>   
>> +static void
>> +update_bb_start(struct w_step *w)
>> +{
>> +       if (!w->unbound_duration)
>> +               return;
>> +
>> +       gem_set_domain(fd, w->bb_handle,
>> +                      I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
> 
> Hmm. A scary sync point. Do you just want to be sure you have flushed
> the previous user?

Yes. By definition one infinite batch runs at most once per frame. If it 
has been terminated the code needs to reinstate the BB_START, so I thought 
I need to be sure it is not executing before I do that. I guess if 
someone forgot to terminate it this would hang on the second loop. But I 
think that's better than just carrying on with a potentially no-op 
instead of an infinite batch.
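
The ordering described above can be modelled without a GPU. This is only
a sketch: struct inf_batch, its busy flag and both helpers are
hypothetical, and returning false stands in for the point where the real
code would block in gem_set_domain(). The command encodings are the ones
used in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MI_BATCH_BUFFER_END   (0xau << 23)
#define MI_BATCH_BUFFER_START (0x31u << 23)

/* Hypothetical model of one infinite batch step. */
struct inf_batch {
	uint32_t loop_cmd; /* stands in for *w->recursive_bb_start */
	bool busy;         /* stands in for "the GPU is still executing it" */
};

/* 'T' step: overwrite the loop command so the GPU falls out of the loop. */
static void terminate_batch(struct inf_batch *b)
{
	b->loop_cmd = MI_BATCH_BUFFER_END;
	__sync_synchronize(); /* publish the store before carrying on */
	b->busy = false;      /* model: the GPU notices and retires the batch */
}

/*
 * Before resubmission: refuse to re-arm while the previous run is still
 * in flight, otherwise restore the self-referencing BB_START.
 */
static bool rearm_batch(struct inf_batch *b)
{
	if (b->busy)
		return false;

	b->loop_cmd = MI_BATCH_BUFFER_START | (1 << 8) | 1;
	b->busy = true;
	return true;
}
```

A workload that forgets the 'T' step corresponds to calling
rearm_batch() twice in a row: the second call cannot make progress,
which matches the hang on the second loop mentioned above.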

> 
>> +       *w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
>> +}
>> +
>>   static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
>>   {
>>          if (target < 0)
>> @@ -2171,9 +2219,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
>>          if (flags & RT)
>>                  update_bb_rt(w, engine, seqno);
>>   
>> +       update_bb_start(w);
>> +
>>          w->eb.batch_start_offset =
>> +               w->unbound_duration ?
>> +               0 :
>>                  ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
>> -                       2 * sizeof(uint32_t));
>> +                     2 * sizeof(uint32_t));
>>   
>>          for (i = 0; i < w->fence_deps.nr; i++) {
>>                  int tgt = w->idx + w->fence_deps.list[i];
>> @@ -2313,6 +2365,17 @@ static void *run_workload(void *data)
>>                                                                      w->priority;
>>                                  }
>>                                  continue;
>> +                       } else if (w->type == TERMINATE) {
>> +                               unsigned int t_idx = i + w->target;
>> +
>> +                               igt_assert(t_idx >= 0 && t_idx < i);
>> +                               igt_assert(wrk->steps[t_idx].type == BATCH);
>> +                               igt_assert(wrk->steps[t_idx].unbound_duration);
>> +
>> +                               *wrk->steps[t_idx].recursive_bb_start =
>> +                                       MI_BATCH_BUFFER_END;
>> +                               __sync_synchronize();
>> +                               continue;
>>                          } else if (w->type == PREEMPTION ||
>>                                     w->type == ENGINE_MAP ||
>>                                     w->type == LOAD_BALANCE ||
>> diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
>> index 6aec718bc812..c94d01018419 100644
>> --- a/benchmarks/wsim/README
>> +++ b/benchmarks/wsim/README
>> @@ -2,11 +2,11 @@ Workload descriptor format
>>   ==========================
>>   
>>   ctx.engine.duration_us.dependency.wait,...
>> -<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
>> +<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
>>   B.<uint>
>>   M.<uint>.<str>[|<str>]...
>>   P|X.<uint>.<int>
>> -d|p|s|t|q|a.<int>,...
>> +d|p|s|t|q|a|T.<int>,...
>>   b.<uint>.<uint>.<str>
>>   f
>>   
>> @@ -30,6 +30,7 @@ Additional workload steps are also supported:
>>    'b' - Set up engine bonds.
>>    'M' - Set up engine map.
>>    'P' - Context priority.
>> + 'T' - Terminate an infinite batch.
>>    'X' - Context preemption control.
>>   
>>   Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
>> @@ -77,6 +78,10 @@ Example:
>>   
>>   I this case the last step has a data dependency on both first and second steps.
>>   
>> +Batch durations can also be specified as infinite by using the '*' in the
>> +duration field. Such batches must be ended by the terminate command ('T')
>> +otherwise they will cause a GPU hang to be reported.
>> +
>>   Sync (fd) fences
>>   ----------------
>>   
>> diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
>> index cfbfcd39be7d..ea89da3add48 100644
>> --- a/benchmarks/wsim/frame-split-60fps.wsim
>> +++ b/benchmarks/wsim/frame-split-60fps.wsim
>> @@ -6,10 +6,12 @@ M.2.VCS2
>>   B.2
>>   b.2.1.VCS1
>>   f
>> -1.DEFAULT.4000-6000.f-1.0
>> +1.DEFAULT.*.f-1.0
>>   2.DEFAULT.4000-6000.s-1.0
>>   a.-3
>> -3.RCS.2000-4000.-3/-2.0
>> +s.-2
>> +T.-4
>> +3.RCS.2000-4000.-5/-4.0
>>   3.VECS.2000.-1.0
>>   4.BCS.1000.-1.0
>>   s.-2
> 
> Usecase looks reasonable.
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 17/21] gem_wsim: Infinite batch support
  2019-05-13 13:59       ` Tvrtko Ursulin
@ 2019-05-13 14:11         ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-13 14:11 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-13 14:59:01)
> 
> On 10/05/2019 14:48, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-08 13:10:54)
> > So preemption check interval is given by batch_start - mmap_start.
> > Which is limited to a max of 64 bytes. That might be a bit excessive on
> > the frequency of doing MI_BB_START, certainly for gen7, gen8+ is a tad
> > more forgiving i.e. it has more bw and doesn't starve the cpu as much.
> 
> Nope, mmap_start is not controlling the batch buffer at all. It is just 
> to find the calculated batch_start given that mmap() was given a 
> round-down PAGE_ALIGN start address. Actual preemption check interval is 
> one MI_NOOP. /o\ How much would you recommend to be safe?

We've been using 64 bytes routinely without too much hassle, but that
can be noticeable. For the dummyload we use roughly the full page and
that seems ok, with a few microseconds of extra latency. If that's
tolerable, I would opt for trying to use a full page for the recursive
batch. Alternatively, we can use a MI_SEMA_WAIT | POLL on a user
address (just throwing it out there as an option).
-Chris
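
For reference, a rough sketch of what the "full page" variant could look
like, keeping the tail layout from the patch (MI_ARB_CHK, then BB_START
with two address dwords). build_recursive_page() is a hypothetical
helper and the page is a plain array here rather than a WC mmap of the
batch object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MI_NOOP               0u
#define MI_ARB_CHK            (0x5u << 23)
#define MI_BATCH_BUFFER_START (0x31u << 23)

/*
 * Fill one page with MI_NOOPs and place the arbitration check plus the
 * self-referencing BB_START in the last four dwords. Returns the byte
 * offset of the BB_START address dwords, i.e. where the relocation
 * pointing back at the start of the page would go.
 */
static size_t build_recursive_page(uint32_t *page, size_t page_bytes)
{
	const size_t total = page_bytes / sizeof(uint32_t);
	const size_t tail = total - 4; /* ARB_CHK + BB_START + 2 address dwords */
	size_t i;

	for (i = 0; i < tail; i++)
		page[i] = MI_NOOP;

	page[tail + 0] = MI_ARB_CHK;
	page[tail + 1] = MI_BATCH_BUFFER_START | (1 << 8) | 1;
	page[tail + 2] = 0; /* address low, patched by the relocation */
	page[tail + 3] = 0; /* address high */

	return (tail + 2) * sizeof(uint32_t);
}
```

With a 4096 byte page this gives 1020 nops per loop iteration instead of
one, so the BB_START is re-executed far less often.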
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 19/21] gem_wsim: Per context SSEU control
  2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-14 21:53     ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-14 21:53 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-08 13:10:56)
> +static void get_device_sseu(void)
> +{
> +       struct drm_i915_gem_context_param param = { };
> +
> +       param.param = I915_CONTEXT_PARAM_SSEU;
> +       param.value = (uintptr_t)&device_sseu;
> +
> +       gem_context_get_param(fd, &param);

This is an annoying assert that prevents running on v4.19. Looks fine to
fail?
-Chris
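
If I read the suggestion right, the probe would treat an error as "no
SSEU control on this kernel" rather than asserting. A sketch under that
assumption follows; struct sseu_info, the sseu_query_fn callback and the
two example queries are all hypothetical stand-ins so the snippet stays
standalone. In IGT proper the natural candidate would be the
non-asserting __gem_context_get_param(), which returns 0 or a negative
errno:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uapi SSEU struct and the query call. */
struct sseu_info {
	uint64_t slice_mask;
};

typedef int (*sseu_query_fn)(struct sseu_info *out); /* 0 or -errno */

static struct sseu_info device_sseu;
static bool has_sseu;

/* Probe once; an error means "no SSEU control", not a fatal assert. */
static bool get_device_sseu(sseu_query_fn query)
{
	struct sseu_info tmp = { 0 };

	has_sseu = query(&tmp) == 0;
	if (has_sseu)
		device_sseu = tmp;

	return has_sseu;
}

/* Example query modelling a kernel without the context parameter. */
static int query_unsupported(struct sseu_info *out)
{
	(void)out;
	return -EINVAL;
}

/* Example query modelling a kernel that supports it. */
static int query_supported(struct sseu_info *out)
{
	out->slice_mask = 0x7;
	return 0;
}
```

Code that later wants to apply a per-context SSEU configuration would
then check has_sseu first and skip, or report an error, when it is
false.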
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 14/21] gem_wsim: Engine map load balance command
  2019-05-10 13:31     ` Chris Wilson
@ 2019-05-15 11:44       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-15 11:44 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 10/05/2019 14:31, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A new workload command for enabling a load balanced context map (aka
>> Virtual Engine). Example usage:
>>
>>    B.1
>>
>> This turns on load balancing for context one, assuming it has already been
>> configured with an engine map. Only DEFAULT engine specifier can be used
>> with load balanced engine maps.
> 
> Restriction makes sense for keeping linenoise^W file format simple.
> 
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>                  if (ctx->engine_map) {
>>                          I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>>                                                            ctx->engine_map_count + 1);
>> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
>> +                                                                ctx->engine_map_count);
>>                          struct drm_i915_gem_context_param param = {
>>                                  .ctx_id = ctx_id,
>>                                  .param = I915_CONTEXT_PARAM_ENGINES,
>> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>                                  .value = to_user_pointer(&set_engines),
>>                          };
>>   
>> -                       set_engines.extensions = 0;
>> +                       if (ctx->wants_balance) {
>> +                               set_engines.extensions =
>> +                                       to_user_pointer(&load_balance);
>> +
>> +                               memset(&load_balance, 0, sizeof(load_balance));
>> +                               load_balance.base.name =
>> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
>> +                               load_balance.num_siblings =
>> +                                       ctx->engine_map_count;
>> +
>> +                               for (j = 0; j < ctx->engine_map_count; j++) {
>> +                                       load_balance.engines[j].engine_class =
>> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
>> +                                       load_balance.engines[j].engine_instance =
>> +                                               ctx->engine_map[j] - VCS1; /* FIXME */
> 
> Ok, more fallout from fixing ctx->engine_map[] first?

Not sure I understand the question.

I am at the moment updating the series with review feedback and some 
small things here and there. When done with that I'll see if these 
hardcoded VCS assumptions can be easily solved. Basically I will have a go 
at integrating engine discovery, which I think is definitely needed now 
that I have added class based engine map building ability.
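
As a stop-gap illustration of what replacing the "- VCS1" arithmetic
could look like, here is a static lookup from wsim engine id to
class/instance. This is only the lookup half and not engine discovery
itself, which would query the kernel. The enum order, the struct and
engine_to_class_instance() are assumptions made for the sketch; the
class values mirror the i915 uapi ones but are redefined locally:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* wsim engine ids as listed in the README; the enum order is assumed. */
enum wsim_engine { DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS, NUM_ENGINES };

/* Mirrors the i915 uapi engine classes; redefined to stay standalone. */
enum {
	CLASS_RENDER = 0,
	CLASS_COPY = 1,
	CLASS_VIDEO = 2,
	CLASS_VIDEO_ENHANCE = 3,
};

struct class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/*
 * Physical engines only. DEFAULT and VCS are treated as aliases that
 * must be resolved elsewhere (an assumption of this sketch).
 */
static bool
engine_to_class_instance(enum wsim_engine e, struct class_instance *ci)
{
	static const struct {
		bool valid;
		struct class_instance ci;
	} map[NUM_ENGINES] = {
		[RCS]  = { true, { CLASS_RENDER, 0 } },
		[BCS]  = { true, { CLASS_COPY, 0 } },
		[VCS1] = { true, { CLASS_VIDEO, 0 } },
		[VCS2] = { true, { CLASS_VIDEO, 1 } },
		[VECS] = { true, { CLASS_VIDEO_ENHANCE, 0 } },
	};

	if ((unsigned int)e >= NUM_ENGINES || !map[e].valid)
		return false;

	*ci = map[e].ci;
	return true;
}
```

The load balance loop could then fill engine_class and engine_instance
from the looked-up pair, and an engine map containing a non-video engine
would no longer be silently mislabelled as I915_ENGINE_CLASS_VIDEO.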

Regards,

Tvrtko

> Otherwise looks fine.
> -Chris
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 126+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 14/21] gem_wsim: Engine map load balance command
@ 2019-05-15 11:44       ` Tvrtko Ursulin
  0 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-15 11:44 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin


On 10/05/2019 14:31, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A new workload command for enabling a load balanced context map (aka
>> Virtual Engine). Example usage:
>>
>>    B.1
>>
>> This turns on load balancing for context one, assuming it has already been
>> configured with an engine map. Only DEFAULT engine specifier can be used
>> with load balanced engine maps.
> 
> Restriction makes sense for keeping linenoise^W file format simple.
> 
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>                  if (ctx->engine_map) {
>>                          I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>>                                                            ctx->engine_map_count + 1);
>> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
>> +                                                                ctx->engine_map_count);
>>                          struct drm_i915_gem_context_param param = {
>>                                  .ctx_id = ctx_id,
>>                                  .param = I915_CONTEXT_PARAM_ENGINES,
>> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>                                  .value = to_user_pointer(&set_engines),
>>                          };
>>   
>> -                       set_engines.extensions = 0;
>> +                       if (ctx->wants_balance) {
>> +                               set_engines.extensions =
>> +                                       to_user_pointer(&load_balance);
>> +
>> +                               memset(&load_balance, 0, sizeof(load_balance));
>> +                               load_balance.base.name =
>> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
>> +                               load_balance.num_siblings =
>> +                                       ctx->engine_map_count;
>> +
>> +                               for (j = 0; j < ctx->engine_map_count; j++) {
>> +                                       load_balance.engines[j].engine_class =
>> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
>> +                                       load_balance.engines[j].engine_instance =
>> +                                               ctx->engine_map[j] - VCS1; /* FIXME */
> 
> Ok, more fallout from fixing ctx->engine_map[] first?

Not sure I understand the question.

I am currently updating the series with review feedback and a few small 
fixes here and there. When done with that I'll see if these hardcoded VCS 
assumptions can be easily resolved. Basically I will have a go at 
integrating engine discovery, which I think is definitely needed now that 
I have added the ability to build engine maps by class.
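
For illustration only, here is a minimal sketch of what dropping the two 
FIXMEs could look like: a lookup from a wsim engine id to a class:instance 
pair, instead of assuming I915_ENGINE_CLASS_VIDEO and subtracting VCS1. 
Everything named wsim_* or sketch_* is made up for this example, including 
the enum ordering. The class values mirror the i915 uapi ones but are 
defined locally so the snippet builds on its own. It is not what the 
series does today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the I915_ENGINE_CLASS_* uapi values. */
enum sketch_engine_class {
	SKETCH_CLASS_RENDER = 0,
	SKETCH_CLASS_COPY = 1,
	SKETCH_CLASS_VIDEO = 2,
	SKETCH_CLASS_VIDEO_ENHANCE = 3,
};

/* Hypothetical wsim engine ids; the real enum and its ordering may differ. */
enum wsim_engine {
	WSIM_DEFAULT,
	WSIM_RCS,
	WSIM_BCS,
	WSIM_VCS,	/* "any video engine", no single physical engine */
	WSIM_VCS1,
	WSIM_VCS2,
	WSIM_VECS,
};

struct sketch_ci {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Returns 0 and fills *out, or -1 if id names no single physical engine. */
static int wsim_engine_to_ci(enum wsim_engine id, struct sketch_ci *out)
{
	assert(out);

	switch (id) {
	case WSIM_RCS:
		*out = (struct sketch_ci){ SKETCH_CLASS_RENDER, 0 };
		return 0;
	case WSIM_BCS:
		*out = (struct sketch_ci){ SKETCH_CLASS_COPY, 0 };
		return 0;
	case WSIM_VCS1:
		*out = (struct sketch_ci){ SKETCH_CLASS_VIDEO, 0 };
		return 0;
	case WSIM_VCS2:
		*out = (struct sketch_ci){ SKETCH_CLASS_VIDEO, 1 };
		return 0;
	case WSIM_VECS:
		*out = (struct sketch_ci){ SKETCH_CLASS_VIDEO_ENHANCE, 0 };
		return 0;
	default:
		return -1;
	}
}

/* Fill the sibling array from an engine map, one class:instance per entry. */
static int fill_siblings(const enum wsim_engine *map, size_t count,
			 struct sketch_ci *siblings)
{
	for (size_t j = 0; j < count; j++) {
		if (wsim_engine_to_ci(map[j], &siblings[j]))
			return -1;
	}

	return 0;
}
```

With a table like this a VCS1|VCS2 map yields video:0 and video:1, and the 
loop itself carries no class assumption. The instances are still static 
here, which is the part engine discovery would need to replace.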

Regards,

Tvrtko

> Otherwise looks fine.
> -Chris
> 
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* Re: [igt-dev] [PATCH i-g-t 14/21] gem_wsim: Engine map load balance command
  2019-05-15 11:44       ` Tvrtko Ursulin
@ 2019-05-15 11:48         ` Chris Wilson
  -1 siblings, 0 replies; 126+ messages in thread
From: Chris Wilson @ 2019-05-15 11:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-15 12:44:41)
> 
> On 10/05/2019 14:31, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> A new workload command for enabling a load balanced context map (aka
> >> Virtual Engine). Example usage:
> >>
> >>    B.1
> >>
> >> This turns on load balancing for context one, assuming it has already been
> >> configured with an engine map. Only DEFAULT engine specifier can be used
> >> with load balanced engine maps.
> > 
> > Restriction makes sense for keeping linenoise^W file format simple.
> > 
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
> >>                  if (ctx->engine_map) {
> >>                          I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
> >>                                                            ctx->engine_map_count + 1);
> >> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
> >> +                                                                ctx->engine_map_count);
> >>                          struct drm_i915_gem_context_param param = {
> >>                                  .ctx_id = ctx_id,
> >>                                  .param = I915_CONTEXT_PARAM_ENGINES,
> >> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
> >>                                  .value = to_user_pointer(&set_engines),
> >>                          };
> >>   
> >> -                       set_engines.extensions = 0;
> >> +                       if (ctx->wants_balance) {
> >> +                               set_engines.extensions =
> >> +                                       to_user_pointer(&load_balance);
> >> +
> >> +                               memset(&load_balance, 0, sizeof(load_balance));
> >> +                               load_balance.base.name =
> >> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
> >> +                               load_balance.num_siblings =
> >> +                                       ctx->engine_map_count;
> >> +
> >> +                               for (j = 0; j < ctx->engine_map_count; j++) {
> >> +                                       load_balance.engines[j].engine_class =
> >> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
> >> +                                       load_balance.engines[j].engine_instance =
> >> +                                               ctx->engine_map[j] - VCS1; /* FIXME */
> > 
> > Ok, more fallout from fixing ctx->engine_map[] first?
> 
> Not sure I understand the question.

The proliferation of FIXMEs, the assumption of CLASS_VIDEO and an
impedance mismatch between engine_map and class:instance. Basically,
those FIXMEs raise the question: what do you intend this to look like?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [igt-dev] [PATCH i-g-t 14/21] gem_wsim: Engine map load balance command
  2019-05-15 11:48         ` Chris Wilson
@ 2019-05-15 11:55           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 126+ messages in thread
From: Tvrtko Ursulin @ 2019-05-15 11:55 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 15/05/2019 12:48, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-15 12:44:41)
>>
>> On 10/05/2019 14:31, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-05-08 13:10:51)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> A new workload command for enabling a load balanced context map (aka
>>>> Virtual Engine). Example usage:
>>>>
>>>>     B.1
>>>>
>>>> This turns on load balancing for context one, assuming it has already been
>>>> configured with an engine map. Only DEFAULT engine specifier can be used
>>>> with load balanced engine maps.
>>>
>>> Restriction makes sense for keeping linenoise^W file format simple.
>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>> @@ -1172,6 +1210,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>>>                   if (ctx->engine_map) {
>>>>                           I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
>>>>                                                             ctx->engine_map_count + 1);
>>>> +                       I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
>>>> +                                                                ctx->engine_map_count);
>>>>                           struct drm_i915_gem_context_param param = {
>>>>                                   .ctx_id = ctx_id,
>>>>                                   .param = I915_CONTEXT_PARAM_ENGINES,
>>>> @@ -1179,7 +1219,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>>>                                   .value = to_user_pointer(&set_engines),
>>>>                           };
>>>>    
>>>> -                       set_engines.extensions = 0;
>>>> +                       if (ctx->wants_balance) {
>>>> +                               set_engines.extensions =
>>>> +                                       to_user_pointer(&load_balance);
>>>> +
>>>> +                               memset(&load_balance, 0, sizeof(load_balance));
>>>> +                               load_balance.base.name =
>>>> +                                       I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
>>>> +                               load_balance.num_siblings =
>>>> +                                       ctx->engine_map_count;
>>>> +
>>>> +                               for (j = 0; j < ctx->engine_map_count; j++) {
>>>> +                                       load_balance.engines[j].engine_class =
>>>> +                                               I915_ENGINE_CLASS_VIDEO; /* FIXME */
>>>> +                                       load_balance.engines[j].engine_instance =
>>>> +                                               ctx->engine_map[j] - VCS1; /* FIXME */
>>>
>>> Ok, more fallout from fixing ctx->engine_map[] first?
>>
>> Not sure I understand the question.
> 
> The proliferation of FIXME, the assumption of CLASS_VIDEO and an
> impedance mismatch between engine_map and class:instance. Basically
> those FIXME raise the question of what do you intend this to look like?

The intention is that implicit and explicit engine maps get populated 
with the available engines.

When "-b i915":

1.VCS.1000.0.0 -> implicit map of available vcs engines

M.1.VCS
B.1	\-> explicit map of available vcs engines

That would support Icelake vcs0+vcs2 SKUs, and wsims with explicit engine 
maps would be more portable, like the original ones were.

Also, I am contemplating making VCS2 in wsim mean the second available 
VCS engine, i.e. a logical rather than a physical instance. So:

M.1.VCS1|VCS2 -> also works on both SKL and ICL (two vcs SKUs)
B.1
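
A rough sketch of the logical instance idea, assuming the discovered 
engines are available as a list of class:instance pairs sorted by instance 
within each class. The struct and helper names are hypothetical, and how 
the list gets filled in (for example from the engine discovery query) is 
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CLASS_VIDEO 2	/* mirrors I915_ENGINE_CLASS_VIDEO */

struct sketch_engine {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/*
 * Resolve a logical, 1-based index within a class to a discovered engine.
 * Returns NULL when the platform has fewer than nth engines of that class.
 */
static const struct sketch_engine *
nth_engine_of_class(const struct sketch_engine *engines, size_t count,
		    uint16_t engine_class, unsigned int nth)
{
	assert(nth >= 1);

	for (size_t i = 0; i < count; i++) {
		if (engines[i].engine_class != engine_class)
			continue;
		if (--nth == 0)
			return &engines[i];
	}

	return NULL;
}

/* Collect every engine of a class, as a bare "VCS" map entry would. */
static size_t
all_engines_of_class(const struct sketch_engine *engines, size_t count,
		     uint16_t engine_class,
		     struct sketch_engine *out, size_t out_len)
{
	size_t n = 0;

	for (size_t i = 0; i < count && n < out_len; i++) {
		if (engines[i].engine_class == engine_class)
			out[n++] = engines[i];
	}

	return n;
}
```

Under these assumptions "VCS2" resolves to video instance 1 on a two-vcs 
SKL and to video instance 2 on a vcs0+vcs2 ICL, while "M.1.VCS" picks up 
both engines on either platform.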

Regards,

Tvrtko

end of thread, other threads:[~2019-05-15 11:55 UTC | newest]

Thread overview: 126+ messages
2019-05-08 12:10 [PATCH i-g-t 00/21] Media scalability tooling Tvrtko Ursulin
2019-05-08 12:10 ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 01/21] scripts/trace.pl: Fix after intel_engine_notify removal Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:17   ` Chris Wilson
2019-05-08 12:17     ` Chris Wilson
2019-05-09  9:27     ` Tvrtko Ursulin
2019-05-09  9:27       ` [Intel-gfx] " Tvrtko Ursulin
2019-05-10 12:33   ` Chris Wilson
2019-05-10 12:33     ` Chris Wilson
2019-05-13 12:16     ` Tvrtko Ursulin
2019-05-13 12:16       ` Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 02/21] headers: bump Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 03/21] trace.pl: Virtual engine support Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 12:52   ` Chris Wilson
2019-05-10 12:52     ` [igt-dev] " Chris Wilson
2019-05-13 12:30     ` Tvrtko Ursulin
2019-05-13 12:30       ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 04/21] trace.pl: Virtual engine preemption support Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 12:55   ` Chris Wilson
2019-05-10 12:55     ` [igt-dev] [Intel-gfx] " Chris Wilson
2019-05-13 12:38     ` Tvrtko Ursulin
2019-05-13 12:38       ` [igt-dev] [Intel-gfx] " Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 05/21] wsim/media-bench: i915 balancing Tvrtko Ursulin
2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-10 13:14   ` [igt-dev] " Chris Wilson
2019-05-10 13:14     ` Chris Wilson
2019-05-13 12:41     ` Tvrtko Ursulin
2019-05-13 12:41       ` Tvrtko Ursulin
2019-05-13 12:54       ` Chris Wilson
2019-05-13 12:54         ` Chris Wilson
2019-05-10 13:23   ` Chris Wilson
2019-05-10 13:23     ` [Intel-gfx] " Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 06/21] gem_wsim: Use IGT uapi headers Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 13:15   ` Chris Wilson
2019-05-10 13:15     ` Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 07/21] gem_wsim: Factor out common error handling Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 13:15   ` Chris Wilson
2019-05-10 13:15     ` Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 08/21] gem_wsim: More wsim_err Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 13:16   ` Chris Wilson
2019-05-10 13:16     ` [Intel-gfx] " Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 09/21] gem_wsim: Submit fence support Tvrtko Ursulin
2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-10 13:18   ` [igt-dev] " Chris Wilson
2019-05-10 13:18     ` Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 10/21] gem_wsim: Extract str to engine lookup Tvrtko Ursulin
2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-10 13:20   ` [igt-dev] " Chris Wilson
2019-05-10 13:20     ` Chris Wilson
2019-05-13 13:08     ` Tvrtko Ursulin
2019-05-13 13:08       ` Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 11/21] gem_wsim: Engine map support Tvrtko Ursulin
2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-10 13:26   ` [igt-dev] " Chris Wilson
2019-05-10 13:26     ` Chris Wilson
2019-05-13 13:18     ` Tvrtko Ursulin
2019-05-13 13:18       ` Tvrtko Ursulin
2019-05-13 13:29       ` Chris Wilson
2019-05-13 13:29         ` Chris Wilson
2019-05-13 13:40         ` Tvrtko Ursulin
2019-05-13 13:40           ` Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 12/21] gem_wsim: Save some lines by changing to implicit NULL checking Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 13:28   ` Chris Wilson
2019-05-10 13:28     ` Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 13/21] gem_wsim: Compact int command parsing with a macro Tvrtko Ursulin
2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-10 13:29   ` [igt-dev] " Chris Wilson
2019-05-10 13:29     ` Chris Wilson
2019-05-13 13:24     ` Tvrtko Ursulin
2019-05-13 13:24       ` Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 14/21] gem_wsim: Engine map load balance command Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 13:31   ` Chris Wilson
2019-05-10 13:31     ` Chris Wilson
2019-05-15 11:44     ` Tvrtko Ursulin
2019-05-15 11:44       ` Tvrtko Ursulin
2019-05-15 11:48       ` Chris Wilson
2019-05-15 11:48         ` Chris Wilson
2019-05-15 11:55         ` Tvrtko Ursulin
2019-05-15 11:55           ` Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 15/21] gem_wsim: Engine bond command Tvrtko Ursulin
2019-05-08 12:10   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-10 13:36   ` [igt-dev] " Chris Wilson
2019-05-10 13:36     ` Chris Wilson
2019-05-13 13:28     ` Tvrtko Ursulin
2019-05-13 13:28       ` Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 16/21] gem_wsim: Some more example workloads Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:27   ` Chris Wilson
2019-05-08 12:27     ` Chris Wilson
2019-05-08 13:50     ` Tvrtko Ursulin
2019-05-08 13:50       ` Tvrtko Ursulin
2019-05-08 13:56       ` Chris Wilson
2019-05-08 13:56         ` Chris Wilson
2019-05-08 14:16         ` Tvrtko Ursulin
2019-05-08 14:16           ` Tvrtko Ursulin
2019-05-10 13:37   ` Chris Wilson
2019-05-10 13:37     ` Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 17/21] gem_wsim: Infinite batch support Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-10 13:48   ` Chris Wilson
2019-05-10 13:48     ` Chris Wilson
2019-05-13 13:59     ` Tvrtko Ursulin
2019-05-13 13:59       ` Tvrtko Ursulin
2019-05-13 14:11       ` Chris Wilson
2019-05-13 14:11         ` Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 18/21] gem_wsim: Command line switch for specifying low slice count workloads Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 19/21] gem_wsim: Per context SSEU control Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-14 21:53   ` Chris Wilson
2019-05-14 21:53     ` Chris Wilson
2019-05-08 12:10 ` [PATCH i-g-t 20/21] gem_wsim: Allow RCS virtual engine with " Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:10 ` [PATCH i-g-t 21/21] tests/i915_query: Engine discovery tests Tvrtko Ursulin
2019-05-08 12:10   ` [igt-dev] " Tvrtko Ursulin
2019-05-08 12:53 ` [igt-dev] ✓ Fi.CI.BAT: success for Media scalability tooling (rev2) Patchwork
2019-05-08 16:01 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
