* [PATCH i-g-t 00/25] Media scalability tooling
@ 2019-05-17 11:25 ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This revision addresses review feedback and adds some new functionality, most
notably engine discovery, which removes a lot of hardcoded assumptions, and
speculative support for running on Icelake.

Tvrtko Ursulin (25):
  scripts/trace.pl: Fix after intel_engine_notify removal
  trace.pl: Ignore signaling on non i915 fences
  headers: bump
  trace.pl: Virtual engine support
  trace.pl: Virtual engine preemption support
  wsim/media-bench: i915 balancing
  gem_wsim: Use IGT uapi headers
  gem_wsim: Factor out common error handling
  gem_wsim: More wsim_err
  gem_wsim: Submit fence support
  gem_wsim: Extract str to engine lookup
  gem_wsim: Engine map support
  gem_wsim: Save some lines by changing to implicit NULL checking
  gem_wsim: Compact int command parsing with a macro
  gem_wsim: Engine map load balance command
  gem_wsim: Engine bond command
  gem_wsim: Some more example workloads
  gem_wsim: Infinite batch support
  gem_wsim: Command line switch for specifying low slice count workloads
  gem_wsim: Per context SSEU control
  gem_wsim: Allow RCS virtual engine with SSEU control
  tests/i915_query: Engine discovery tests
  gem_wsim: Consolidate engine assignments into helpers
  gem_wsim: Discover engines
  gem_wsim: Support Icelake parts

 benchmarks/gem_wsim.c                       | 1523 +++++++++++++++----
 benchmarks/wsim/README                      |  142 +-
 benchmarks/wsim/frame-split-60fps.wsim      |   18 +
 benchmarks/wsim/high-composited-game.wsim   |   11 +
 benchmarks/wsim/media-1080p-player.wsim     |    5 +
 benchmarks/wsim/medium-composited-game.wsim |    9 +
 include/drm-uapi/amdgpu_drm.h               |   52 +-
 include/drm-uapi/drm.h                      |   36 +
 include/drm-uapi/drm_mode.h                 |    4 +-
 include/drm-uapi/i915_drm.h                 |  209 ++-
 include/drm-uapi/lima_drm.h                 |  169 ++
 include/drm-uapi/msm_drm.h                  |   14 +
 include/drm-uapi/nouveau_drm.h              |   51 +
 include/drm-uapi/v3d_drm.h                  |   28 +
 scripts/media-bench.pl                      |    9 +-
 scripts/trace.pl                            |  319 +++-
 tests/i915/i915_query.c                     |  247 +++
 17 files changed, 2425 insertions(+), 421 deletions(-)
 create mode 100644 benchmarks/wsim/frame-split-60fps.wsim
 create mode 100644 benchmarks/wsim/high-composited-game.wsim
 create mode 100644 benchmarks/wsim/media-1080p-player.wsim
 create mode 100644 benchmarks/wsim/medium-composited-game.wsim
 create mode 100644 include/drm-uapi/lima_drm.h

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH i-g-t 01/25] scripts/trace.pl: Fix after intel_engine_notify removal
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

After the removal of engine global seqnos and the corresponding
intel_engine_notify tracepoints the script needs to be adjusted to cope
with the new state of things.

To keep working, it switches over to the dma_fence:dma_fence_signaled
tracepoint and keeps one extra internal map to connect the ctx-seqno pairs
with engines.

It also needs to key the completion events on the full engine/ctx/seqno
tokens, and to adjust the timeline sorting logic accordingly.
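
As an illustration of the keying described above, here is a minimal sketch in
C rather than the script's Perl. The names record_engine, make_key, ctx_engine
and MAX_CTX are hypothetical and do not appear in the patch. It assumes small
integer context ids and shows only the idea: a dma_fence signal carries just
context and seqno, so the engine has to be looked up from a map filled in when
the request was first seen on an engine.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CTX 16 /* hypothetical limit, for the sketch only */

/* Hypothetical ctx -> engine map, standing in for the script's %ctxengines. */
static const char *ctx_engine[MAX_CTX];

/*
 * Remember which engine a context was seen on. Returns -1 if the context
 * is out of range or was already seen on a different engine.
 */
static int record_engine(unsigned int ctx, const char *engine)
{
	assert(engine);

	if (ctx >= MAX_CTX)
		return -1;
	if (ctx_engine[ctx] && strcmp(ctx_engine[ctx], engine) != 0)
		return -1;

	ctx_engine[ctx] = engine;
	return 0;
}

/*
 * Build the full "engine/ctx/seqno" key for a fence signal that only
 * carries ctx and seqno. Returns -1 if the engine is not known yet or
 * the buffer is too small.
 */
static int make_key(char *buf, size_t len, unsigned int ctx, unsigned int seqno)
{
	int n;

	assert(buf && len > 0);

	if (ctx >= MAX_CTX || !ctx_engine[ctx])
		return -1;

	n = snprintf(buf, len, "%s/%u/%u", ctx_engine[ctx], ctx, seqno);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

In this sketch a caller would invoke record_engine() when a request enters an
engine and make_key() when the matching fence signal arrives. A failed lookup
corresponds to the cases where the script dies.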

v2:
 * Do not use late notifications (received after context complete) when
   splitting up coalesced requests. They are now much more likely and
   cannot be used.

v3:
 * Pull a hunk which moved forward during rebases back here.
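
The v2 rule on late notifications can be sketched as follows, again in C
instead of Perl. The struct layout and the names notify_usable and
split_coalesced are assumptions made for this example only. Timestamps are
plain integers in arbitrary units. The following request is shifted to start
at the notification time only when that notification arrived before the
current request's context complete event.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-request record; field names are for illustration only. */
struct request {
	unsigned long long start;        /* when the request entered the engine */
	unsigned long long end;          /* context complete timestamp */
	unsigned long long notify;       /* fence signaled timestamp, if any */
	unsigned long long engine_start; /* original start, kept after a shift */
	bool has_notify;
};

/* A notification helps only if it arrived before context complete. */
static bool notify_usable(const struct request *rq)
{
	return rq->has_notify && rq->notify < rq->end;
}

/*
 * Shift the following request to start when the current one signaled.
 * Returns true if the start time was adjusted, false if the notification
 * was missing or late and the request was left untouched.
 */
static bool split_coalesced(const struct request *cur, struct request *next)
{
	assert(cur && next);

	if (!notify_usable(cur))
		return false;

	next->engine_start = next->start;
	next->start = cur->notify;
	return true;
}
```

A late notification (notify at or after end) leaves the following request
unchanged, which is the behaviour the v2 note asks for.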

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 scripts/trace.pl | 66 ++++++++++++++++++++++--------------------------
 1 file changed, 30 insertions(+), 36 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 18f9f3b18396..b7bbabc79f68 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,7 +27,8 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines);
+my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
+    %ctxtimelines, %ctxengines);
 my @freqs;
 
 my $max_items = 3000;
@@ -66,7 +67,7 @@ Notes:
 			       i915:i915_request_submit, \
 			       i915:i915_request_in, \
 			       i915:i915_request_out, \
-			       i915:intel_engine_notify, \
+			       dma_fence:dma_fence_signaled, \
 			       i915:i915_request_wait_begin, \
 			       i915:i915_request_wait_end \
 			       [command-to-be-profiled]
@@ -161,7 +162,7 @@ sub arg_trace
 		       'i915:i915_request_submit',
 		       'i915:i915_request_in',
 		       'i915:i915_request_out',
-		       'i915:intel_engine_notify',
+		       'dma_fence:dma_fence_signaled',
 		       'i915:i915_request_wait_begin',
 		       'i915:i915_request_wait_end' );
 
@@ -312,13 +313,6 @@ sub db_key
 	return $ring . '/' . $ctx . '/' . $seqno;
 }
 
-sub global_key
-{
-	my ($ring, $seqno) = @_;
-
-	return $ring . '/' . $seqno;
-}
-
 sub sanitize_ctx
 {
 	my ($ctx, $ring) = @_;
@@ -419,6 +413,8 @@ while (<>) {
 		$req{'ring'} = $ring;
 		$req{'seqno'} = $seqno;
 		$req{'ctx'} = $ctx;
+		die if exists $ctxengines{$ctx} and $ctxengines{$ctx} ne $ring;
+		$ctxengines{$ctx} = $ring;
 		$ctxtimelines{$ctx . '/' . $ring} = 1;
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
@@ -429,16 +425,23 @@ while (<>) {
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
 	} elsif ($tp_name eq 'i915:i915_request_out:') {
-		my $gkey = global_key($ring, $tp{'global'});
+		my $gkey;
 
+		die unless exists $ctxengines{$ctx};
 		die unless exists $db{$key};
 		die unless exists $db{$key}->{'start'};
 		die if exists $db{$key}->{'end'};
 
+		$gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
+
 		$db{$key}->{'end'} = $time;
 		$db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey};
-	} elsif ($tp_name eq 'i915:intel_engine_notify:') {
-		my $gkey = global_key($ring, $seqno);
+	} elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
+		my $gkey;
+
+		die unless exists $ctxengines{$tp{'context'}};
+
+		$gkey = db_key($ctxengines{$tp{'context'}}, $tp{'context'}, $tp{'seqno'});
 
 		$notify{$gkey} = $time unless exists $notify{$gkey};
 	} elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
@@ -452,7 +455,7 @@ while (<>) {
 # find the largest seqno to be used for timeline sorting purposes.
 my $max_seqno = 0;
 foreach my $key (keys %db) {
-	my $gkey = global_key($db{$key}->{'ring'}, $db{$key}->{'global'});
+	my $gkey = db_key($db{$key}->{'ring'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 
 	die unless exists $db{$key}->{'start'};
 
@@ -478,14 +481,13 @@ my $key_count = scalar(keys %db);
 
 my %engine_timelines;
 
-sub sortEngine {
-	my $as = $db{$a}->{'global'};
-	my $bs = $db{$b}->{'global'};
+sub sortStart {
+	my $as = $db{$a}->{'start'};
+	my $bs = $db{$b}->{'start'};
 	my $val;
 
 	$val = $as <=> $bs;
-
-	die if $val == 0;
+	$val = $a cmp $b if $val == 0;
 
 	return $val;
 }
@@ -497,9 +499,7 @@ sub get_engine_timeline {
 	return $engine_timelines{$ring} if exists $engine_timelines{$ring};
 
 	@timeline = grep { $db{$_}->{'ring'} eq $ring } keys %db;
-	# FIXME seqno restart
-	@timeline = sort sortEngine @timeline;
-
+	@timeline = sort sortStart @timeline;
 	$engine_timelines{$ring} = \@timeline;
 
 	return \@timeline;
@@ -561,20 +561,10 @@ foreach my $gid (sort keys %rings) {
 			$db{$key}->{'no-notify'} = 1;
 		}
 		$db{$key}->{'end'} = $end;
+		$db{$key}->{'notify'} = $end if $db{$key}->{'notify'} > $end;
 	}
 }
 
-sub sortStart {
-	my $as = $db{$a}->{'start'};
-	my $bs = $db{$b}->{'start'};
-	my $val;
-
-	$val = $as <=> $bs;
-	$val = $a cmp $b if $val == 0;
-
-	return $val;
-}
-
 my $re_sort = 1;
 my @sorted_keys;
 
@@ -670,9 +660,13 @@ if ($correct_durations) {
 			next unless exists $db{$key}->{'no-end'};
 			last if $pos == $#{$timeline};
 
-			# Shift following request to start after the current one
+			# Shift following request to start after the current
+			# one, but only if that wouldn't make it zero duration,
+			# which would indicate notify arrived after context
+			# complete.
 			$next_key = ${$timeline}[$pos + 1];
-			if (exists $db{$key}->{'notify'}) {
+			if (exists $db{$key}->{'notify'} and
+			    $db{$key}->{'notify'} < $db{$key}->{'end'}) {
 				$db{$next_key}->{'engine-start'} = $db{$next_key}->{'start'};
 				$db{$next_key}->{'start'} = $db{$key}->{'notify'};
 				$re_sort = 1;
@@ -750,9 +744,9 @@ foreach my $gid (sort keys %rings) {
 	# Extract all GPU busy intervals and sort them.
 	foreach my $key (@sorted_keys) {
 		next unless $db{$key}->{'ring'} eq $ring;
+		die if $db{$key}->{'start'} > $db{$key}->{'end'};
 		push @s_, $db{$key}->{'start'};
 		push @e_, $db{$key}->{'end'};
-		die if $db{$key}->{'start'} > $db{$key}->{'end'};
 	}
 
 	die unless $#s_ == $#e_;
-- 
2.20.1

* [PATCH i-g-t 02/25] trace.pl: Ignore signaling on non i915 fences
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

gem_wsim uses the sw_fence timeline, which confuses the script, so ignore
fence signaling events from drivers other than i915.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 scripts/trace.pl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index b7bbabc79f68..930e502ad8eb 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -439,6 +439,7 @@ while (<>) {
 	} elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
 		my $gkey;
 
+		next unless $tp{'driver'} eq 'i915';
 		die unless exists $ctxengines{$tp{'context'}};
 
 		$gkey = db_key($ctxengines{$tp{'context'}}, $tp{'context'}, $tp{'seqno'});
-- 
2.20.1

* [PATCH i-g-t 03/25] headers: bump
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Catch up to drm-tip headers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 include/drm-uapi/amdgpu_drm.h  |  52 +++++++-
 include/drm-uapi/drm.h         |  36 ++++++
 include/drm-uapi/drm_mode.h    |   4 +-
 include/drm-uapi/i915_drm.h    | 209 ++++++++++++++++++++++++++++++++-
 include/drm-uapi/lima_drm.h    | 169 ++++++++++++++++++++++++++
 include/drm-uapi/msm_drm.h     |  14 +++
 include/drm-uapi/nouveau_drm.h |  51 ++++++++
 include/drm-uapi/v3d_drm.h     |  28 +++++
 8 files changed, 557 insertions(+), 6 deletions(-)
 create mode 100644 include/drm-uapi/lima_drm.h

diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h
index be84e43c1e19..4788730dbe78 100644
--- a/include/drm-uapi/amdgpu_drm.h
+++ b/include/drm-uapi/amdgpu_drm.h
@@ -210,6 +210,9 @@ union drm_amdgpu_bo_list {
 #define AMDGPU_CTX_QUERY2_FLAGS_VRAMLOST (1<<1)
 /* indicate some job from this context once cause gpu hang */
 #define AMDGPU_CTX_QUERY2_FLAGS_GUILTY   (1<<2)
+/* indicate some errors are detected by RAS */
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_CE   (1<<3)
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_UE   (1<<4)
 
 /* Context priority level */
 #define AMDGPU_CTX_PRIORITY_UNSET       -2048
@@ -272,13 +275,14 @@ union drm_amdgpu_vm {
 
 /* sched ioctl */
 #define AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE	1
+#define AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE	2
 
 struct drm_amdgpu_sched_in {
 	/* AMDGPU_SCHED_OP_* */
 	__u32	op;
 	__u32	fd;
 	__s32	priority;
-	__u32	flags;
+	__u32   ctx_id;
 };
 
 union drm_amdgpu_sched {
@@ -523,6 +527,9 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_SYNCOBJ_IN      0x04
 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT     0x05
 #define AMDGPU_CHUNK_ID_BO_HANDLES      0x06
+#define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES	0x07
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT    0x08
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
 
 struct drm_amdgpu_cs_chunk {
 	__u32		chunk_id;
@@ -565,6 +572,11 @@ union drm_amdgpu_cs {
  * caches (L2/vL1/sL1/I$). */
 #define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
 
+/* Set GDS_COMPUTE_MAX_WAVE_ID = DEFAULT before PACKET3_INDIRECT_BUFFER.
+ * This will reset wave ID counters for the IB.
+ */
+#define AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID (1 << 4)
+
 struct drm_amdgpu_cs_chunk_ib {
 	__u32 _pad;
 	/** AMDGPU_IB_FLAG_* */
@@ -598,6 +610,12 @@ struct drm_amdgpu_cs_chunk_sem {
 	__u32 handle;
 };
 
+struct drm_amdgpu_cs_chunk_syncobj {
+       __u32 handle;
+       __u32 flags;
+       __u64 point;
+};
+
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ	0
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD	1
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD	2
@@ -673,6 +691,7 @@ struct drm_amdgpu_cs_chunk_data {
 	#define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11
 	/* Subquery id: Query DMCU firmware version */
 	#define AMDGPU_INFO_FW_DMCU		0x12
+	#define AMDGPU_INFO_FW_TA		0x13
 /* number of bytes moved for TTM migration */
 #define AMDGPU_INFO_NUM_BYTES_MOVED		0x0f
 /* the used VRAM size */
@@ -726,6 +745,37 @@ struct drm_amdgpu_cs_chunk_data {
 /* Number of VRAM page faults on CPU access. */
 #define AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS	0x1E
 #define AMDGPU_INFO_VRAM_LOST_COUNTER		0x1F
+/* query ras mask of enabled features*/
+#define AMDGPU_INFO_RAS_ENABLED_FEATURES	0x20
+
+/* RAS MASK: UMC (VRAM) */
+#define AMDGPU_INFO_RAS_ENABLED_UMC			(1 << 0)
+/* RAS MASK: SDMA */
+#define AMDGPU_INFO_RAS_ENABLED_SDMA			(1 << 1)
+/* RAS MASK: GFX */
+#define AMDGPU_INFO_RAS_ENABLED_GFX			(1 << 2)
+/* RAS MASK: MMHUB */
+#define AMDGPU_INFO_RAS_ENABLED_MMHUB			(1 << 3)
+/* RAS MASK: ATHUB */
+#define AMDGPU_INFO_RAS_ENABLED_ATHUB			(1 << 4)
+/* RAS MASK: PCIE */
+#define AMDGPU_INFO_RAS_ENABLED_PCIE			(1 << 5)
+/* RAS MASK: HDP */
+#define AMDGPU_INFO_RAS_ENABLED_HDP			(1 << 6)
+/* RAS MASK: XGMI */
+#define AMDGPU_INFO_RAS_ENABLED_XGMI			(1 << 7)
+/* RAS MASK: DF */
+#define AMDGPU_INFO_RAS_ENABLED_DF			(1 << 8)
+/* RAS MASK: SMN */
+#define AMDGPU_INFO_RAS_ENABLED_SMN			(1 << 9)
+/* RAS MASK: SEM */
+#define AMDGPU_INFO_RAS_ENABLED_SEM			(1 << 10)
+/* RAS MASK: MP0 */
+#define AMDGPU_INFO_RAS_ENABLED_MP0			(1 << 11)
+/* RAS MASK: MP1 */
+#define AMDGPU_INFO_RAS_ENABLED_MP1			(1 << 12)
+/* RAS MASK: FUSE */
+#define AMDGPU_INFO_RAS_ENABLED_FUSE			(1 << 13)
 
 #define AMDGPU_INFO_MMR_SE_INDEX_SHIFT	0
 #define AMDGPU_INFO_MMR_SE_INDEX_MASK	0xff
diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index 85c685a2075e..c893f3b4a895 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -729,8 +729,18 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+struct drm_syncobj_transfer {
+	__u32 src_handle;
+	__u32 dst_handle;
+	__u64 src_point;
+	__u64 dst_point;
+	__u32 flags;
+	__u32 pad;
+};
+
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE (1 << 2) /* wait for time point to become available */
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
@@ -741,12 +751,33 @@ struct drm_syncobj_wait {
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_wait {
+	__u64 handles;
+	/* wait on specific timeline point for every handles*/
+	__u64 points;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
+
 struct drm_syncobj_array {
 	__u64 handles;
 	__u32 count_handles;
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_array {
+	__u64 handles;
+	__u64 points;
+	__u32 count_handles;
+	__u32 pad;
+};
+
+
 /* Query current scanout sequence number */
 struct drm_crtc_get_sequence {
 	__u32 crtc_id;		/* requested crtc_id */
@@ -903,6 +934,11 @@ extern "C" {
 #define DRM_IOCTL_MODE_GET_LEASE	DRM_IOWR(0xC8, struct drm_mode_get_lease)
 #define DRM_IOCTL_MODE_REVOKE_LEASE	DRM_IOWR(0xC9, struct drm_mode_revoke_lease)
 
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT	DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait)
+#define DRM_IOCTL_SYNCOBJ_QUERY		DRM_IOWR(0xCB, struct drm_syncobj_timeline_array)
+#define DRM_IOCTL_SYNCOBJ_TRANSFER	DRM_IOWR(0xCC, struct drm_syncobj_transfer)
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL	DRM_IOWR(0xCD, struct drm_syncobj_timeline_array)
+
 /**
  * Device specific ioctls should only be in their respective headers
  * The device specific ioctl range is from 0x40 to 0x9f.
diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h
index a439c2e67896..83cd1636b9be 100644
--- a/include/drm-uapi/drm_mode.h
+++ b/include/drm-uapi/drm_mode.h
@@ -33,7 +33,6 @@
 extern "C" {
 #endif
 
-#define DRM_DISPLAY_INFO_LEN	32
 #define DRM_CONNECTOR_NAME_LEN	32
 #define DRM_DISPLAY_MODE_LEN	32
 #define DRM_PROP_NAME_LEN	32
@@ -622,7 +621,8 @@ struct drm_color_ctm {
 
 struct drm_color_lut {
 	/*
-	 * Data is U0.16 fixed point format.
+	 * Values are mapped linearly to 0.0 - 1.0 range, with 0x0 == 0.0 and
+	 * 0xffff == 1.0.
 	 */
 	__u16 red;
 	__u16 green;
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index e01b3e1fd6d6..761517f15368 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -136,6 +136,8 @@ enum drm_i915_gem_engine_class {
 struct i915_engine_class_instance {
 	__u16 engine_class; /* see enum drm_i915_gem_engine_class */
 	__u16 engine_instance;
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL -2
 };
 
 /**
@@ -355,6 +357,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
 #define DRM_I915_QUERY			0x39
+#define DRM_I915_GEM_VM_CREATE		0x3a
+#define DRM_I915_GEM_VM_DESTROY		0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -415,6 +419,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -598,6 +604,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT	52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through use of explicit fence support.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1120,7 +1132,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/*
+ * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) represent
+ * a sync_file fd to wait upon (in a nonblocking manner) prior to executing
+ * the batch.
+ *
+ * Returns -EINVAL if the sync_file fd cannot be found.
+ */
+#define I915_EXEC_FENCE_SUBMIT		(1 << 20)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1464,8 +1485,9 @@ struct drm_i915_gem_context_create_ext {
 	__u32 ctx_id; /* output: id of new context*/
 	__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE	(1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-	(-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+	(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
 	__u64 extensions;
 };
 
@@ -1507,6 +1529,41 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE	0x8
+
+	/*
+	 * The id of the associated virtual memory address space (ppGTT) of
+	 * this context. Can be retrieved and passed to another context
+	 * (on the same fd) for both to use the same ppGTT and so share
+	 * address layouts, and avoid reloading the page tables on context
+	 * switches between themselves.
+	 *
+	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+	 */
+#define I915_CONTEXT_PARAM_VM		0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, instance).
+ * Use
+ *	engine_class: I915_ENGINE_CLASS_INVALID,
+ *	engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a zero
+ * sized argument, will revert back to default settings.
+ *
+ * See struct i915_context_param_engines.
+ *
+ * Extensions:
+ *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
+ *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
+ */
+#define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
@@ -1540,9 +1597,10 @@ struct drm_i915_gem_context_param_sseu {
 	struct i915_engine_class_instance engine;
 
 	/*
-	 * Unused for now. Must be cleared to zero.
+	 * Unknown flags must be cleared to zero.
 	 */
 	__u32 flags;
+#define I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX (1u << 0)
 
 	/*
 	 * Mask of slices to enable for the context. Valid values are a subset
@@ -1570,12 +1628,115 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * Into the I915_EXEC_DEFAULT slot [0], a virtual engine is created that when
+ * used will proxy the execbuffer request onto one of the set of engines
+ * in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+	struct i915_user_extension base;
+
+	__u16 engine_index;
+	__u16 num_siblings;
+	__u32 flags; /* all undefined flags must be zero */
+
+	__u64 mbz64; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name__, N__) struct { \
+	struct i915_user_extension base; \
+	__u16 engine_index; \
+	__u16 num_siblings; \
+	__u32 flags; \
+	__u64 mbz64; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+/*
+ * i915_context_engines_bond:
+ *
+ * Construct bonded pairs for execution within a virtual engine.
+ *
+ * All engines are equal, but some are more equal than others. Given
+ * the distribution of resources in the HW, it may be preferable to run
+ * a request on a given subset of engines in parallel to a request on a
+ * specific engine. We enable this selection of engines within a virtual
+ * engine by specifying bonding pairs, for any given master engine we will
+ * only execute on one of the corresponding siblings within the virtual engine.
+ *
+ * To execute a request in parallel on the master engine and a sibling requires
+ * coordination with a I915_EXEC_FENCE_SUBMIT.
+ */
+struct i915_context_engines_bond {
+	struct i915_user_extension base;
+
+	struct i915_engine_class_instance master;
+
+	__u16 virtual_index; /* index of virtual engine in ctx->engines[] */
+	__u16 num_bonds;
+
+	__u64 flags; /* all undefined flags must be zero */
+	__u64 mbz64[4]; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_BOND(name__, N__) struct { \
+	struct i915_user_extension base; \
+	struct i915_engine_class_instance master; \
+	__u16 virtual_index; \
+	__u16 num_bonds; \
+	__u64 flags; \
+	__u64 mbz64[4]; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+struct i915_context_param_engines {
+	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see i915_context_engines_load_balance */
+#define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
+	__u64 extensions; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
 struct drm_i915_gem_context_create_ext_setparam {
 #define I915_CONTEXT_CREATE_EXT_SETPARAM 0
 	struct i915_user_extension base;
 	struct drm_i915_gem_context_param param;
 };
 
+struct drm_i915_gem_context_create_ext_clone {
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
+	struct i915_user_extension base;
+	__u32 clone_id;
+	__u32 flags;
+#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
+#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
+#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
+#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
+#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
+#define I915_CONTEXT_CLONE_VM		(1u << 5)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+	__u64 rsvd;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
@@ -1821,6 +1982,7 @@ struct drm_i915_perf_oa_config {
 struct drm_i915_query_item {
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
+#define DRM_I915_QUERY_ENGINE_INFO	2
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -1919,6 +2081,47 @@ struct drm_i915_query_topology_info {
 	__u8 data[];
 };
 
+/**
+ * struct drm_i915_engine_info
+ *
+ * Describes one engine and its capabilities as known to the driver.
+ */
+struct drm_i915_engine_info {
+	/** Engine class and instance. */
+	struct i915_engine_class_instance engine;
+
+	/** Reserved field. */
+	__u32 rsvd0;
+
+	/** Engine flags. */
+	__u64 flags;
+
+	/** Capabilities of this engine. */
+	__u64 capabilities;
+#define I915_VIDEO_CLASS_CAPABILITY_HEVC		(1 << 0)
+#define I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC	(1 << 1)
+
+	/** Reserved fields. */
+	__u64 rsvd1[4];
+};
+
+/**
+ * struct drm_i915_query_engine_info
+ *
+ * Engine info query enumerates all engines known to the driver by filling in
+ * an array of struct drm_i915_engine_info structures.
+ */
+struct drm_i915_query_engine_info {
+	/** Number of struct drm_i915_engine_info structs following. */
+	__u32 num_engines;
+
+	/** MBZ */
+	__u32 rsvd[3];
+
+	/** Marker for drm_i915_engine_info structures. */
+	struct drm_i915_engine_info engines[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/drm-uapi/lima_drm.h b/include/drm-uapi/lima_drm.h
new file mode 100644
index 000000000000..95a00fb867e6
--- /dev/null
+++ b/include/drm-uapi/lima_drm.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */
+/* Copyright 2017-2018 Qiang Yu <yuq825@gmail.com> */
+
+#ifndef __LIMA_DRM_H__
+#define __LIMA_DRM_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+enum drm_lima_param_gpu_id {
+	DRM_LIMA_PARAM_GPU_ID_UNKNOWN,
+	DRM_LIMA_PARAM_GPU_ID_MALI400,
+	DRM_LIMA_PARAM_GPU_ID_MALI450,
+};
+
+enum drm_lima_param {
+	DRM_LIMA_PARAM_GPU_ID,
+	DRM_LIMA_PARAM_NUM_PP,
+	DRM_LIMA_PARAM_GP_VERSION,
+	DRM_LIMA_PARAM_PP_VERSION,
+};
+
+/**
+ * get various information about the GPU
+ */
+struct drm_lima_get_param {
+	__u32 param; /* in, value in enum drm_lima_param */
+	__u32 pad;   /* pad, must be zero */
+	__u64 value; /* out, parameter value */
+};
+
+/**
+ * create a buffer for use by the GPU
+ */
+struct drm_lima_gem_create {
+	__u32 size;    /* in, buffer size */
+	__u32 flags;   /* in, currently no flags, must be zero */
+	__u32 handle;  /* out, GEM buffer handle */
+	__u32 pad;     /* pad, must be zero */
+};
+
+/**
+ * get information about a buffer
+ */
+struct drm_lima_gem_info {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 va;      /* out, virtual address mapped into GPU MMU */
+	__u64 offset;  /* out, used to mmap this buffer to CPU */
+};
+
+#define LIMA_SUBMIT_BO_READ   0x01
+#define LIMA_SUBMIT_BO_WRITE  0x02
+
+/* buffer information used by one task */
+struct drm_lima_gem_submit_bo {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 flags;   /* in, buffer read/write by GPU */
+};
+
+#define LIMA_GP_FRAME_REG_NUM 6
+
+/* frame used to setup GP for each task */
+struct drm_lima_gp_frame {
+	__u32 frame[LIMA_GP_FRAME_REG_NUM];
+};
+
+#define LIMA_PP_FRAME_REG_NUM 23
+#define LIMA_PP_WB_REG_NUM 12
+
+/* frame used to setup mali400 GPU PP for each task */
+struct drm_lima_m400_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 plbu_array_address[4];
+	__u32 fragment_stack_address[4];
+};
+
+/* frame used to setup mali450 GPU PP for each task */
+struct drm_lima_m450_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 use_dlbu;
+	__u32 _pad;
+	union {
+		__u32 plbu_array_address[8];
+		__u32 dlbu_regs[4];
+	};
+	__u32 fragment_stack_address[8];
+};
+
+#define LIMA_PIPE_GP  0x00
+#define LIMA_PIPE_PP  0x01
+
+#define LIMA_SUBMIT_FLAG_EXPLICIT_FENCE (1 << 0)
+
+/**
+ * submit a task to GPU
+ *
+ * User can always merge multiple sync_files and drm_syncobjs
+ * into one drm_syncobj as in_sync[0], but we reserve
+ * in_sync[1] for another task's out_sync to avoid the
+ * export/import/merge pass when using explicit sync.
+ */
+struct drm_lima_gem_submit {
+	__u32 ctx;         /* in, context handle task is submitted to */
+	__u32 pipe;        /* in, which pipe to use, GP/PP */
+	__u32 nr_bos;      /* in, array length of bos field */
+	__u32 frame_size;  /* in, size of frame field */
+	__u64 bos;         /* in, array of drm_lima_gem_submit_bo */
+	__u64 frame;       /* in, GP/PP frame */
+	__u32 flags;       /* in, submit flags */
+	__u32 out_sync;    /* in, drm_syncobj handle used to wait for the task to finish after submission */
+	__u32 in_sync[2];  /* in, drm_syncobj handles to wait on before starting this task */
+};
+
+#define LIMA_GEM_WAIT_READ   0x01
+#define LIMA_GEM_WAIT_WRITE  0x02
+
+/**
+ * wait for pending GPU tasks on a buffer to finish
+ */
+struct drm_lima_gem_wait {
+	__u32 handle;      /* in, GEM buffer handle */
+	__u32 op;          /* in, CPU wants to read/write this buffer */
+	__s64 timeout_ns;  /* in, wait timeout in absolute time */
+};
+
+/**
+ * create a context
+ */
+struct drm_lima_ctx_create {
+	__u32 id;          /* out, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+/**
+ * free a context
+ */
+struct drm_lima_ctx_free {
+	__u32 id;          /* in, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+#define DRM_LIMA_GET_PARAM   0x00
+#define DRM_LIMA_GEM_CREATE  0x01
+#define DRM_LIMA_GEM_INFO    0x02
+#define DRM_LIMA_GEM_SUBMIT  0x03
+#define DRM_LIMA_GEM_WAIT    0x04
+#define DRM_LIMA_CTX_CREATE  0x05
+#define DRM_LIMA_CTX_FREE    0x06
+
+#define DRM_IOCTL_LIMA_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GET_PARAM, struct drm_lima_get_param)
+#define DRM_IOCTL_LIMA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_CREATE, struct drm_lima_gem_create)
+#define DRM_IOCTL_LIMA_GEM_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_INFO, struct drm_lima_gem_info)
+#define DRM_IOCTL_LIMA_GEM_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_SUBMIT, struct drm_lima_gem_submit)
+#define DRM_IOCTL_LIMA_GEM_WAIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_WAIT, struct drm_lima_gem_wait)
+#define DRM_IOCTL_LIMA_CTX_CREATE DRM_IOR(DRM_COMMAND_BASE + DRM_LIMA_CTX_CREATE, struct drm_lima_ctx_create)
+#define DRM_IOCTL_LIMA_CTX_FREE DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_CTX_FREE, struct drm_lima_ctx_free)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __LIMA_DRM_H__ */
diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h
index 91a16b333c69..0b85ed6a3710 100644
--- a/include/drm-uapi/msm_drm.h
+++ b/include/drm-uapi/msm_drm.h
@@ -74,6 +74,8 @@ struct drm_msm_timespec {
 #define MSM_PARAM_TIMESTAMP  0x05
 #define MSM_PARAM_GMEM_BASE  0x06
 #define MSM_PARAM_NR_RINGS   0x07
+#define MSM_PARAM_PP_PGTABLE 0x08  /* => 1 for per-process pagetables, else 0 */
+#define MSM_PARAM_FAULTS     0x09
 
 struct drm_msm_param {
 	__u32 pipe;           /* in, MSM_PIPE_x */
@@ -286,6 +288,16 @@ struct drm_msm_submitqueue {
 	__u32 id;      /* out, identifier */
 };
 
+#define MSM_SUBMITQUEUE_PARAM_FAULTS   0
+
+struct drm_msm_submitqueue_query {
+	__u64 data;
+	__u32 id;
+	__u32 param;
+	__u32 len;
+	__u32 pad;
+};
+
 #define DRM_MSM_GET_PARAM              0x00
 /* placeholder:
 #define DRM_MSM_SET_PARAM              0x01
@@ -302,6 +314,7 @@ struct drm_msm_submitqueue {
  */
 #define DRM_MSM_SUBMITQUEUE_NEW        0x0A
 #define DRM_MSM_SUBMITQUEUE_CLOSE      0x0B
+#define DRM_MSM_SUBMITQUEUE_QUERY      0x0C
 
 #define DRM_IOCTL_MSM_GET_PARAM        DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GET_PARAM, struct drm_msm_param)
 #define DRM_IOCTL_MSM_GEM_NEW          DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_NEW, struct drm_msm_gem_new)
@@ -313,6 +326,7 @@ struct drm_msm_submitqueue {
 #define DRM_IOCTL_MSM_GEM_MADVISE      DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_MADVISE, struct drm_msm_gem_madvise)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_NEW    DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_NEW, struct drm_msm_submitqueue)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_CLOSE  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_CLOSE, __u32)
+#define DRM_IOCTL_MSM_SUBMITQUEUE_QUERY  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_QUERY, struct drm_msm_submitqueue_query)
 
 #if defined(__cplusplus)
 }
diff --git a/include/drm-uapi/nouveau_drm.h b/include/drm-uapi/nouveau_drm.h
index 259588a4b61b..9459a6e3bc1f 100644
--- a/include/drm-uapi/nouveau_drm.h
+++ b/include/drm-uapi/nouveau_drm.h
@@ -133,12 +133,63 @@ struct drm_nouveau_gem_cpu_fini {
 #define DRM_NOUVEAU_NOTIFIEROBJ_ALLOC  0x05 /* deprecated */
 #define DRM_NOUVEAU_GPUOBJ_FREE        0x06 /* deprecated */
 #define DRM_NOUVEAU_NVIF               0x07
+#define DRM_NOUVEAU_SVM_INIT           0x08
+#define DRM_NOUVEAU_SVM_BIND           0x09
 #define DRM_NOUVEAU_GEM_NEW            0x40
 #define DRM_NOUVEAU_GEM_PUSHBUF        0x41
 #define DRM_NOUVEAU_GEM_CPU_PREP       0x42
 #define DRM_NOUVEAU_GEM_CPU_FINI       0x43
 #define DRM_NOUVEAU_GEM_INFO           0x44
 
+struct drm_nouveau_svm_init {
+	__u64 unmanaged_addr;
+	__u64 unmanaged_size;
+};
+
+struct drm_nouveau_svm_bind {
+	__u64 header;
+	__u64 va_start;
+	__u64 va_end;
+	__u64 npages;
+	__u64 stride;
+	__u64 result;
+	__u64 reserved0;
+	__u64 reserved1;
+};
+
+#define NOUVEAU_SVM_BIND_COMMAND_SHIFT          0
+#define NOUVEAU_SVM_BIND_COMMAND_BITS           8
+#define NOUVEAU_SVM_BIND_COMMAND_MASK           ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_PRIORITY_SHIFT         8
+#define NOUVEAU_SVM_BIND_PRIORITY_BITS          8
+#define NOUVEAU_SVM_BIND_PRIORITY_MASK          ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_TARGET_SHIFT           16
+#define NOUVEAU_SVM_BIND_TARGET_BITS            32
+#define NOUVEAU_SVM_BIND_TARGET_MASK            0xffffffff
+
+/*
+ * The mask below is used to validate the ioctl argument; userspace can also use
+ * it to make sure that no bits are set beyond known fields for a given kernel version.
+ */
+#define NOUVEAU_SVM_BIND_VALID_BITS     48
+#define NOUVEAU_SVM_BIND_VALID_MASK     ((1ULL << NOUVEAU_SVM_BIND_VALID_BITS) - 1)
+
+
+/*
+ * NOUVEAU_SVM_BIND_COMMAND__MIGRATE: synchronous migrate to target memory.
+ * result: number of pages successfully migrated to the target memory.
+ */
+#define NOUVEAU_SVM_BIND_COMMAND__MIGRATE               0
+
+/*
+ * NOUVEAU_SVM_BIND_TARGET__GPU_VRAM: target the GPU VRAM memory.
+ */
+#define NOUVEAU_SVM_BIND_TARGET__GPU_VRAM               (1UL << 31)
+
+
+#define DRM_IOCTL_NOUVEAU_SVM_INIT           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_INIT, struct drm_nouveau_svm_init)
+#define DRM_IOCTL_NOUVEAU_SVM_BIND           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_BIND, struct drm_nouveau_svm_bind)
+
 #define DRM_IOCTL_NOUVEAU_GEM_NEW            DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_NEW, struct drm_nouveau_gem_new)
 #define DRM_IOCTL_NOUVEAU_GEM_PUSHBUF        DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_PUSHBUF, struct drm_nouveau_gem_pushbuf)
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_PREP       DRM_IOW (DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_CPU_PREP, struct drm_nouveau_gem_cpu_prep)
diff --git a/include/drm-uapi/v3d_drm.h b/include/drm-uapi/v3d_drm.h
index ea70669d2138..58fbe48c91e9 100644
--- a/include/drm-uapi/v3d_drm.h
+++ b/include/drm-uapi/v3d_drm.h
@@ -37,6 +37,7 @@ extern "C" {
 #define DRM_V3D_GET_PARAM                         0x04
 #define DRM_V3D_GET_BO_OFFSET                     0x05
 #define DRM_V3D_SUBMIT_TFU                        0x06
+#define DRM_V3D_SUBMIT_CSD                        0x07
 
 #define DRM_IOCTL_V3D_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
 #define DRM_IOCTL_V3D_WAIT_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
@@ -45,6 +46,7 @@ extern "C" {
 #define DRM_IOCTL_V3D_GET_PARAM           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
 #define DRM_IOCTL_V3D_GET_BO_OFFSET       DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
 #define DRM_IOCTL_V3D_SUBMIT_TFU          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_TFU, struct drm_v3d_submit_tfu)
+#define DRM_IOCTL_V3D_SUBMIT_CSD          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CSD, struct drm_v3d_submit_csd)
 
 /**
  * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
@@ -190,6 +192,7 @@ enum drm_v3d_param {
 	DRM_V3D_PARAM_V3D_CORE0_IDENT1,
 	DRM_V3D_PARAM_V3D_CORE0_IDENT2,
 	DRM_V3D_PARAM_SUPPORTS_TFU,
+	DRM_V3D_PARAM_SUPPORTS_CSD,
 };
 
 struct drm_v3d_get_param {
@@ -230,6 +233,31 @@ struct drm_v3d_submit_tfu {
 	__u32 out_sync;
 };
 
+/* Submits a compute shader for dispatch.  This job will block on any
+ * previous compute shaders submitted on this fd, and any other
+ * synchronization must be performed with in_sync/out_sync.
+ */
+struct drm_v3d_submit_csd {
+	__u32 cfg[7];
+	__u32 coef[4];
+
+	/* Pointer to a u32 array of the BOs that are referenced by the job.
+	 */
+	__u64 bo_handles;
+
+	/* Number of BO handles passed in (size is that times 4). */
+	__u32 bo_handle_count;
+
+	/* sync object to block on before running the CSD job.  Each
+	 * CSD job will execute in the order submitted to its FD.
+	 * Synchronization against rendering/TFU jobs or CSD from
+	 * other fds requires using sync objects.
+	 */
+	__u32 in_sync;
+	/* Sync object to signal when the CSD job is done. */
+	__u32 out_sync;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.20.1
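
A note on the DRM_I915_QUERY_ENGINE_INFO structures added in the i915_drm.h
hunk above: the result is a small header followed by a variable number of
fixed-size engine records, so userspace has to size the buffer from the engine
count. The sketch below only illustrates that layout arithmetic and a walk over
the records. The ex_ names are hypothetical local mirrors of the uapi structs
so that it builds without the bumped header, and it does not issue the ioctl.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local mirrors of the uapi structs added by this patch. */
struct ex_engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

struct ex_engine_info {
	struct ex_engine_class_instance engine;
	uint32_t rsvd0;
	uint64_t flags;
	uint64_t capabilities;
	uint64_t rsvd1[4];
};

struct ex_query_engine_info {
	uint32_t num_engines;
	uint32_t rsvd[3];
	struct ex_engine_info engines[];
};

/* Mirrors I915_VIDEO_CLASS_CAPABILITY_HEVC from the hunk above. */
#define EX_VIDEO_CLASS_CAPABILITY_HEVC (1 << 0)

/* Bytes needed for a query result describing num_engines engines. */
static size_t ex_query_engine_info_size(uint32_t num_engines)
{
	return sizeof(struct ex_query_engine_info) +
	       (size_t)num_engines * sizeof(struct ex_engine_info);
}

/* Count the engines that advertise the HEVC capability bit. */
static unsigned int
ex_count_hevc_engines(const struct ex_query_engine_info *info)
{
	unsigned int count = 0;

	for (uint32_t i = 0; i < info->num_engines; i++)
		if (info->engines[i].capabilities &
		    EX_VIDEO_CLASS_CAPABILITY_HEVC)
			count++;

	return count;
}
```

In real code the buffer would be filled in by DRM_IOCTL_I915_QUERY with a
DRM_I915_QUERY_ENGINE_INFO item rather than by hand; the helpers above are
only meant to make the record layout concrete.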

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [igt-dev] [PATCH i-g-t 03/25] headers: bump
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Catch up to drm-tip headers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
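Note on the I915_EXEC_FENCE_SUBMIT hunk in i915_drm.h: the header derives
__I915_EXEC_UNKNOWN_FLAGS by negating the highest known flag shifted left by
one, which sets every bit above that flag. The sketch below reproduces the
arithmetic with local stand-in defines (the EX_ names are hypothetical) and a
validity check, presumably similar in spirit to what the driver does with the
mask; the driver code itself is not part of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the defines touched by this patch. */
#define EX_EXEC_FENCE_ARRAY	(1 << 19)
#define EX_EXEC_FENCE_SUBMIT	(1 << 20)

/*
 * -(x << 1) for a single-bit x is the two's complement value with all bits
 * above x set. The define is a signed int, so widening it to 64 bits
 * sign-extends and the mask also covers bits 32..63.
 */
#define EX_EXEC_UNKNOWN_FLAGS	(-(EX_EXEC_FENCE_SUBMIT << 1))

/* True if flags contains only bits at or below the highest known flag. */
static bool ex_flags_valid(uint64_t flags)
{
	return (flags & (uint64_t)EX_EXEC_UNKNOWN_FLAGS) == 0;
}
```

Moving the mask from I915_EXEC_FENCE_ARRAY to I915_EXEC_FENCE_SUBMIT is what
makes bit 20 an accepted flag; with the old definition it would have fallen
inside the unknown mask.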
 include/drm-uapi/amdgpu_drm.h  |  52 +++++++-
 include/drm-uapi/drm.h         |  36 ++++++
 include/drm-uapi/drm_mode.h    |   4 +-
 include/drm-uapi/i915_drm.h    | 209 ++++++++++++++++++++++++++++++++-
 include/drm-uapi/lima_drm.h    | 169 ++++++++++++++++++++++++++
 include/drm-uapi/msm_drm.h     |  14 +++
 include/drm-uapi/nouveau_drm.h |  51 ++++++++
 include/drm-uapi/v3d_drm.h     |  28 +++++
 8 files changed, 557 insertions(+), 6 deletions(-)
 create mode 100644 include/drm-uapi/lima_drm.h
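
Note on I915_CONTEXT_PARAM_ENGINES and the load balance extension in the
i915_drm.h hunks below: a rough sketch of how an engine map with a virtual
engine in slot 0 might be laid out. All ex_ names are hypothetical local
mirrors so that it builds without the bumped header. The layout of
i915_user_extension and the engine class values are assumptions taken from
i915_drm.h and are not quoted in this mail. The choice of two video engines
is purely illustrative, and no ioctl is issued.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout of struct i915_user_extension (not shown in this patch). */
struct ex_user_extension {
	uint64_t next_extension;
	uint32_t name;
	uint32_t flags;
	uint32_t rsvd[4];
};

struct ex_engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

#define EX_ENGINE_CLASS_VIDEO		2		/* assumed value */
#define EX_ENGINE_CLASS_INVALID		((uint16_t)-1)	/* assumed value */
#define EX_ENGINE_CLASS_INVALID_NONE	((uint16_t)-1)
#define EX_ENGINES_EXT_LOAD_BALANCE	0

/* Equivalent of I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name, 2). */
struct ex_load_balance2 {
	struct ex_user_extension base;
	uint16_t engine_index;
	uint16_t num_siblings;
	uint32_t flags;
	uint64_t mbz64;
	struct ex_engine_class_instance engines[2];
} __attribute__((packed));

/* Equivalent of I915_DEFINE_CONTEXT_PARAM_ENGINES(name, 3). */
struct ex_param_engines3 {
	uint64_t extensions;
	struct ex_engine_class_instance engines[3];
} __attribute__((packed));

/* Slot 0: placeholder for the virtual engine; slots 1 and 2: its siblings. */
static void ex_build_engine_map(struct ex_param_engines3 *map,
				struct ex_load_balance2 *lb)
{
	const struct ex_engine_class_instance vcs0 = { EX_ENGINE_CLASS_VIDEO, 0 };
	const struct ex_engine_class_instance vcs1 = { EX_ENGINE_CLASS_VIDEO, 1 };
	const struct ex_engine_class_instance none = {
		EX_ENGINE_CLASS_INVALID, EX_ENGINE_CLASS_INVALID_NONE
	};

	memset(lb, 0, sizeof(*lb));
	lb->base.name = EX_ENGINES_EXT_LOAD_BALANCE;
	lb->engine_index = 0;		/* fill the gap left in slot 0 */
	lb->num_siblings = 2;
	lb->engines[0] = vcs0;
	lb->engines[1] = vcs1;

	memset(map, 0, sizeof(*map));
	map->extensions = (uintptr_t)lb;	/* chain of one extension */
	map->engines[0] = none;
	map->engines[1] = vcs0;
	map->engines[2] = vcs1;
}
```

The filled-in map would then be handed to the context through the value and
size fields of struct drm_i915_gem_context_param with param set to
I915_CONTEXT_PARAM_ENGINES. After that, execbuf ring index 0 should select
the virtual engine and indices 1 and 2 the physical siblings.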

diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h
index be84e43c1e19..4788730dbe78 100644
--- a/include/drm-uapi/amdgpu_drm.h
+++ b/include/drm-uapi/amdgpu_drm.h
@@ -210,6 +210,9 @@ union drm_amdgpu_bo_list {
 #define AMDGPU_CTX_QUERY2_FLAGS_VRAMLOST (1<<1)
 /* indicate some job from this context once cause gpu hang */
 #define AMDGPU_CTX_QUERY2_FLAGS_GUILTY   (1<<2)
+/* indicate some errors are detected by RAS */
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_CE   (1<<3)
+#define AMDGPU_CTX_QUERY2_FLAGS_RAS_UE   (1<<4)
 
 /* Context priority level */
 #define AMDGPU_CTX_PRIORITY_UNSET       -2048
@@ -272,13 +275,14 @@ union drm_amdgpu_vm {
 
 /* sched ioctl */
 #define AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE	1
+#define AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE	2
 
 struct drm_amdgpu_sched_in {
 	/* AMDGPU_SCHED_OP_* */
 	__u32	op;
 	__u32	fd;
 	__s32	priority;
-	__u32	flags;
+	__u32   ctx_id;
 };
 
 union drm_amdgpu_sched {
@@ -523,6 +527,9 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_SYNCOBJ_IN      0x04
 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT     0x05
 #define AMDGPU_CHUNK_ID_BO_HANDLES      0x06
+#define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES	0x07
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT    0x08
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
 
 struct drm_amdgpu_cs_chunk {
 	__u32		chunk_id;
@@ -565,6 +572,11 @@ union drm_amdgpu_cs {
  * caches (L2/vL1/sL1/I$). */
 #define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
 
+/* Set GDS_COMPUTE_MAX_WAVE_ID = DEFAULT before PACKET3_INDIRECT_BUFFER.
+ * This will reset wave ID counters for the IB.
+ */
+#define AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID (1 << 4)
+
 struct drm_amdgpu_cs_chunk_ib {
 	__u32 _pad;
 	/** AMDGPU_IB_FLAG_* */
@@ -598,6 +610,12 @@ struct drm_amdgpu_cs_chunk_sem {
 	__u32 handle;
 };
 
+struct drm_amdgpu_cs_chunk_syncobj {
+       __u32 handle;
+       __u32 flags;
+       __u64 point;
+};
+
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ	0
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD	1
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD	2
@@ -673,6 +691,7 @@ struct drm_amdgpu_cs_chunk_data {
 	#define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11
 	/* Subquery id: Query DMCU firmware version */
 	#define AMDGPU_INFO_FW_DMCU		0x12
+	#define AMDGPU_INFO_FW_TA		0x13
 /* number of bytes moved for TTM migration */
 #define AMDGPU_INFO_NUM_BYTES_MOVED		0x0f
 /* the used VRAM size */
@@ -726,6 +745,37 @@ struct drm_amdgpu_cs_chunk_data {
 /* Number of VRAM page faults on CPU access. */
 #define AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS	0x1E
 #define AMDGPU_INFO_VRAM_LOST_COUNTER		0x1F
+/* query RAS mask of enabled features */
+#define AMDGPU_INFO_RAS_ENABLED_FEATURES	0x20
+
+/* RAS MASK: UMC (VRAM) */
+#define AMDGPU_INFO_RAS_ENABLED_UMC			(1 << 0)
+/* RAS MASK: SDMA */
+#define AMDGPU_INFO_RAS_ENABLED_SDMA			(1 << 1)
+/* RAS MASK: GFX */
+#define AMDGPU_INFO_RAS_ENABLED_GFX			(1 << 2)
+/* RAS MASK: MMHUB */
+#define AMDGPU_INFO_RAS_ENABLED_MMHUB			(1 << 3)
+/* RAS MASK: ATHUB */
+#define AMDGPU_INFO_RAS_ENABLED_ATHUB			(1 << 4)
+/* RAS MASK: PCIE */
+#define AMDGPU_INFO_RAS_ENABLED_PCIE			(1 << 5)
+/* RAS MASK: HDP */
+#define AMDGPU_INFO_RAS_ENABLED_HDP			(1 << 6)
+/* RAS MASK: XGMI */
+#define AMDGPU_INFO_RAS_ENABLED_XGMI			(1 << 7)
+/* RAS MASK: DF */
+#define AMDGPU_INFO_RAS_ENABLED_DF			(1 << 8)
+/* RAS MASK: SMN */
+#define AMDGPU_INFO_RAS_ENABLED_SMN			(1 << 9)
+/* RAS MASK: SEM */
+#define AMDGPU_INFO_RAS_ENABLED_SEM			(1 << 10)
+/* RAS MASK: MP0 */
+#define AMDGPU_INFO_RAS_ENABLED_MP0			(1 << 11)
+/* RAS MASK: MP1 */
+#define AMDGPU_INFO_RAS_ENABLED_MP1			(1 << 12)
+/* RAS MASK: FUSE */
+#define AMDGPU_INFO_RAS_ENABLED_FUSE			(1 << 13)
 
 #define AMDGPU_INFO_MMR_SE_INDEX_SHIFT	0
 #define AMDGPU_INFO_MMR_SE_INDEX_MASK	0xff
diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index 85c685a2075e..c893f3b4a895 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -729,8 +729,18 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+struct drm_syncobj_transfer {
+	__u32 src_handle;
+	__u32 dst_handle;
+	__u64 src_point;
+	__u64 dst_point;
+	__u32 flags;
+	__u32 pad;
+};
+
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE (1 << 2) /* wait for time point to become available */
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
@@ -741,12 +751,33 @@ struct drm_syncobj_wait {
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_wait {
+	__u64 handles;
+	/* wait on a specific timeline point for every handle */
+	__u64 points;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
+
 struct drm_syncobj_array {
 	__u64 handles;
 	__u32 count_handles;
 	__u32 pad;
 };
 
+struct drm_syncobj_timeline_array {
+	__u64 handles;
+	__u64 points;
+	__u32 count_handles;
+	__u32 pad;
+};
+
+
 /* Query current scanout sequence number */
 struct drm_crtc_get_sequence {
 	__u32 crtc_id;		/* requested crtc_id */
@@ -903,6 +934,11 @@ extern "C" {
 #define DRM_IOCTL_MODE_GET_LEASE	DRM_IOWR(0xC8, struct drm_mode_get_lease)
 #define DRM_IOCTL_MODE_REVOKE_LEASE	DRM_IOWR(0xC9, struct drm_mode_revoke_lease)
 
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT	DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait)
+#define DRM_IOCTL_SYNCOBJ_QUERY		DRM_IOWR(0xCB, struct drm_syncobj_timeline_array)
+#define DRM_IOCTL_SYNCOBJ_TRANSFER	DRM_IOWR(0xCC, struct drm_syncobj_transfer)
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL	DRM_IOWR(0xCD, struct drm_syncobj_timeline_array)
+
 /**
  * Device specific ioctls should only be in their respective headers
  * The device specific ioctl range is from 0x40 to 0x9f.
diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h
index a439c2e67896..83cd1636b9be 100644
--- a/include/drm-uapi/drm_mode.h
+++ b/include/drm-uapi/drm_mode.h
@@ -33,7 +33,6 @@
 extern "C" {
 #endif
 
-#define DRM_DISPLAY_INFO_LEN	32
 #define DRM_CONNECTOR_NAME_LEN	32
 #define DRM_DISPLAY_MODE_LEN	32
 #define DRM_PROP_NAME_LEN	32
@@ -622,7 +621,8 @@ struct drm_color_ctm {
 
 struct drm_color_lut {
 	/*
-	 * Data is U0.16 fixed point format.
+	 * Values are mapped linearly to 0.0 - 1.0 range, with 0x0 == 0.0 and
+	 * 0xffff == 1.0.
 	 */
 	__u16 red;
 	__u16 green;
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index e01b3e1fd6d6..761517f15368 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -136,6 +136,8 @@ enum drm_i915_gem_engine_class {
 struct i915_engine_class_instance {
 	__u16 engine_class; /* see enum drm_i915_gem_engine_class */
 	__u16 engine_instance;
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL -2
 };
 
 /**
@@ -355,6 +357,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
 #define DRM_I915_QUERY			0x39
+#define DRM_I915_GEM_VM_CREATE		0x3a
+#define DRM_I915_GEM_VM_DESTROY		0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -415,6 +419,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -598,6 +604,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT	52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through use of explicit fence support.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1120,7 +1132,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/*
+ * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) represent
+ * a sync_file fd to wait upon (in a nonblocking manner) prior to executing
+ * the batch.
+ *
+ * Returns -EINVAL if the sync_file fd cannot be found.
+ */
+#define I915_EXEC_FENCE_SUBMIT		(1 << 20)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1464,8 +1485,9 @@ struct drm_i915_gem_context_create_ext {
 	__u32 ctx_id; /* output: id of new context*/
 	__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE	(1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-	(-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+	(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
 	__u64 extensions;
 };
 
@@ -1507,6 +1529,41 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE	0x8
+
+	/*
+	 * The id of the associated virtual memory address space (ppGTT) of
+	 * this context. Can be retrieved and passed to another context
+	 * (on the same fd) for both to use the same ppGTT and so share
+	 * address layouts, and avoid reloading the page tables on context
+	 * switches between themselves.
+	 *
+	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+	 */
+#define I915_CONTEXT_PARAM_VM		0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, instance).
+ * Use
+ *	engine_class: I915_ENGINE_CLASS_INVALID,
+ *	engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a zero
+ * sized argument, will revert back to default settings.
+ *
+ * See struct i915_context_param_engines.
+ *
+ * Extensions:
+ *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
+ *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
+ */
+#define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
@@ -1540,9 +1597,10 @@ struct drm_i915_gem_context_param_sseu {
 	struct i915_engine_class_instance engine;
 
 	/*
-	 * Unused for now. Must be cleared to zero.
+	 * Unknown flags must be cleared to zero.
 	 */
 	__u32 flags;
+#define I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX (1u << 0)
 
 	/*
 	 * Mask of slices to enable for the context. Valid values are a subset
@@ -1570,12 +1628,115 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * Into the I915_EXEC_DEFAULT slot [0], a virtual engine is created that when
+ * used will proxy the execbuffer request onto one of the set of engines
+ * in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+	struct i915_user_extension base;
+
+	__u16 engine_index;
+	__u16 num_siblings;
+	__u32 flags; /* all undefined flags must be zero */
+
+	__u64 mbz64; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name__, N__) struct { \
+	struct i915_user_extension base; \
+	__u16 engine_index; \
+	__u16 num_siblings; \
+	__u32 flags; \
+	__u64 mbz64; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+/*
+ * i915_context_engines_bond:
+ *
+ * Constructed bonded pairs for execution within a virtual engine.
+ *
+ * All engines are equal, but some are more equal than others. Given
+ * the distribution of resources in the HW, it may be preferable to run
+ * a request on a given subset of engines in parallel to a request on a
+ * specific engine. We enable this selection of engines within a virtual
+ * engine by specifying bonding pairs, for any given master engine we will
+ * only execute on one of the corresponding siblings within the virtual engine.
+ *
+ * To execute a request in parallel on the master engine and a sibling requires
+ * coordination with a I915_EXEC_FENCE_SUBMIT.
+ */
+struct i915_context_engines_bond {
+	struct i915_user_extension base;
+
+	struct i915_engine_class_instance master;
+
+	__u16 virtual_index; /* index of virtual engine in ctx->engines[] */
+	__u16 num_bonds;
+
+	__u64 flags; /* all undefined flags must be zero */
+	__u64 mbz64[4]; /* reserved for future use; must be zero */
+
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_BOND(name__, N__) struct { \
+	struct i915_user_extension base; \
+	struct i915_engine_class_instance master; \
+	__u16 virtual_index; \
+	__u16 num_bonds; \
+	__u64 flags; \
+	__u64 mbz64[4]; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+struct i915_context_param_engines {
+	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see i915_context_engines_load_balance */
+#define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
+	struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
+	__u64 extensions; \
+	struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
 struct drm_i915_gem_context_create_ext_setparam {
 #define I915_CONTEXT_CREATE_EXT_SETPARAM 0
 	struct i915_user_extension base;
 	struct drm_i915_gem_context_param param;
 };
 
+struct drm_i915_gem_context_create_ext_clone {
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
+	struct i915_user_extension base;
+	__u32 clone_id;
+	__u32 flags;
+#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
+#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
+#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
+#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
+#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
+#define I915_CONTEXT_CLONE_VM		(1u << 5)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+	__u64 rsvd;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
@@ -1821,6 +1982,7 @@ struct drm_i915_perf_oa_config {
 struct drm_i915_query_item {
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
+#define DRM_I915_QUERY_ENGINE_INFO	2
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -1919,6 +2081,47 @@ struct drm_i915_query_topology_info {
 	__u8 data[];
 };
 
+/**
+ * struct drm_i915_engine_info
+ *
+ * Describes one engine and its capabilities as known to the driver.
+ */
+struct drm_i915_engine_info {
+	/** Engine class and instance. */
+	struct i915_engine_class_instance engine;
+
+	/** Reserved field. */
+	__u32 rsvd0;
+
+	/** Engine flags. */
+	__u64 flags;
+
+	/** Capabilities of this engine. */
+	__u64 capabilities;
+#define I915_VIDEO_CLASS_CAPABILITY_HEVC		(1 << 0)
+#define I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC	(1 << 1)
+
+	/** Reserved fields. */
+	__u64 rsvd1[4];
+};
+
+/**
+ * struct drm_i915_query_engine_info
+ *
+ * Engine info query enumerates all engines known to the driver by filling in
+ * an array of struct drm_i915_engine_info structures.
+ */
+struct drm_i915_query_engine_info {
+	/** Number of struct drm_i915_engine_info structs following. */
+	__u32 num_engines;
+
+	/** MBZ */
+	__u32 rsvd[3];
+
+	/** Marker for drm_i915_engine_info structures. */
+	struct drm_i915_engine_info engines[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/drm-uapi/lima_drm.h b/include/drm-uapi/lima_drm.h
new file mode 100644
index 000000000000..95a00fb867e6
--- /dev/null
+++ b/include/drm-uapi/lima_drm.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */
+/* Copyright 2017-2018 Qiang Yu <yuq825@gmail.com> */
+
+#ifndef __LIMA_DRM_H__
+#define __LIMA_DRM_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+enum drm_lima_param_gpu_id {
+	DRM_LIMA_PARAM_GPU_ID_UNKNOWN,
+	DRM_LIMA_PARAM_GPU_ID_MALI400,
+	DRM_LIMA_PARAM_GPU_ID_MALI450,
+};
+
+enum drm_lima_param {
+	DRM_LIMA_PARAM_GPU_ID,
+	DRM_LIMA_PARAM_NUM_PP,
+	DRM_LIMA_PARAM_GP_VERSION,
+	DRM_LIMA_PARAM_PP_VERSION,
+};
+
+/**
+ * get various information of the GPU
+ */
+struct drm_lima_get_param {
+	__u32 param; /* in, value in enum drm_lima_param */
+	__u32 pad;   /* pad, must be zero */
+	__u64 value; /* out, parameter value */
+};
+
+/**
+ * create a buffer for use by the GPU
+ */
+struct drm_lima_gem_create {
+	__u32 size;    /* in, buffer size */
+	__u32 flags;   /* in, currently no flags, must be zero */
+	__u32 handle;  /* out, GEM buffer handle */
+	__u32 pad;     /* pad, must be zero */
+};
+
+/**
+ * get information of a buffer
+ */
+struct drm_lima_gem_info {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 va;      /* out, virtual address mapped into GPU MMU */
+	__u64 offset;  /* out, used to mmap this buffer to CPU */
+};
+
+#define LIMA_SUBMIT_BO_READ   0x01
+#define LIMA_SUBMIT_BO_WRITE  0x02
+
+/* buffer information used by one task */
+struct drm_lima_gem_submit_bo {
+	__u32 handle;  /* in, GEM buffer handle */
+	__u32 flags;   /* in, buffer read/write by GPU */
+};
+
+#define LIMA_GP_FRAME_REG_NUM 6
+
+/* frame used to setup GP for each task */
+struct drm_lima_gp_frame {
+	__u32 frame[LIMA_GP_FRAME_REG_NUM];
+};
+
+#define LIMA_PP_FRAME_REG_NUM 23
+#define LIMA_PP_WB_REG_NUM 12
+
+/* frame used to setup mali400 GPU PP for each task */
+struct drm_lima_m400_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 plbu_array_address[4];
+	__u32 fragment_stack_address[4];
+};
+
+/* frame used to setup mali450 GPU PP for each task */
+struct drm_lima_m450_pp_frame {
+	__u32 frame[LIMA_PP_FRAME_REG_NUM];
+	__u32 num_pp;
+	__u32 wb[3 * LIMA_PP_WB_REG_NUM];
+	__u32 use_dlbu;
+	__u32 _pad;
+	union {
+		__u32 plbu_array_address[8];
+		__u32 dlbu_regs[4];
+	};
+	__u32 fragment_stack_address[8];
+};
+
+#define LIMA_PIPE_GP  0x00
+#define LIMA_PIPE_PP  0x01
+
+#define LIMA_SUBMIT_FLAG_EXPLICIT_FENCE (1 << 0)
+
+/**
+ * submit a task to GPU
+ *
+ * Users can always merge multiple sync_files and drm_syncobjs
+ * into one drm_syncobj as in_sync[0], but we reserve
+ * in_sync[1] for another task's out_sync to avoid the
+ * export/import/merge pass when using explicit sync.
+ */
+struct drm_lima_gem_submit {
+	__u32 ctx;         /* in, context handle task is submitted to */
+	__u32 pipe;        /* in, which pipe to use, GP/PP */
+	__u32 nr_bos;      /* in, array length of bos field */
+	__u32 frame_size;  /* in, size of frame field */
+	__u64 bos;         /* in, array of drm_lima_gem_submit_bo */
+	__u64 frame;       /* in, GP/PP frame */
+	__u32 flags;       /* in, submit flags */
+	__u32 out_sync;    /* in, drm_syncobj handle used to wait task finish after submission */
+	__u32 in_sync[2];  /* in, drm_syncobj handle used to wait before start this task */
+};
+
+#define LIMA_GEM_WAIT_READ   0x01
+#define LIMA_GEM_WAIT_WRITE  0x02
+
+/**
+ * wait pending GPU task finish of a buffer
+ */
+struct drm_lima_gem_wait {
+	__u32 handle;      /* in, GEM buffer handle */
+	__u32 op;          /* in, CPU want to read/write this buffer */
+	__s64 timeout_ns;  /* in, wait timeout in absolute time */
+};
+
+/**
+ * create a context
+ */
+struct drm_lima_ctx_create {
+	__u32 id;          /* out, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+/**
+ * free a context
+ */
+struct drm_lima_ctx_free {
+	__u32 id;          /* in, context handle */
+	__u32 _pad;        /* pad, must be zero */
+};
+
+#define DRM_LIMA_GET_PARAM   0x00
+#define DRM_LIMA_GEM_CREATE  0x01
+#define DRM_LIMA_GEM_INFO    0x02
+#define DRM_LIMA_GEM_SUBMIT  0x03
+#define DRM_LIMA_GEM_WAIT    0x04
+#define DRM_LIMA_CTX_CREATE  0x05
+#define DRM_LIMA_CTX_FREE    0x06
+
+#define DRM_IOCTL_LIMA_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GET_PARAM, struct drm_lima_get_param)
+#define DRM_IOCTL_LIMA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_CREATE, struct drm_lima_gem_create)
+#define DRM_IOCTL_LIMA_GEM_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_LIMA_GEM_INFO, struct drm_lima_gem_info)
+#define DRM_IOCTL_LIMA_GEM_SUBMIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_SUBMIT, struct drm_lima_gem_submit)
+#define DRM_IOCTL_LIMA_GEM_WAIT DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_GEM_WAIT, struct drm_lima_gem_wait)
+#define DRM_IOCTL_LIMA_CTX_CREATE DRM_IOR(DRM_COMMAND_BASE + DRM_LIMA_CTX_CREATE, struct drm_lima_ctx_create)
+#define DRM_IOCTL_LIMA_CTX_FREE DRM_IOW(DRM_COMMAND_BASE + DRM_LIMA_CTX_FREE, struct drm_lima_ctx_free)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __LIMA_DRM_H__ */
diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h
index 91a16b333c69..0b85ed6a3710 100644
--- a/include/drm-uapi/msm_drm.h
+++ b/include/drm-uapi/msm_drm.h
@@ -74,6 +74,8 @@ struct drm_msm_timespec {
 #define MSM_PARAM_TIMESTAMP  0x05
 #define MSM_PARAM_GMEM_BASE  0x06
 #define MSM_PARAM_NR_RINGS   0x07
+#define MSM_PARAM_PP_PGTABLE 0x08  /* => 1 for per-process pagetables, else 0 */
+#define MSM_PARAM_FAULTS     0x09
 
 struct drm_msm_param {
 	__u32 pipe;           /* in, MSM_PIPE_x */
@@ -286,6 +288,16 @@ struct drm_msm_submitqueue {
 	__u32 id;      /* out, identifier */
 };
 
+#define MSM_SUBMITQUEUE_PARAM_FAULTS   0
+
+struct drm_msm_submitqueue_query {
+	__u64 data;
+	__u32 id;
+	__u32 param;
+	__u32 len;
+	__u32 pad;
+};
+
 #define DRM_MSM_GET_PARAM              0x00
 /* placeholder:
 #define DRM_MSM_SET_PARAM              0x01
@@ -302,6 +314,7 @@ struct drm_msm_submitqueue {
  */
 #define DRM_MSM_SUBMITQUEUE_NEW        0x0A
 #define DRM_MSM_SUBMITQUEUE_CLOSE      0x0B
+#define DRM_MSM_SUBMITQUEUE_QUERY      0x0C
 
 #define DRM_IOCTL_MSM_GET_PARAM        DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GET_PARAM, struct drm_msm_param)
 #define DRM_IOCTL_MSM_GEM_NEW          DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_NEW, struct drm_msm_gem_new)
@@ -313,6 +326,7 @@ struct drm_msm_submitqueue {
 #define DRM_IOCTL_MSM_GEM_MADVISE      DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_MADVISE, struct drm_msm_gem_madvise)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_NEW    DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_NEW, struct drm_msm_submitqueue)
 #define DRM_IOCTL_MSM_SUBMITQUEUE_CLOSE  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_CLOSE, __u32)
+#define DRM_IOCTL_MSM_SUBMITQUEUE_QUERY  DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SUBMITQUEUE_QUERY, struct drm_msm_submitqueue_query)
 
 #if defined(__cplusplus)
 }
diff --git a/include/drm-uapi/nouveau_drm.h b/include/drm-uapi/nouveau_drm.h
index 259588a4b61b..9459a6e3bc1f 100644
--- a/include/drm-uapi/nouveau_drm.h
+++ b/include/drm-uapi/nouveau_drm.h
@@ -133,12 +133,63 @@ struct drm_nouveau_gem_cpu_fini {
 #define DRM_NOUVEAU_NOTIFIEROBJ_ALLOC  0x05 /* deprecated */
 #define DRM_NOUVEAU_GPUOBJ_FREE        0x06 /* deprecated */
 #define DRM_NOUVEAU_NVIF               0x07
+#define DRM_NOUVEAU_SVM_INIT           0x08
+#define DRM_NOUVEAU_SVM_BIND           0x09
 #define DRM_NOUVEAU_GEM_NEW            0x40
 #define DRM_NOUVEAU_GEM_PUSHBUF        0x41
 #define DRM_NOUVEAU_GEM_CPU_PREP       0x42
 #define DRM_NOUVEAU_GEM_CPU_FINI       0x43
 #define DRM_NOUVEAU_GEM_INFO           0x44
 
+struct drm_nouveau_svm_init {
+	__u64 unmanaged_addr;
+	__u64 unmanaged_size;
+};
+
+struct drm_nouveau_svm_bind {
+	__u64 header;
+	__u64 va_start;
+	__u64 va_end;
+	__u64 npages;
+	__u64 stride;
+	__u64 result;
+	__u64 reserved0;
+	__u64 reserved1;
+};
+
+#define NOUVEAU_SVM_BIND_COMMAND_SHIFT          0
+#define NOUVEAU_SVM_BIND_COMMAND_BITS           8
+#define NOUVEAU_SVM_BIND_COMMAND_MASK           ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_PRIORITY_SHIFT         8
+#define NOUVEAU_SVM_BIND_PRIORITY_BITS          8
+#define NOUVEAU_SVM_BIND_PRIORITY_MASK          ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_TARGET_SHIFT           16
+#define NOUVEAU_SVM_BIND_TARGET_BITS            32
+#define NOUVEAU_SVM_BIND_TARGET_MASK            0xffffffff
+
+/*
+ * Used to validate the ioctl argument; userspace can also use it to make
+ * sure that no bits are set beyond known fields for a given kernel version.
+ */
+#define NOUVEAU_SVM_BIND_VALID_BITS     48
+#define NOUVEAU_SVM_BIND_VALID_MASK     ((1ULL << NOUVEAU_SVM_BIND_VALID_BITS) - 1)
+
+
+/*
+ * NOUVEAU_SVM_BIND_COMMAND__MIGRATE: synchronous migration to target memory.
+ * result: number of pages successfully migrated to the target memory.
+ */
+#define NOUVEAU_SVM_BIND_COMMAND__MIGRATE               0
+
+/*
+ * NOUVEAU_SVM_BIND_TARGET__GPU_VRAM: target the GPU VRAM memory.
+ */
+#define NOUVEAU_SVM_BIND_TARGET__GPU_VRAM               (1UL << 31)
+
+
+#define DRM_IOCTL_NOUVEAU_SVM_INIT           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_INIT, struct drm_nouveau_svm_init)
+#define DRM_IOCTL_NOUVEAU_SVM_BIND           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_BIND, struct drm_nouveau_svm_bind)
+
 #define DRM_IOCTL_NOUVEAU_GEM_NEW            DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_NEW, struct drm_nouveau_gem_new)
 #define DRM_IOCTL_NOUVEAU_GEM_PUSHBUF        DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_PUSHBUF, struct drm_nouveau_gem_pushbuf)
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_PREP       DRM_IOW (DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_CPU_PREP, struct drm_nouveau_gem_cpu_prep)
diff --git a/include/drm-uapi/v3d_drm.h b/include/drm-uapi/v3d_drm.h
index ea70669d2138..58fbe48c91e9 100644
--- a/include/drm-uapi/v3d_drm.h
+++ b/include/drm-uapi/v3d_drm.h
@@ -37,6 +37,7 @@ extern "C" {
 #define DRM_V3D_GET_PARAM                         0x04
 #define DRM_V3D_GET_BO_OFFSET                     0x05
 #define DRM_V3D_SUBMIT_TFU                        0x06
+#define DRM_V3D_SUBMIT_CSD                        0x07
 
 #define DRM_IOCTL_V3D_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
 #define DRM_IOCTL_V3D_WAIT_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
@@ -45,6 +46,7 @@ extern "C" {
 #define DRM_IOCTL_V3D_GET_PARAM           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
 #define DRM_IOCTL_V3D_GET_BO_OFFSET       DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
 #define DRM_IOCTL_V3D_SUBMIT_TFU          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_TFU, struct drm_v3d_submit_tfu)
+#define DRM_IOCTL_V3D_SUBMIT_CSD          DRM_IOW(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CSD, struct drm_v3d_submit_csd)
 
 /**
  * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
@@ -190,6 +192,7 @@ enum drm_v3d_param {
 	DRM_V3D_PARAM_V3D_CORE0_IDENT1,
 	DRM_V3D_PARAM_V3D_CORE0_IDENT2,
 	DRM_V3D_PARAM_SUPPORTS_TFU,
+	DRM_V3D_PARAM_SUPPORTS_CSD,
 };
 
 struct drm_v3d_get_param {
@@ -230,6 +233,31 @@ struct drm_v3d_submit_tfu {
 	__u32 out_sync;
 };
 
+/* Submits a compute shader for dispatch.  This job will block on any
+ * previous compute shaders submitted on this fd, and any other
+ * synchronization must be performed with in_sync/out_sync.
+ */
+struct drm_v3d_submit_csd {
+	__u32 cfg[7];
+	__u32 coef[4];
+
+	/* Pointer to a u32 array of the BOs that are referenced by the job.
+	 */
+	__u64 bo_handles;
+
+	/* Number of BO handles passed in (size is that times 4). */
+	__u32 bo_handle_count;
+
+	/* sync object to block on before running the CSD job.  Each
+	 * CSD job will execute in the order submitted to its FD.
+	 * Synchronization against rendering/TFU jobs or CSD from
+	 * other fds requires using sync objects.
+	 */
+	__u32 in_sync;
+	/* Sync object to signal when the CSD job is done. */
+	__u32 out_sync;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.20.1
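
Reviewer note (illustration only, not part of the patch): the sketch below shows one way the engine map and load-balance extension described in the i915_drm.h comments might be laid out by userspace. It is a minimal sketch under stated assumptions: every sketch_* / SKETCH_* name is hypothetical and mirrors the layouts quoted in the diff so that the block compiles without the bumped header, and the numeric class values (video class 2, invalid 0xffff) are assumptions about the real enum. Slot 0 of the map is left as a gap, and the extension asks for a virtual engine over two video engines to be placed there.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the uAPI layouts quoted in the diff (hypothetical names). */
struct sketch_user_extension {
	uint64_t next_extension;
	uint32_t name;
	uint32_t flags;
	uint32_t rsvd[4];
};

struct sketch_engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

#define SKETCH_CLASS_VIDEO	2	/* assumed value of I915_ENGINE_CLASS_VIDEO */
#define SKETCH_CLASS_INVALID	0xffff	/* assumed (__u16)-1 */
#define SKETCH_INSTANCE_NONE	0xffff	/* assumed (__u16)-1 */
#define SKETCH_EXT_LOAD_BALANCE	0	/* I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE */

/* Both blocks live in one container so the extension pointer stays valid. */
struct sketch_vcs_map {
	/* Same field order as I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(balance, 2) */
	struct {
		struct sketch_user_extension base;
		uint16_t engine_index;
		uint16_t num_siblings;
		uint32_t flags;
		uint64_t mbz64;
		struct sketch_engine_class_instance engines[2];
	} __attribute__((packed)) balance;

	/* Same field order as I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1) */
	struct {
		uint64_t extensions;
		struct sketch_engine_class_instance engines[1];
	} __attribute__((packed)) engines;
};

/* Fill in a one-slot engine map whose slot 0 is a virtual engine over vcs0/vcs1. */
void sketch_build_vcs_map(struct sketch_vcs_map *map)
{
	assert(map);

	/* Reserved fields and unknown flags must be zero. */
	memset(map, 0, sizeof(*map));

	map->balance.base.name = SKETCH_EXT_LOAD_BALANCE;
	map->balance.engine_index = 0;	/* slot in engines[] to fill */
	map->balance.num_siblings = 2;
	map->balance.engines[0] =
		(struct sketch_engine_class_instance){ SKETCH_CLASS_VIDEO, 0 };
	map->balance.engines[1] =
		(struct sketch_engine_class_instance){ SKETCH_CLASS_VIDEO, 1 };

	/* Chain the extension; a next_extension of 0 terminates the chain. */
	map->engines.extensions = (uint64_t)(uintptr_t)&map->balance;

	/* Leave slot 0 as a gap for the virtual engine. */
	map->engines.engines[0] = (struct sketch_engine_class_instance){
		SKETCH_CLASS_INVALID, SKETCH_INSTANCE_NONE };
}
```

With the real header, the equivalent of map.engines would presumably be passed as the value of I915_CONTEXT_PARAM_ENGINES, with the parameter size set to the size of that block; that call is not shown because it needs a device.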

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [PATCH i-g-t 04/25] trace.pl: Virtual engine support
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add virtual/queue timelines to both stdout and HTML output.

A new timeline is created for each queue/virtual engine to display
associated requests in queued and runnable states. Once requests are
submitted to a real engine for execution, they show up on the physical
engine timeline.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 scripts/trace.pl | 238 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 208 insertions(+), 30 deletions(-)
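
Reviewer note (illustration only, not part of the patch): the per-queue statistics added here reduce to two sums per virtual engine. The time from request add to submit counts as queued, and the time from submit to the first appearance on a physical engine counts as runnable. The sketch below restates that arithmetic in C for readers less familiar with Perl. All sketch_* names are hypothetical, and timestamps are assumed to be in microseconds with queue <= submit <= start.

```c
#include <stddef.h>
#include <stdint.h>

/* One request on a virtual engine timeline (hypothetical record). */
struct sketch_vreq {
	uint64_t queue;		/* time of i915_request_add */
	uint64_t submit;	/* time of i915_request_submit */
	uint64_t start;		/* time of i915_request_in on a physical engine */
};

struct sketch_vstats {
	double queued_pct;	/* share of elapsed time spent queued */
	double runnable_pct;	/* share of elapsed time spent runnable */
};

/* Sum submit-delay and execute-delay over all requests of one queue. */
struct sketch_vstats sketch_virtual_stats(const struct sketch_vreq *reqs,
					  size_t count, uint64_t elapsed)
{
	struct sketch_vstats stats = { 0.0, 0.0 };
	uint64_t queued = 0, runnable = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		queued += reqs[i].submit - reqs[i].queue;	/* submit-delay */
		runnable += reqs[i].start - reqs[i].submit;	/* execute-delay */
	}

	/* Guard against an empty trace window. */
	if (elapsed) {
		stats.queued_pct = queued * 100.0 / elapsed;
		stats.runnable_pct = runnable * 100.0 / elapsed;
	}

	return stats;
}
```

Because requests can overlap, the sums are not clamped, so either percentage can exceed 100 on a busy queue. The same holds for the Perl accumulation in the diff.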

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 930e502ad8eb..873376d0e063 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,11 +27,16 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
+my (%db, %vdb, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
     %ctxtimelines, %ctxengines);
+my (%cids, %ctxmap);
+my $cid = 0;
+my %queues;
 my @freqs;
 
-my $max_items = 3000;
+use constant VENG => '255:254';
+
+my $max_requests = 1000;
 my $width_us = 32000;
 my $correct_durations = 0;
 my %ignore_ring;
@@ -181,21 +186,21 @@ sub arg_trace
 	return @_;
 }
 
-sub arg_max_items
+sub arg_max_requests
 {
 	my $val;
 
 	return unless scalar(@_);
 
-	if ($_[0] eq '--max-items' or $_[0] eq '-m') {
+	if ($_[0] eq '--max-requests' or $_[0] eq '-m') {
 		shift @_;
 		$val = shift @_;
-	} elsif ($_[0] =~ /--max-items=(\d+)/) {
+	} elsif ($_[0] =~ /--max-requests=(\d+)/) {
 		shift @_;
 		$val = $1;
 	}
 
-	$max_items = int($val) if defined $val;
+	$max_requests = int($val) if defined $val;
 
 	return @_;
 }
@@ -292,7 +297,7 @@ while (@args) {
 	@args = arg_avg_delay_stats(@args);
 	@args = arg_gpu_timeline(@args);
 	@args = arg_trace(@args);
-	@args = arg_max_items(@args);
+	@args = arg_max_requests(@args);
 	@args = arg_zoom_width(@args);
 	@args = arg_split_requests(@args);
 	@args = arg_ignore_ring(@args);
@@ -324,6 +329,13 @@ sub sanitize_ctx
 	}
 }
 
+sub is_veng
+{
+	my ($class, $instance) = split ':', shift;
+
+	return $instance eq '254';
+}
+
 # Main input loop - parse lines and build the internal representation of the
 # trace using a hash of requests and some auxilliary data structures.
 my $prev_freq = 0;
@@ -366,6 +378,7 @@ while (<>) {
 			$ctx = $tp{'ctx'};
 			$orig_ctx = $ctx;
 			$ctx = sanitize_ctx($ctx, $ring);
+			$ring = VENG if is_veng($ring);
 			$key = db_key($ring, $ctx, $seqno);
 		}
 	}
@@ -374,6 +387,7 @@ while (<>) {
 		my %rw;
 
 		next if exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$rw{'key'} = $key;
 		$rw{'ring'} = $ring;
@@ -382,9 +396,19 @@ while (<>) {
 		$rw{'start'} = $time;
 		$reqwait{$key} = \%rw;
 	} elsif ($tp_name eq 'i915:i915_request_wait_end:') {
-		next unless exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
-		$reqwait{$key}->{'end'} = $time;
+		if (exists $reqwait{$key}) {
+			$reqwait{$key}->{'end'} = $time;
+		} else { # Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+
+			die unless exists $reqwait{$vkey};
+
+			# If the wait started on the virtual engine, attribute
+			# it to it completely.
+			$reqwait{$vkey}->{'end'} = $time;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_add:') {
 		if (exists $queue{$key}) {
 			$ctxdb{$orig_ctx}++;
@@ -395,19 +419,52 @@ while (<>) {
 		}
 
 		$queue{$key} = $time;
+		if ($ring eq VENG and not exists $queues{$ctx}) {
+			$queues{$ctx} = 1;
+			$cids{$ctx} = $cid++;
+			$ctxmap{$cids{$ctx}} = $ctx;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_submit:') {
 		die if exists $submit{$key};
 		die unless exists $queue{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$submit{$key} = $time;
 	} elsif ($tp_name eq 'i915:i915_request_in:') {
+		my ($q, $s);
 		my %req;
 
 		# preemption
 		delete $db{$key} if exists $db{$key};
 
-		die unless exists $queue{$key};
-		die unless exists $submit{$key};
+		unless (exists $queue{$key}) {
+			# Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+			my %req;
+
+			die unless exists $queues{$ctx};
+			die unless exists $queue{$vkey};
+			die unless exists $submit{$vkey};
+
+			# Create separate request record on the queue timeline
+			$q = $queue{$vkey};
+			$s = $submit{$vkey};
+			$req{'queue'} = $q;
+			$req{'submit'} = $s;
+			$req{'start'} = $time;
+			$req{'end'} = $time;
+			$req{'ring'} = VENG;
+			$req{'seqno'} = $seqno;
+			$req{'ctx'} = $ctx;
+			$req{'name'} = $ctx . '/' . $seqno;
+			$req{'global'} = $tp{'global'};
+			$req{'port'} = $tp{'port'};
+
+			$vdb{$vkey} = \%req;
+		} else {
+			$q = $queue{$key};
+			$s = $submit{$key};
+		}
 
 		$req{'start'} = $time;
 		$req{'ring'} = $ring;
@@ -419,8 +476,9 @@ while (<>) {
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
-		$req{'queue'} = $queue{$key};
-		$req{'submit'} = $submit{$key};
+		$req{'queue'} = $q;
+		$req{'submit'} = $s;
+		$req{'virtual'} = 1 if exists $queues{$ctx};
 		$rings{$ring} = $gid++ unless exists $rings{$ring};
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
@@ -715,8 +773,10 @@ foreach my $key (@sorted_keys) {
 
 	$running{$ring} += $end - $start if $correct_durations or
 					    not exists $db{$key}->{'no-end'};
-	$runnable{$ring} += $db{$key}->{'execute-delay'};
-	$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	unless (exists $db{$key}->{'virtual'}) {
+		$runnable{$ring} += $db{$key}->{'execute-delay'};
+		$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	}
 
 	$batch_count{$ring}++;
 
@@ -835,6 +895,12 @@ foreach my $key (keys %reqwait) {
 	$reqw{$reqwait{$key}->{'ring'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
 }
 
+# Add up all request waits per virtual engine
+my %vreqw;
+foreach my $key (keys %reqwait) {
+	$vreqw{$reqwait{$key}->{'ctx'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
+}
+
 say sprintf('GPU: %.2f%% idle, %.2f%% busy',
 	     $flat_busy{'gpu-idle'}, $flat_busy{'gpu-busy'}) unless $html;
 
@@ -956,18 +1022,24 @@ ENDHTML
 sub html_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $name;
 
-	$name = 'Ring' . $group;
+	$name = $veng ? 'Virtual' : 'Ring';
+	$name .= $group;
 	$name .= '<br><small><br>';
-	$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
-	$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
+		$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	}
 	$name .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable<br>';
 	$name .= sprintf('%.2f', $stats->{'queued'}) . '% queued<br><br>';
 	$name .= sprintf('%.2f', $stats->{'wait'}) . '% wait<br><br>';
 	$name .= $stats->{'count'} . ' batches<br>';
-	$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
-	$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
+		$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	}
 	$name .= '</small>';
 
 	print "\t{id: $id, content: '$name'},\n";
@@ -976,17 +1048,24 @@ sub html_stats
 sub stdio_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $str;
 
-	$str = 'Ring' . $group . ': ';
+	$str = $veng ? 'Virtual' : 'Ring';
+	$str .= $group . ': ';
 	$str .= $stats->{'count'} . ' batches, ';
-	$str .= sprintf('%.2f (%.2f) avg batch us, ', $stats->{'avg'}, $stats->{'total-avg'});
-	$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
-	$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	unless ($veng) {
+		$str .= sprintf('%.2f (%.2f) avg batch us, ',
+				$stats->{'avg'}, $stats->{'total-avg'});
+		$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
+		$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	}
+
 	$str .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable, ';
 	$str .= sprintf('%.2f', $stats->{'queued'}) . '% queued, ';
 	$str .= sprintf('%.2f', $stats->{'wait'}) . '% wait';
-	if ($avg_delay_stats) {
+
+	if ($avg_delay_stats and not $veng) {
 		$str .= ', submit/execute/save-avg=(';
 		$str .= sprintf('%.2f/%.2f/%.2f)', $stats->{'submit'}, $stats->{'execute'}, $stats->{'save'});
 	}
@@ -1008,8 +1087,16 @@ foreach my $group (sort keys %rings) {
 
 	$stats{'idle'} = (1.0 - $flat_busy{$ring} / $elapsed) * 100.0;
 	$stats{'busy'} = $running{$ring} / $elapsed * 100.0;
-	$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
-	$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	if (exists $runnable{$ring}) {
+		$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $queued{$ring}) {
+		$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
 	$reqw{$ring} = 0 unless exists $reqw{$ring};
 	$stats{'wait'} = $reqw{$ring} / $elapsed * 100.0;
 	$stats{'count'} = $batch_count{$ring};
@@ -1026,6 +1113,59 @@ foreach my $group (sort keys %rings) {
 	}
 }
 
+sub sortVQueue {
+	my $as = $vdb{$a}->{'queue'};
+	my $bs = $vdb{$b}->{'queue'};
+	my $val;
+
+	$val = $as <=> $bs;
+	$val = $a cmp $b if $val == 0;
+
+	return $val;
+}
+
+my @sorted_vkeys = sort sortVQueue keys %vdb;
+my (%vqueued, %vrunnable);
+
+foreach my $key (@sorted_vkeys) {
+	my $ctx = $vdb{$key}->{'ctx'};
+
+	$vdb{$key}->{'submit-delay'} = $vdb{$key}->{'submit'} - $vdb{$key}->{'queue'};
+	$vdb{$key}->{'execute-delay'} = $vdb{$key}->{'start'} - $vdb{$key}->{'submit'};
+
+	$vqueued{$ctx} += $vdb{$key}->{'submit-delay'};
+	$vrunnable{$ctx} += $vdb{$key}->{'execute-delay'};
+}
+
+my $veng_id = $engine_start_id + scalar(keys %rings);
+
+foreach my $cid (sort keys %ctxmap) {
+	my $ctx = $ctxmap{$cid};
+	my $elapsed = $last_ts - $first_ts;
+	my %stats;
+
+	$stats{'virtual'} = 1;
+	if (exists $vrunnable{$ctx}) {
+		$stats{'runnable'} = $vrunnable{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $vqueued{$ctx}) {
+		$stats{'queued'} = $vqueued{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
+	$vreqw{$ctx} = 0 unless exists $vreqw{$ctx};
+	$stats{'wait'} = $vreqw{$ctx} / $elapsed * 100.0;
+	$stats{'count'} = scalar(grep {$ctx == $vdb{$_}->{'ctx'}} keys %vdb);
+
+	if ($html) {
+		html_stats(\%stats, $cid, $veng_id++);
+	} else {
+		stdio_stats(\%stats, $cid, $veng_id++);
+	}
+}
+
 exit 0 unless $html;
 
 print <<ENDHTML;
@@ -1129,6 +1269,7 @@ sub box_style
 }
 
 my $i = 0;
+my $req = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
@@ -1142,7 +1283,7 @@ foreach my $key (sort sortQueue keys %db) {
 	my $skey;
 
 	# submit to execute
-	unless (exists $skip_box{'queue'}) {
+	unless (exists $skip_box{'queue'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
 		$style = box_style($ctx, 'queue');
 		$content = "$name<br>$db{$key}->{'submit-delay'}us <small>($db{$key}->{'execute-delay'}us)</small>";
@@ -1153,7 +1294,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	# execute to start
 	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
-	unless (exists $skip_box{'ready'}) {
+	unless (exists $skip_box{'ready'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 'ready');
 		$content = "<small>$name<br>$db{$key}->{'execute-delay'}us</small>";
@@ -1194,7 +1335,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	$last_ts = $end;
 
-	last if $i > $max_items;
+	last if ++$req > $max_requests;
 }
 
 push @freqs, [$prev_freq_ts, $last_ts, $prev_freq] if $prev_freq;
@@ -1227,6 +1368,43 @@ if ($gpu_timeline) {
 	}
 }
 
+$req = 0;
+$veng_id = $engine_start_id + scalar(keys %rings);
+foreach my $key (@sorted_vkeys) {
+	my ($name, $ctx, $seqno) = ($vdb{$key}->{'name'}, $vdb{$key}->{'ctx'}, $vdb{$key}->{'seqno'});
+	my $queue = $vdb{$key}->{'queue'};
+	my $submit = $vdb{$key}->{'submit'};
+	my $engine_start = $vdb{$key}->{'engine-start'};
+	my ($content, $style, $startend, $skey);
+	my $group = $veng_id + $cids{$ctx};
+	my $subgroup = $ctx - $min_ctx;
+	my $type = ' type: \'range\',';
+	my $duration;
+
+	# submit to execute
+	unless (exists $skip_box{'queue'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
+		$style = box_style($ctx, 'queue');
+		$content = "$name<br>$vdb{$key}->{'submit-delay'}us <small>($vdb{$key}->{'execute-delay'}us)</small>";
+		$startend = 'start: ' . $queue . ', end: ' . $submit;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	# execute to start
+	$engine_start = $vdb{$key}->{'start'} unless defined $engine_start;
+	unless (exists $skip_box{'ready'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
+		$style = box_style($ctx, 'ready');
+		$content = "<small>$name<br>$vdb{$key}->{'execute-delay'}us</small>";
+		$startend = 'start: ' . $submit . ', end: ' . $engine_start;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	last if ++$req > $max_requests;
+}
+
 my $end_ts = $first_ts + $width_us;
 $first_ts = $first_ts;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [igt-dev] [PATCH i-g-t 04/25] trace.pl: Virtual engine support
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add virtual/queue timelines to both stdout and HTML output.

A new timeline is created for each queue/virtual engine to display
associated requests in queued and runnable states. Once requests are
submitted to a real engine for execution, they show up on the physical
engine timeline.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
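For readers less familiar with Perl, here is a minimal C sketch of the two
ideas this patch adds to trace.pl: recognising the virtual engine by its
instance number (254, as in the VENG constant) and deriving the per-request
queued and runnable times shown on the virtual timeline. The struct vreq type
and the helper names other than is_veng are hypothetical and exist only for
illustration; trace.pl itself keeps these values in Perl hashes.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical record mirroring the 'queue', 'submit' and 'start' fields. */
struct vreq {
	double queue;	/* i915_request_add timestamp, in us */
	double submit;	/* i915_request_submit timestamp, in us */
	double start;	/* i915_request_in timestamp, in us */
};

/* True if a "class:instance" string names the virtual engine (instance 254). */
static bool is_veng(const char *ring)
{
	unsigned int engine_class, instance;

	if (sscanf(ring, "%u:%u", &engine_class, &instance) != 2)
		return false;

	return instance == 254;
}

/* Time spent queued: from request add until submit. */
static double submit_delay(const struct vreq *r)
{
	return r->submit - r->queue;
}

/* Time spent runnable: from submit until a physical engine starts it. */
static double execute_delay(const struct vreq *r)
{
	return r->start - r->submit;
}
```

In the script these two delays are summed per context and divided by the
elapsed trace time to produce the "% queued" and "% runnable" figures.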
 scripts/trace.pl | 238 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 208 insertions(+), 30 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 930e502ad8eb..873376d0e063 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -27,11 +27,16 @@ use warnings;
 use 5.010;
 
 my $gid = 0;
-my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
+my (%db, %vdb, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait,
     %ctxtimelines, %ctxengines);
+my (%cids, %ctxmap);
+my $cid = 0;
+my %queues;
 my @freqs;
 
-my $max_items = 3000;
+use constant VENG => '255:254';
+
+my $max_requests = 1000;
 my $width_us = 32000;
 my $correct_durations = 0;
 my %ignore_ring;
@@ -181,21 +186,21 @@ sub arg_trace
 	return @_;
 }
 
-sub arg_max_items
+sub arg_max_requests
 {
 	my $val;
 
 	return unless scalar(@_);
 
-	if ($_[0] eq '--max-items' or $_[0] eq '-m') {
+	if ($_[0] eq '--max-requests' or $_[0] eq '-m') {
 		shift @_;
 		$val = shift @_;
-	} elsif ($_[0] =~ /--max-items=(\d+)/) {
+	} elsif ($_[0] =~ /--max-requests=(\d+)/) {
 		shift @_;
 		$val = $1;
 	}
 
-	$max_items = int($val) if defined $val;
+	$max_requests = int($val) if defined $val;
 
 	return @_;
 }
@@ -292,7 +297,7 @@ while (@args) {
 	@args = arg_avg_delay_stats(@args);
 	@args = arg_gpu_timeline(@args);
 	@args = arg_trace(@args);
-	@args = arg_max_items(@args);
+	@args = arg_max_requests(@args);
 	@args = arg_zoom_width(@args);
 	@args = arg_split_requests(@args);
 	@args = arg_ignore_ring(@args);
@@ -324,6 +329,13 @@ sub sanitize_ctx
 	}
 }
 
+sub is_veng
+{
+	my ($class, $instance) = split ':', shift;
+
+	return $instance eq '254';
+}
+
 # Main input loop - parse lines and build the internal representation of the
 # trace using a hash of requests and some auxilliary data structures.
 my $prev_freq = 0;
@@ -366,6 +378,7 @@ while (<>) {
 			$ctx = $tp{'ctx'};
 			$orig_ctx = $ctx;
 			$ctx = sanitize_ctx($ctx, $ring);
+			$ring = VENG if is_veng($ring);
 			$key = db_key($ring, $ctx, $seqno);
 		}
 	}
@@ -374,6 +387,7 @@ while (<>) {
 		my %rw;
 
 		next if exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$rw{'key'} = $key;
 		$rw{'ring'} = $ring;
@@ -382,9 +396,19 @@ while (<>) {
 		$rw{'start'} = $time;
 		$reqwait{$key} = \%rw;
 	} elsif ($tp_name eq 'i915:i915_request_wait_end:') {
-		next unless exists $reqwait{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
-		$reqwait{$key}->{'end'} = $time;
+		if (exists $reqwait{$key}) {
+			$reqwait{$key}->{'end'} = $time;
+		} else { # Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+
+			die unless exists $reqwait{$vkey};
+
+			# If the wait started on the virtual engine, attribute
+			# it to it completely.
+			$reqwait{$vkey}->{'end'} = $time;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_add:') {
 		if (exists $queue{$key}) {
 			$ctxdb{$orig_ctx}++;
@@ -395,19 +419,52 @@ while (<>) {
 		}
 
 		$queue{$key} = $time;
+		if ($ring eq VENG and not exists $queues{$ctx}) {
+			$queues{$ctx} = 1 ;
+			$cids{$ctx} = $cid++;
+			$ctxmap{$cids{$ctx}} = $ctx;
+		}
 	} elsif ($tp_name eq 'i915:i915_request_submit:') {
 		die if exists $submit{$key};
 		die unless exists $queue{$key};
+		die if $ring eq VENG and not exists $queues{$ctx};
 
 		$submit{$key} = $time;
 	} elsif ($tp_name eq 'i915:i915_request_in:') {
+		my ($q, $s);
 		my %req;
 
 		# preemption
 		delete $db{$key} if exists $db{$key};
 
-		die unless exists $queue{$key};
-		die unless exists $submit{$key};
+		unless (exists $queue{$key}) {
+			# Virtual engine
+			my $vkey = db_key(VENG, $ctx, $seqno);
+			my %req;
+
+			die unless exists $queues{$ctx};
+			die unless exists $queue{$vkey};
+			die unless exists $submit{$vkey};
+
+			# Create separate request record on the queue timeline
+			$q = $queue{$vkey};
+			$s = $submit{$vkey};
+			$req{'queue'} = $q;
+			$req{'submit'} = $s;
+			$req{'start'} = $time;
+			$req{'end'} = $time;
+			$req{'ring'} = VENG;
+			$req{'seqno'} = $seqno;
+			$req{'ctx'} = $ctx;
+			$req{'name'} = $ctx . '/' . $seqno;
+			$req{'global'} = $tp{'global'};
+			$req{'port'} = $tp{'port'};
+
+			$vdb{$vkey} = \%req;
+		} else {
+			$q = $queue{$key};
+			$s = $submit{$key};
+		}
 
 		$req{'start'} = $time;
 		$req{'ring'} = $ring;
@@ -419,8 +476,9 @@ while (<>) {
 		$req{'name'} = $ctx . '/' . $seqno;
 		$req{'global'} = $tp{'global'};
 		$req{'port'} = $tp{'port'};
-		$req{'queue'} = $queue{$key};
-		$req{'submit'} = $submit{$key};
+		$req{'queue'} = $q;
+		$req{'submit'} = $s;
+		$req{'virtual'} = 1 if exists $queues{$ctx};
 		$rings{$ring} = $gid++ unless exists $rings{$ring};
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
@@ -715,8 +773,10 @@ foreach my $key (@sorted_keys) {
 
 	$running{$ring} += $end - $start if $correct_durations or
 					    not exists $db{$key}->{'no-end'};
-	$runnable{$ring} += $db{$key}->{'execute-delay'};
-	$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	unless (exists $db{$key}->{'virtual'}) {
+		$runnable{$ring} += $db{$key}->{'execute-delay'};
+		$queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'};
+	}
 
 	$batch_count{$ring}++;
 
@@ -835,6 +895,12 @@ foreach my $key (keys %reqwait) {
 	$reqw{$reqwait{$key}->{'ring'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
 }
 
+# Add up all request waits per virtual engine
+my %vreqw;
+foreach my $key (keys %reqwait) {
+	$vreqw{$reqwait{$key}->{'ctx'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'};
+}
+
 say sprintf('GPU: %.2f%% idle, %.2f%% busy',
 	     $flat_busy{'gpu-idle'}, $flat_busy{'gpu-busy'}) unless $html;
 
@@ -956,18 +1022,24 @@ ENDHTML
 sub html_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $name;
 
-	$name = 'Ring' . $group;
+	$name = $veng ? 'Virtual' : 'Ring';
+	$name .= $group;
 	$name .= '<br><small><br>';
-	$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
-	$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>';
+		$name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>';
+	}
 	$name .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable<br>';
 	$name .= sprintf('%.2f', $stats->{'queued'}) . '% queued<br><br>';
 	$name .= sprintf('%.2f', $stats->{'wait'}) . '% wait<br><br>';
 	$name .= $stats->{'count'} . ' batches<br>';
-	$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
-	$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	unless ($veng) {
+		$name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>';
+		$name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>';
+	}
 	$name .= '</small>';
 
 	print "\t{id: $id, content: '$name'},\n";
@@ -976,17 +1048,24 @@ sub html_stats
 sub stdio_stats
 {
 	my ($stats, $group, $id) = @_;
+	my $veng = exists $stats->{'virtual'} ? 1 : 0;
 	my $str;
 
-	$str = 'Ring' . $group . ': ';
+	$str = $veng ? 'Virtual' : 'Ring';
+	$str .= $group . ': ';
 	$str .= $stats->{'count'} . ' batches, ';
-	$str .= sprintf('%.2f (%.2f) avg batch us, ', $stats->{'avg'}, $stats->{'total-avg'});
-	$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
-	$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	unless ($veng) {
+		$str .= sprintf('%.2f (%.2f) avg batch us, ',
+				$stats->{'avg'}, $stats->{'total-avg'});
+		$str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, ';
+		$str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, ';
+	}
+
 	$str .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable, ';
 	$str .= sprintf('%.2f', $stats->{'queued'}) . '% queued, ';
 	$str .= sprintf('%.2f', $stats->{'wait'}) . '% wait';
-	if ($avg_delay_stats) {
+
+	if ($avg_delay_stats and not $veng) {
 		$str .= ', submit/execute/save-avg=(';
 		$str .= sprintf('%.2f/%.2f/%.2f)', $stats->{'submit'}, $stats->{'execute'}, $stats->{'save'});
 	}
@@ -1008,8 +1087,16 @@ foreach my $group (sort keys %rings) {
 
 	$stats{'idle'} = (1.0 - $flat_busy{$ring} / $elapsed) * 100.0;
 	$stats{'busy'} = $running{$ring} / $elapsed * 100.0;
-	$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
-	$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	if (exists $runnable{$ring}) {
+		$stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $queued{$ring}) {
+		$stats{'queued'} = $queued{$ring} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
 	$reqw{$ring} = 0 unless exists $reqw{$ring};
 	$stats{'wait'} = $reqw{$ring} / $elapsed * 100.0;
 	$stats{'count'} = $batch_count{$ring};
@@ -1026,6 +1113,59 @@ foreach my $group (sort keys %rings) {
 	}
 }
 
+sub sortVQueue {
+	my $as = $vdb{$a}->{'queue'};
+	my $bs = $vdb{$b}->{'queue'};
+	my $val;
+
+	$val = $as <=> $bs;
+	$val = $a cmp $b if $val == 0;
+
+	return $val;
+}
+
+my @sorted_vkeys = sort sortVQueue keys %vdb;
+my (%vqueued, %vrunnable);
+
+foreach my $key (@sorted_vkeys) {
+	my $ctx = $vdb{$key}->{'ctx'};
+
+	$vdb{$key}->{'submit-delay'} = $vdb{$key}->{'submit'} - $vdb{$key}->{'queue'};
+	$vdb{$key}->{'execute-delay'} = $vdb{$key}->{'start'} - $vdb{$key}->{'submit'};
+
+	$vqueued{$ctx} += $vdb{$key}->{'submit-delay'};
+	$vrunnable{$ctx} += $vdb{$key}->{'execute-delay'};
+}
+
+my $veng_id = $engine_start_id + scalar(keys %rings);
+
+foreach my $cid (sort keys %ctxmap) {
+	my $ctx = $ctxmap{$cid};
+	my $elapsed = $last_ts - $first_ts;
+	my %stats;
+
+	$stats{'virtual'} = 1;
+	if (exists $vrunnable{$ctx}) {
+		$stats{'runnable'} = $vrunnable{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'runnable'} = 0;
+	}
+	if (exists $vqueued{$ctx}) {
+		$stats{'queued'} = $vqueued{$ctx} / $elapsed * 100.0;
+	} else {
+		$stats{'queued'} = 0;
+	}
+	$vreqw{$ctx} = 0 unless exists $vreqw{$ctx};
+	$stats{'wait'} = $vreqw{$ctx} / $elapsed * 100.0;
+	$stats{'count'} = scalar(grep {$ctx == $vdb{$_}->{'ctx'}} keys %vdb);
+
+	if ($html) {
+		html_stats(\%stats, $cid, $veng_id++);
+	} else {
+		stdio_stats(\%stats, $cid, $veng_id++);
+	}
+}
+
 exit 0 unless $html;
 
 print <<ENDHTML;
@@ -1129,6 +1269,7 @@ sub box_style
 }
 
 my $i = 0;
+my $req = 0;
 foreach my $key (sort sortQueue keys %db) {
 	my ($name, $ctx, $seqno) = ($db{$key}->{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'});
 	my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'});
@@ -1142,7 +1283,7 @@ foreach my $key (sort sortQueue keys %db) {
 	my $skey;
 
 	# submit to execute
-	unless (exists $skip_box{'queue'}) {
+	unless (exists $skip_box{'queue'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
 		$style = box_style($ctx, 'queue');
 		$content = "$name<br>$db{$key}->{'submit-delay'}us <small>($db{$key}->{'execute-delay'}us)</small>";
@@ -1153,7 +1294,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	# execute to start
 	$engine_start = $db{$key}->{'start'} unless defined $engine_start;
-	unless (exists $skip_box{'ready'}) {
+	unless (exists $skip_box{'ready'} or exists $db{$key}->{'virtual'}) {
 		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
 		$style = box_style($ctx, 'ready');
 		$content = "<small>$name<br>$db{$key}->{'execute-delay'}us</small>";
@@ -1194,7 +1335,7 @@ foreach my $key (sort sortQueue keys %db) {
 
 	$last_ts = $end;
 
-	last if $i > $max_items;
+	last if ++$req > $max_requests;
 }
 
 push @freqs, [$prev_freq_ts, $last_ts, $prev_freq] if $prev_freq;
@@ -1227,6 +1368,43 @@ if ($gpu_timeline) {
 	}
 }
 
+$req = 0;
+$veng_id = $engine_start_id + scalar(keys %rings);
+foreach my $key (@sorted_vkeys) {
+	my ($name, $ctx, $seqno) = ($vdb{$key}->{'name'}, $vdb{$key}->{'ctx'}, $vdb{$key}->{'seqno'});
+	my $queue = $vdb{$key}->{'queue'};
+	my $submit = $vdb{$key}->{'submit'};
+	my $engine_start = $db{$key}->{'engine-start'};
+	my ($content, $style, $startend, $skey);
+	my $group = $veng_id + $cids{$ctx};
+	my $subgroup = $ctx - $min_ctx;
+	my $type = ' type: \'range\',';
+	my $duration;
+
+	# submit to execute
+	unless (exists $skip_box{'queue'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno;
+		$style = box_style($ctx, 'queue');
+		$content = "$name<br>$vdb{$key}->{'submit-delay'}us <small>($vdb{$key}->{'execute-delay'}us)</small>";
+		$startend = 'start: ' . $queue . ', end: ' . $submit;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	# execute to start
+	$engine_start = $vdb{$key}->{'start'} unless defined $engine_start;
+	unless (exists $skip_box{'ready'}) {
+		$skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1;
+		$style = box_style($ctx, 'ready');
+		$content = "<small>$name<br>$vdb{$key}->{'execute-delay'}us</small>";
+		$startend = 'start: ' . $submit . ', end: ' . $engine_start;
+		print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n";
+		$i++;
+	}
+
+	last if ++$req > $max_requests;
+}
+
 my $end_ts = $first_ts + $width_us;
 $first_ts = $first_ts;
 
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

* [PATCH i-g-t 05/25] trace.pl: Virtual engine preemption support
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Use the 'completed?' tracepoint field to detect more robustly when a
request has been preempted and remove it from the engine database if so.

Otherwise the script can hit a scenario where the same global seqno is
mentioned multiple times (on an engine seqno), which aborts processing.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
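As an illustration only, the request_out handling after this change can be
thought of as the small state machine below, written in C rather than Perl.
The struct req type, the state names and the boolean return value (standing in
for the places where trace.pl would die) are assumptions made for this sketch.

```c
#include <stdbool.h>

enum req_state { REQ_ABSENT, REQ_RUNNING, REQ_DONE };

/* Hypothetical stand-in for one entry of the engine database. */
struct req {
	enum req_state state;
	double start;
	double end;
};

/* i915_request_in: (re)create the record; a resubmit after preemption lands here. */
static void request_in(struct req *r, double time)
{
	r->state = REQ_RUNNING;
	r->start = time;
}

/*
 * i915_request_out: only a completed request gets an end time. A request
 * leaving the engine without completing was preempted, so its record is
 * dropped and will be recreated by the next request_in.
 * Returns false where the script would abort on an inconsistent trace.
 */
static bool request_out(struct req *r, double time, bool completed)
{
	if (!completed) {
		r->state = REQ_ABSENT;
		return true;
	}

	if (r->state != REQ_RUNNING)
		return false;

	r->end = time;
	r->state = REQ_DONE;
	return true;
}
```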
 scripts/trace.pl | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 873376d0e063..c4ce7176b3e3 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -483,17 +483,23 @@ while (<>) {
 		$ringmap{$rings{$ring}} = $ring;
 		$db{$key} = \%req;
 	} elsif ($tp_name eq 'i915:i915_request_out:') {
-		my $gkey;
-
 		die unless exists $ctxengines{$ctx};
-		die unless exists $db{$key};
-		die unless exists $db{$key}->{'start'};
-		die if exists $db{$key}->{'end'};
 
-		$gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
+		if ($tp{'completed?'}) {
+			my $gkey;
+
+			die unless exists $db{$key};
+			die unless exists $db{$key}->{'start'};
+			die if exists $db{$key}->{'end'};
 
-		$db{$key}->{'end'} = $time;
-		$db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey};
+			$gkey = db_key($ctxengines{$ctx}, $ctx, $seqno);
+
+			$db{$key}->{'end'} = $time;
+			$db{$key}->{'notify'} = $notify{$gkey}
+						if exists $notify{$gkey};
+		} else {
+			delete $db{$key};
+		}
 	} elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
 		my $gkey;
 
-- 
2.20.1

* [PATCH i-g-t 06/25] wsim/media-bench: i915 balancing
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Support the i915 virtual engine from gem_wsim (-b i915) and media-bench.pl.

v2:
 * Add vm_destroy. (Chris)
 * Remove unneeded braces. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
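A minimal sketch of the context list layout used by this patch: each workload
context occupies two consecutive slots, with the even slot holding the main
context and the odd slot holding the extra balanced context, which is only
created when a context both targets specific engine instances and wants i915
balancing. The trimmed struct ctx, the engine enum and the helper name
pick_ctx_id are simplifications for illustration; the patch's own helper is
get_ctxid().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of engine ids; VCS means "any video engine". */
enum engine { RCS, BCS, VCS, VCS1, VCS2 };

/* Trimmed-down version of the struct ctx added by the patch. */
struct ctx {
	uint32_t id;
	bool targets_instance;
	bool wants_balance;
};

/* Pick the GEM context id for a step, given its context index and engine. */
static uint32_t pick_ctx_id(const struct ctx *ctx_list, unsigned int context,
			    enum engine engine)
{
	const struct ctx *ctx = &ctx_list[context * 2];	/* even slot: main */

	/* Mixed usage: balanced VCS submissions go to the odd slot. */
	if (ctx->targets_instance && ctx->wants_balance && engine == VCS)
		return ctx_list[context * 2 + 1].id;

	return ctx->id;
}
```

A context that only ever submits to VCS, or only ever to explicit instances,
never populates its odd slot, so every submission resolves to the even slot.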
 benchmarks/gem_wsim.c  | 302 +++++++++++++++++++++++++++++++++++------
 scripts/media-bench.pl |   9 +-
 2 files changed, 265 insertions(+), 46 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 48568ce4066e..afb53f4114d2 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -142,6 +142,14 @@ struct w_step
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
 
+struct ctx {
+	uint32_t id;
+	int priority;
+	bool targets_instance;
+	bool wants_balance;
+	unsigned int static_vcs;
+};
+
 struct workload
 {
 	unsigned int id;
@@ -163,11 +171,7 @@ struct workload
 	struct timespec repeat_start;
 
 	unsigned int nr_ctxs;
-	struct {
-		uint32_t id;
-		int priority;
-		unsigned int static_vcs;
-	} *ctx_list;
+	struct ctx *ctx_list;
 
 	int sync_timeline;
 	uint32_t sync_seqno;
@@ -224,6 +228,7 @@ static int fd;
 #define HEARTBEAT	(1<<7)
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
+#define I915		(1<<10)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -841,7 +846,10 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
 	if (engine == VCS2 && (flags & VCS2REMAP))
 		engine = BCS;
 
-	eb->flags = eb_engine_map[engine];
+	if ((flags & I915) && engine == VCS)
+		eb->flags = 0;
+	else
+		eb->flags = eb_engine_map[engine];
 }
 
 static void
@@ -867,6 +875,23 @@ get_status_objects(struct workload *wrk)
 		return wrk->status_object;
 }
 
+static struct ctx *
+__get_ctx(struct workload *wrk, struct w_step *w)
+{
+	return &wrk->ctx_list[w->context * 2];
+}
+
+static uint32_t
+get_ctxid(struct workload *wrk, struct w_step *w)
+{
+	struct ctx *ctx = __get_ctx(wrk, w);
+
+	if (ctx->targets_instance && ctx->wants_balance && w->engine == VCS)
+		return wrk->ctx_list[w->context * 2 + 1].id;
+	else
+		return wrk->ctx_list[w->context * 2].id;
+}
+
 static void
 alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 {
@@ -919,7 +944,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
 	w->eb.buffer_count = j + 1;
-	w->eb.rsvd1 = wrk->ctx_list[w->context].id;
+	w->eb.rsvd1 = get_ctxid(wrk, w);
 
 	if (flags & SWAPVCS && engine == VCS1)
 		engine = VCS2;
@@ -932,17 +957,48 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		printf("%x|", w->obj[i].handle);
 	printf(" %10lu flags=%llx bb=%x[%u] ctx[%u]=%u\n",
 		w->bb_sz, w->eb.flags, w->bb_handle, j, w->context,
-		wrk->ctx_list[w->context].id);
+		get_ctxid(wrk, w));
 #endif
 }
 
+static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
+{
+	struct drm_i915_gem_context_param param = {
+		.ctx_id = ctx_id,
+		.param = I915_CONTEXT_PARAM_PRIORITY,
+		.value = prio,
+	};
+
+	if (prio)
+		gem_context_set_param(fd, &param);
+}
+
+static int __vm_destroy(int i915, uint32_t vm_id)
+{
+	struct drm_i915_gem_vm_control ctl = { .vm_id = vm_id };
+	int err = 0;
+
+	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_VM_DESTROY, &ctl)) {
+		err = -errno;
+		igt_assume(err);
+	}
+
+	errno = 0;
+	return err;
+}
+
+static void vm_destroy(int i915, uint32_t vm_id)
+{
+	igt_assert_eq(__vm_destroy(i915, vm_id), 0);
+}
+
 static void
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
 	unsigned int ctx_vcs;
 	int max_ctx = -1;
 	struct w_step *w;
-	int i;
+	int i, j;
 
 	wrk->id = id;
 	wrk->prng = rand();
@@ -975,45 +1031,187 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Pre-scan workload steps to allocate context list storage.
+	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
-		if ((int)w->context > max_ctx) {
-			int delta = w->context + 1 - wrk->nr_ctxs;
+		int ctx = w->context * 2 + 1; /* Odd slots are special. */
+		int delta;
+
+		if (ctx <= max_ctx)
+			continue;
+
+		delta = ctx + 1 - wrk->nr_ctxs;
 
-			wrk->nr_ctxs += delta;
-			wrk->ctx_list = realloc(wrk->ctx_list,
-						wrk->nr_ctxs *
-						sizeof(*wrk->ctx_list));
-			memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
-			       delta * sizeof(*wrk->ctx_list));
+		wrk->nr_ctxs += delta;
+		wrk->ctx_list = realloc(wrk->ctx_list,
+					wrk->nr_ctxs * sizeof(*wrk->ctx_list));
+		memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
+			delta * sizeof(*wrk->ctx_list));
 
-			max_ctx = w->context;
+		max_ctx = ctx;
+	}
+
+	/*
+	 * Identify if contexts target specific engine instances and if they
+	 * want to be balanced.
+	 */
+	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		bool targets = false;
+		bool balance = false;
+
+		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+			if (w->type != BATCH)
+				continue;
+
+			if (w->context != (j / 2))
+				continue;
+
+			if (w->engine == VCS)
+				balance = true;
+			else
+				targets = true;
 		}
 
-		if (!wrk->ctx_list[w->context].id) {
-			struct drm_i915_gem_context_create arg = {};
+		if (flags & I915) {
+			wrk->ctx_list[j].targets_instance = targets;
+			wrk->ctx_list[j].wants_balance = balance;
+		}
+	}
 
-			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &arg);
-			igt_assert(arg.ctx_id);
+	/*
+	 * Create and configure contexts.
+	 */
+	for (i = 0; i < wrk->nr_ctxs; i += 2) {
+		struct ctx *ctx = &wrk->ctx_list[i];
+		uint32_t ctx_id, share_vm = 0;
 
-			wrk->ctx_list[w->context].id = arg.ctx_id;
+		if (ctx->id)
+			continue;
 
-			if (flags & GLOBAL_BALANCE) {
-				wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
-				context_vcs_rr ^= 1;
-			} else {
-				wrk->ctx_list[w->context].static_vcs = ctx_vcs;
-				ctx_vcs ^= 1;
-			}
+		if (flags & I915) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+			};
+			struct drm_i915_gem_context_create_ext args = { };
 
-			if (wrk->prio) {
+			/* Find existing context to share ppgtt with. */
+			for (j = 0; j < wrk->nr_ctxs; j++) {
 				struct drm_i915_gem_context_param param = {
-					.ctx_id = arg.ctx_id,
-					.param = I915_CONTEXT_PARAM_PRIORITY,
-					.value = wrk->prio,
+					.param = I915_CONTEXT_PARAM_VM,
 				};
-				gem_context_set_param(fd, &param);
+
+				if (!wrk->ctx_list[j].id)
+					continue;
+
+				param.ctx_id = wrk->ctx_list[j].id;
+
+				gem_context_get_param(fd, &param);
+				igt_assert(param.value);
+
+				share_vm = param.value;
+
+				ext.param.value = share_vm;
+				args.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
+				args.extensions = to_user_pointer(&ext);
+				break;
 			}
+
+			if (!ctx->targets_instance)
+				args.flags |=
+				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			ctx_id = args.ctx_id;
+		} else {
+			struct drm_i915_gem_context_create args = {};
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
+			ctx_id = args.ctx_id;
+		}
+
+		igt_assert(ctx_id);
+		ctx->id = ctx_id;
+
+		if (flags & GLOBAL_BALANCE) {
+			ctx->static_vcs = context_vcs_rr;
+			context_vcs_rr ^= 1;
+		} else {
+			ctx->static_vcs = ctx_vcs;
+			ctx_vcs ^= 1;
 		}
+
+		__ctx_set_prio(ctx_id, wrk->prio);
+
+		/*
+		 * Do we need a separate context to satisfy this workloads which
+		 * both want to target specific engines and be balanced by i915?
+		 */
+		if ((flags & I915) && ctx->wants_balance &&
+		    ctx->targets_instance) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+				.param.value = share_vm,
+			};
+			struct drm_i915_gem_context_create_ext args = {
+				.extensions = to_user_pointer(&ext),
+				.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
+				    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
+			};
+
+			igt_assert(share_vm);
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			igt_assert(args.ctx_id);
+			ctx_id = args.ctx_id;
+			wrk->ctx_list[i + 1].id = args.ctx_id;
+
+			__ctx_set_prio(ctx_id, wrk->prio);
+		}
+
+		if (ctx->wants_balance) {
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
+				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
+				.num_siblings = 2,
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
+				.extensions = to_user_pointer(&load_balance),
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_INVALID,
+					  .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+
+			struct drm_i915_gem_context_param param = {
+				.ctx_id = ctx_id,
+				.param = I915_CONTEXT_PARAM_ENGINES,
+				.size = sizeof(set_engines),
+				.value = to_user_pointer(&set_engines),
+			};
+
+			gem_context_set_param(fd, &param);
+		}
+
+		if (share_vm)
+			vm_destroy(fd, share_vm);
 	}
 
 	/* Record default preemption. */
@@ -1029,7 +1227,6 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
 		struct w_step *w2;
-		int j;
 
 		if (w->type != PREEMPTION)
 			continue;
@@ -1387,7 +1584,7 @@ static enum intel_engine_id
 context_balance(const struct workload_balancer *balancer,
 		struct workload *wrk, struct w_step *w)
 {
-	return get_vcs_engine(wrk->ctx_list[w->context].static_vcs);
+	return get_vcs_engine(__get_ctx(wrk, w)->static_vcs);
 }
 
 static unsigned int
@@ -1581,6 +1778,12 @@ static const struct workload_balancer all_balancers[] = {
 		.get_qd = get_engine_busy,
 		.balance = busy_avg_balance,
 	},
+	{
+		.id = 11,
+		.name = "i915",
+		.desc = "i915 balancing.",
+		.flags = I915,
+	},
 };
 
 static unsigned int
@@ -1959,7 +2162,8 @@ static void *run_workload(void *data)
 			last_sync = false;
 
 			wrk->nr_bb[engine]++;
-			if (engine == VCS && wrk->balancer) {
+			if (engine == VCS && wrk->balancer &&
+			    wrk->balancer->balance) {
 				engine = wrk->balancer->balance(wrk->balancer,
 								wrk, w);
 				wrk->nr_bb[engine]++;
@@ -2386,6 +2590,12 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
+	if ((flags & VCS2REMAP) && (flags & I915)) {
+		if (verbose)
+			fprintf(stderr, "VCS remapping not supported with i915 balancing!\n");
+		return 1;
+	}
+
 	if (!nop_calibration) {
 		if (verbose > 1)
 			printf("Calibrating nop delay with %u%% tolerance...\n",
@@ -2471,11 +2681,17 @@ int main(int argc, char **argv)
 		printf("%u client%s.\n", clients, clients > 1 ? "s" : "");
 		if (flags & SWAPVCS)
 			printf("Swapping VCS rings between clients.\n");
-		if (flags & GLOBAL_BALANCE)
-			printf("Using %s balancer in global mode.\n",
-			       balancer->name);
-		else if (balancer)
+		if (flags & GLOBAL_BALANCE) {
+			if (flags & I915) {
+				printf("Ignoring global balancing with i915!\n");
+				flags &= ~GLOBAL_BALANCE;
+			} else {
+				printf("Using %s balancer in global mode.\n",
+				       balancer->name);
+			}
+		} else if (balancer) {
 			printf("Using %s balancer.\n", balancer->name);
+		}
 	}
 
 	if (master_workload >= 0 && clients == 1)
@@ -2492,7 +2708,7 @@ int main(int argc, char **argv)
 		if (flags & SWAPVCS && i & 1)
 			flags_ &= ~SWAPVCS;
 
-		if (flags & GLOBAL_BALANCE) {
+		if ((flags & GLOBAL_BALANCE) && !(flags & I915)) {
 			w[i]->balancer = &global_balancer;
 			w[i]->global_wrk = w[0];
 			w[i]->global_balancer = balancer;
diff --git a/scripts/media-bench.pl b/scripts/media-bench.pl
index f1cd59a253c2..1cd8205ff07c 100755
--- a/scripts/media-bench.pl
+++ b/scripts/media-bench.pl
@@ -49,10 +49,11 @@ my $nop;
 my %opts;
 
 my @balancers = ( 'rr', 'rand', 'qd', 'qdr', 'qdavg', 'rt', 'rtr', 'rtavg',
-		  'context', 'busy', 'busy-avg' );
+		  'context', 'busy', 'busy-avg', 'i915' );
 my %bal_skip_H = ( 'rr' => 1, 'rand' => 1, 'context' => 1, , 'busy' => 1,
-		   'busy-avg' => 1 );
-my %bal_skip_R = ();
+		   'busy-avg' => 1, 'i915' => 1 );
+my %bal_skip_R = ( 'i915' => 1 );
+my %bal_skip_G = ( 'i915' => 1 );
 
 my @workloads = (
 	'media_load_balance_17i7.wsim',
@@ -498,6 +499,8 @@ foreach my $wrk (@saturation_workloads) {
 				my $bid;
 
 				if ($bal ne '') {
+					next GBAL if $G =~ '-G' and exists $bal_skip_G{$bal};
+
 					push @xargs, "-b $bal";
 					push @xargs, '-R' unless exists $bal_skip_R{$bal};
 					push @xargs, $G if $G ne '';
-- 
2.20.1

* [igt-dev] [PATCH i-g-t 06/25] wsim/media-bench: i915 balancing
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Support the i915 virtual engine from gem_wsim (-b i915) and media-bench.pl.

v2:
 * Add vm_destroy. (Chris)
 * Remove unneeded braces. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
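Note for readers: with -b i915 each workload context gets two slots in ctx_list. The even slot holds the primary context. The odd slot is only populated when the same workload context both targets specific engines and submits to the generic VCS. Below is a minimal sketch of that lookup, assuming simplified stand-in types. Every sketch_ and SK_ name is hypothetical and not part of gem_wsim, and the BATCH step type and I915 flag checks done by the patch are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the gem_wsim types; only the fields used here. */
enum sketch_engine { SK_RCS, SK_BCS, SK_VCS, SK_VCS1, SK_VCS2 };

struct sketch_ctx {
	uint32_t id;
	bool targets_instance;
	bool wants_balance;
};

struct sketch_step {
	unsigned int context;
	enum sketch_engine engine;
};

/*
 * Scan the steps of one workload context and record whether it submits to
 * the generic VCS (wants balancing), to specific engines, or to both.
 */
static void
sketch_classify(struct sketch_ctx *ctx_list, unsigned int context,
		const struct sketch_step *steps, unsigned int nr_steps)
{
	struct sketch_ctx *ctx = &ctx_list[context * 2];
	unsigned int i;

	for (i = 0; i < nr_steps; i++) {
		if (steps[i].context != context)
			continue;

		if (steps[i].engine == SK_VCS)
			ctx->wants_balance = true;
		else
			ctx->targets_instance = true;
	}
}

/* Pick the odd (balanced) slot only for VCS steps of a mixed context. */
static uint32_t
sketch_get_ctxid(const struct sketch_ctx *ctx_list,
		 const struct sketch_step *step)
{
	const struct sketch_ctx *ctx = &ctx_list[step->context * 2];

	if (ctx->targets_instance && ctx->wants_balance &&
	    step->engine == SK_VCS) {
		/* The twin context must have been created beforehand. */
		assert(ctx_list[step->context * 2 + 1].id);
		return ctx_list[step->context * 2 + 1].id;
	}

	return ctx->id;
}
```

A context which only ever submits to VCS keeps using its even slot, since the engine map with load balancing is then applied to the primary context itself.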
 benchmarks/gem_wsim.c  | 302 +++++++++++++++++++++++++++++++++++------
 scripts/media-bench.pl |   9 +-
 2 files changed, 265 insertions(+), 46 deletions(-)
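Note for readers: the pre-scan in prepare_workload() grows ctx_list so that slot (context * 2 + 1) always exists, zeroing any new entries. A rough sketch of that growth step follows, assuming a hypothetical sketch_slot type and sketch_reserve() helper. Unlike the patch, this sketch also checks the realloc() result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ctx; only the id is kept. */
struct sketch_slot {
	unsigned int id;
};

/*
 * Grow the list so both slots of 'context' exist (even: primary context,
 * odd: optional balanced twin). New slots are zeroed. Returns the possibly
 * moved list, or NULL on allocation failure, in which case the old list is
 * left untouched.
 */
static struct sketch_slot *
sketch_reserve(struct sketch_slot *list, unsigned int *nr,
	       unsigned int context)
{
	unsigned int last = context * 2 + 1;
	unsigned int old = *nr;
	struct sketch_slot *tmp;

	if (last < old)
		return list; /* Already large enough. */

	tmp = realloc(list, (last + 1) * sizeof(*tmp));
	if (!tmp)
		return NULL;

	/* Zero only the newly added tail. */
	memset(&tmp[old], 0, (last + 1 - old) * sizeof(*tmp));
	*nr = last + 1;
	assert(*nr % 2 == 0); /* Slots always come in pairs. */

	return tmp;
}
```

Because slots come in pairs, the later loops in the patch can walk the list with a stride of two and treat j / 2 as the workload context index.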

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 48568ce4066e..afb53f4114d2 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -142,6 +142,14 @@ struct w_step
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
 
+struct ctx {
+	uint32_t id;
+	int priority;
+	bool targets_instance;
+	bool wants_balance;
+	unsigned int static_vcs;
+};
+
 struct workload
 {
 	unsigned int id;
@@ -163,11 +171,7 @@ struct workload
 	struct timespec repeat_start;
 
 	unsigned int nr_ctxs;
-	struct {
-		uint32_t id;
-		int priority;
-		unsigned int static_vcs;
-	} *ctx_list;
+	struct ctx *ctx_list;
 
 	int sync_timeline;
 	uint32_t sync_seqno;
@@ -224,6 +228,7 @@ static int fd;
 #define HEARTBEAT	(1<<7)
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
+#define I915		(1<<10)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -841,7 +846,10 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
 	if (engine == VCS2 && (flags & VCS2REMAP))
 		engine = BCS;
 
-	eb->flags = eb_engine_map[engine];
+	if ((flags & I915) && engine == VCS)
+		eb->flags = 0;
+	else
+		eb->flags = eb_engine_map[engine];
 }
 
 static void
@@ -867,6 +875,23 @@ get_status_objects(struct workload *wrk)
 		return wrk->status_object;
 }
 
+static struct ctx *
+__get_ctx(struct workload *wrk, struct w_step *w)
+{
+	return &wrk->ctx_list[w->context * 2];
+}
+
+static uint32_t
+get_ctxid(struct workload *wrk, struct w_step *w)
+{
+	struct ctx *ctx = __get_ctx(wrk, w);
+
+	if (ctx->targets_instance && ctx->wants_balance && w->engine == VCS)
+		return wrk->ctx_list[w->context * 2 + 1].id;
+	else
+		return wrk->ctx_list[w->context * 2].id;
+}
+
 static void
 alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 {
@@ -919,7 +944,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
 	w->eb.buffer_count = j + 1;
-	w->eb.rsvd1 = wrk->ctx_list[w->context].id;
+	w->eb.rsvd1 = get_ctxid(wrk, w);
 
 	if (flags & SWAPVCS && engine == VCS1)
 		engine = VCS2;
@@ -932,17 +957,48 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		printf("%x|", w->obj[i].handle);
 	printf(" %10lu flags=%llx bb=%x[%u] ctx[%u]=%u\n",
 		w->bb_sz, w->eb.flags, w->bb_handle, j, w->context,
-		wrk->ctx_list[w->context].id);
+		get_ctxid(wrk, w));
 #endif
 }
 
+static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio)
+{
+	struct drm_i915_gem_context_param param = {
+		.ctx_id = ctx_id,
+		.param = I915_CONTEXT_PARAM_PRIORITY,
+		.value = prio,
+	};
+
+	if (prio)
+		gem_context_set_param(fd, &param);
+}
+
+static int __vm_destroy(int i915, uint32_t vm_id)
+{
+	struct drm_i915_gem_vm_control ctl = { .vm_id = vm_id };
+	int err = 0;
+
+	if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_VM_DESTROY, &ctl)) {
+		err = -errno;
+		igt_assume(err);
+	}
+
+	errno = 0;
+	return err;
+}
+
+static void vm_destroy(int i915, uint32_t vm_id)
+{
+	igt_assert_eq(__vm_destroy(i915, vm_id), 0);
+}
+
 static void
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
 	unsigned int ctx_vcs;
 	int max_ctx = -1;
 	struct w_step *w;
-	int i;
+	int i, j;
 
 	wrk->id = id;
 	wrk->prng = rand();
@@ -975,45 +1031,187 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Pre-scan workload steps to allocate context list storage.
+	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
-		if ((int)w->context > max_ctx) {
-			int delta = w->context + 1 - wrk->nr_ctxs;
+		int ctx = w->context * 2 + 1; /* Odd slots are special. */
+		int delta;
+
+		if (ctx <= max_ctx)
+			continue;
+
+		delta = ctx + 1 - wrk->nr_ctxs;
 
-			wrk->nr_ctxs += delta;
-			wrk->ctx_list = realloc(wrk->ctx_list,
-						wrk->nr_ctxs *
-						sizeof(*wrk->ctx_list));
-			memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
-			       delta * sizeof(*wrk->ctx_list));
+		wrk->nr_ctxs += delta;
+		wrk->ctx_list = realloc(wrk->ctx_list,
+					wrk->nr_ctxs * sizeof(*wrk->ctx_list));
+		memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0,
+			delta * sizeof(*wrk->ctx_list));
 
-			max_ctx = w->context;
+		max_ctx = ctx;
+	}
+
+	/*
+	 * Identify if contexts target specific engine instances and if they
+	 * want to be balanced.
+	 */
+	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		bool targets = false;
+		bool balance = false;
+
+		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+			if (w->type != BATCH)
+				continue;
+
+			if (w->context != (j / 2))
+				continue;
+
+			if (w->engine == VCS)
+				balance = true;
+			else
+				targets = true;
 		}
 
-		if (!wrk->ctx_list[w->context].id) {
-			struct drm_i915_gem_context_create arg = {};
+		if (flags & I915) {
+			wrk->ctx_list[j].targets_instance = targets;
+			wrk->ctx_list[j].wants_balance = balance;
+		}
+	}
 
-			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &arg);
-			igt_assert(arg.ctx_id);
+	/*
+	 * Create and configure contexts.
+	 */
+	for (i = 0; i < wrk->nr_ctxs; i += 2) {
+		struct ctx *ctx = &wrk->ctx_list[i];
+		uint32_t ctx_id, share_vm = 0;
 
-			wrk->ctx_list[w->context].id = arg.ctx_id;
+		if (ctx->id)
+			continue;
 
-			if (flags & GLOBAL_BALANCE) {
-				wrk->ctx_list[w->context].static_vcs = context_vcs_rr;
-				context_vcs_rr ^= 1;
-			} else {
-				wrk->ctx_list[w->context].static_vcs = ctx_vcs;
-				ctx_vcs ^= 1;
-			}
+		if (flags & I915) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+			};
+			struct drm_i915_gem_context_create_ext args = { };
 
-			if (wrk->prio) {
+			/* Find existing context to share ppgtt with. */
+			for (j = 0; j < wrk->nr_ctxs; j++) {
 				struct drm_i915_gem_context_param param = {
-					.ctx_id = arg.ctx_id,
-					.param = I915_CONTEXT_PARAM_PRIORITY,
-					.value = wrk->prio,
+					.param = I915_CONTEXT_PARAM_VM,
 				};
-				gem_context_set_param(fd, &param);
+
+				if (!wrk->ctx_list[j].id)
+					continue;
+
+				param.ctx_id = wrk->ctx_list[j].id;
+
+				gem_context_get_param(fd, &param);
+				igt_assert(param.value);
+
+				share_vm = param.value;
+
+				ext.param.value = share_vm;
+				args.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS;
+				args.extensions = to_user_pointer(&ext);
+				break;
 			}
+
+			if (!ctx->targets_instance)
+				args.flags |=
+				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			ctx_id = args.ctx_id;
+		} else {
+			struct drm_i915_gem_context_create args = {};
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args);
+			ctx_id = args.ctx_id;
+		}
+
+		igt_assert(ctx_id);
+		ctx->id = ctx_id;
+
+		if (flags & GLOBAL_BALANCE) {
+			ctx->static_vcs = context_vcs_rr;
+			context_vcs_rr ^= 1;
+		} else {
+			ctx->static_vcs = ctx_vcs;
+			ctx_vcs ^= 1;
 		}
+
+		__ctx_set_prio(ctx_id, wrk->prio);
+
+		/*
+		 * Do we need a separate context to satisfy workloads which
+		 * both want to target specific engines and be balanced by i915?
+		 */
+		if ((flags & I915) && ctx->wants_balance &&
+		    ctx->targets_instance) {
+			struct drm_i915_gem_context_create_ext_setparam ext = {
+				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+				.param.param = I915_CONTEXT_PARAM_VM,
+				.param.value = share_vm,
+			};
+			struct drm_i915_gem_context_create_ext args = {
+				.extensions = to_user_pointer(&ext),
+				.flags =
+				    I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS |
+				    I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
+			};
+
+			igt_assert(share_vm);
+
+			drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT,
+				 &args);
+
+			igt_assert(args.ctx_id);
+			ctx_id = args.ctx_id;
+			wrk->ctx_list[i + 1].id = args.ctx_id;
+
+			__ctx_set_prio(ctx_id, wrk->prio);
+		}
+
+		if (ctx->wants_balance) {
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
+				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
+				.num_siblings = 2,
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
+				.extensions = to_user_pointer(&load_balance),
+				.engines = {
+					{ .engine_class = I915_ENGINE_CLASS_INVALID,
+					  .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 0 },
+					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
+					  .engine_instance = 1 },
+				},
+			};
+
+			struct drm_i915_gem_context_param param = {
+				.ctx_id = ctx_id,
+				.param = I915_CONTEXT_PARAM_ENGINES,
+				.size = sizeof(set_engines),
+				.value = to_user_pointer(&set_engines),
+			};
+
+			gem_context_set_param(fd, &param);
+		}
+
+		if (share_vm)
+			vm_destroy(fd, share_vm);
 	}
 
 	/* Record default preemption. */
@@ -1029,7 +1227,6 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 */
 	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
 		struct w_step *w2;
-		int j;
 
 		if (w->type != PREEMPTION)
 			continue;
@@ -1387,7 +1584,7 @@ static enum intel_engine_id
 context_balance(const struct workload_balancer *balancer,
 		struct workload *wrk, struct w_step *w)
 {
-	return get_vcs_engine(wrk->ctx_list[w->context].static_vcs);
+	return get_vcs_engine(__get_ctx(wrk, w)->static_vcs);
 }
 
 static unsigned int
@@ -1581,6 +1778,12 @@ static const struct workload_balancer all_balancers[] = {
 		.get_qd = get_engine_busy,
 		.balance = busy_avg_balance,
 	},
+	{
+		.id = 11,
+		.name = "i915",
+		.desc = "i915 balancing.",
+		.flags = I915,
+	},
 };
 
 static unsigned int
@@ -1959,7 +2162,8 @@ static void *run_workload(void *data)
 			last_sync = false;
 
 			wrk->nr_bb[engine]++;
-			if (engine == VCS && wrk->balancer) {
+			if (engine == VCS && wrk->balancer &&
+			    wrk->balancer->balance) {
 				engine = wrk->balancer->balance(wrk->balancer,
 								wrk, w);
 				wrk->nr_bb[engine]++;
@@ -2386,6 +2590,12 @@ int main(int argc, char **argv)
 		return 1;
 	}
 
+	if ((flags & VCS2REMAP) && (flags & I915)) {
+		if (verbose)
+			fprintf(stderr, "VCS remapping not supported with i915 balancing!\n");
+		return 1;
+	}
+
 	if (!nop_calibration) {
 		if (verbose > 1)
 			printf("Calibrating nop delay with %u%% tolerance...\n",
@@ -2471,11 +2681,17 @@ int main(int argc, char **argv)
 		printf("%u client%s.\n", clients, clients > 1 ? "s" : "");
 		if (flags & SWAPVCS)
 			printf("Swapping VCS rings between clients.\n");
-		if (flags & GLOBAL_BALANCE)
-			printf("Using %s balancer in global mode.\n",
-			       balancer->name);
-		else if (balancer)
+		if (flags & GLOBAL_BALANCE) {
+			if (flags & I915) {
+				printf("Ignoring global balancing with i915!\n");
+				flags &= ~GLOBAL_BALANCE;
+			} else {
+				printf("Using %s balancer in global mode.\n",
+				       balancer->name);
+			}
+		} else if (balancer) {
 			printf("Using %s balancer.\n", balancer->name);
+		}
 	}
 
 	if (master_workload >= 0 && clients == 1)
@@ -2492,7 +2708,7 @@ int main(int argc, char **argv)
 		if (flags & SWAPVCS && i & 1)
 			flags_ &= ~SWAPVCS;
 
-		if (flags & GLOBAL_BALANCE) {
+		if ((flags & GLOBAL_BALANCE) && !(flags & I915)) {
 			w[i]->balancer = &global_balancer;
 			w[i]->global_wrk = w[0];
 			w[i]->global_balancer = balancer;
diff --git a/scripts/media-bench.pl b/scripts/media-bench.pl
index f1cd59a253c2..1cd8205ff07c 100755
--- a/scripts/media-bench.pl
+++ b/scripts/media-bench.pl
@@ -49,10 +49,11 @@ my $nop;
 my %opts;
 
 my @balancers = ( 'rr', 'rand', 'qd', 'qdr', 'qdavg', 'rt', 'rtr', 'rtavg',
-		  'context', 'busy', 'busy-avg' );
+		  'context', 'busy', 'busy-avg', 'i915' );
 my %bal_skip_H = ( 'rr' => 1, 'rand' => 1, 'context' => 1, , 'busy' => 1,
-		   'busy-avg' => 1 );
-my %bal_skip_R = ();
+		   'busy-avg' => 1, 'i915' => 1 );
+my %bal_skip_R = ( 'i915' => 1 );
+my %bal_skip_G = ( 'i915' => 1 );
 
 my @workloads = (
 	'media_load_balance_17i7.wsim',
@@ -498,6 +499,8 @@ foreach my $wrk (@saturation_workloads) {
 				my $bid;
 
 				if ($bal ne '') {
+					next GBAL if $G =~ '-G' and exists $bal_skip_G{$bal};
+
 					push @xargs, "-b $bal";
 					push @xargs, '-R' unless exists $bal_skip_R{$bal};
 					push @xargs, $G if $G ne '';
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 07/25] gem_wsim: Use IGT uapi headers
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We are moving towards bumping the uAPI headers more often instead of using
too many local struct/ioctl/param definitions, since the latter are harder
to rebase and maintain.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
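Note for readers: the two flags switched over here control how execbuf uses rsvd2 for fences. A small sketch of the packing as gem_wsim uses it is below, assuming a hypothetical sketch_eb struct in place of drm_i915_gem_execbuffer2. The SKETCH_ defines simply mirror the values of the LOCAL_ defines being removed, and the out-fence extraction mirrors the rsvd2 >> 32 in do_eb().

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the LOCAL_I915_EXEC_FENCE_* defines removed by this patch. */
#define SKETCH_EXEC_FENCE_IN	(1 << 16)
#define SKETCH_EXEC_FENCE_OUT	(1 << 17)

/* Hypothetical stand-in for the two execbuffer2 fields used here. */
struct sketch_eb {
	uint64_t flags;
	uint64_t rsvd2;
};

/* Ask for the submission to wait on in_fence (a sync file fd). */
static void sketch_set_in_fence(struct sketch_eb *eb, int in_fence)
{
	assert(in_fence > 0);

	eb->flags |= SKETCH_EXEC_FENCE_IN;
	eb->rsvd2 = (uint32_t)in_fence; /* Input fence: lower 32 bits. */
}

/* After submission, read back the output fence from the upper 32 bits. */
static int sketch_get_out_fence(const struct sketch_eb *eb)
{
	if (!(eb->flags & SKETCH_EXEC_FENCE_OUT))
		return -1;

	return (int)(eb->rsvd2 >> 32);
}
```

This is also why do_eb() calls gem_execbuf_wr() when an out fence is requested: the struct has to be copied back from the kernel for rsvd2 to carry the new fd.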
 benchmarks/gem_wsim.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index afb53f4114d2..b91dbdec2cce 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -41,7 +41,6 @@
 #include <limits.h>
 #include <pthread.h>
 
-
 #include "intel_chipset.h"
 #include "intel_reg.h"
 #include "drm.h"
@@ -57,9 +56,6 @@
 
 #include "ewma.h"
 
-#define LOCAL_I915_EXEC_FENCE_IN              (1<<16)
-#define LOCAL_I915_EXEC_FENCE_OUT             (1<<17)
-
 enum intel_engine_id {
 	RCS,
 	BCS,
@@ -863,7 +859,7 @@ eb_update_flags(struct w_step *w, enum intel_engine_id engine,
 
 	igt_assert(w->emit_fence <= 0);
 	if (w->emit_fence)
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_OUT;
+		w->eb.flags |= I915_EXEC_FENCE_OUT;
 }
 
 static struct drm_i915_gem_exec_object2 *
@@ -2016,16 +2012,16 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 		igt_assert(tgt >= 0 && tgt < w->idx);
 		igt_assert(wrk->steps[tgt].emit_fence > 0);
 
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_IN;
+		w->eb.flags |= I915_EXEC_FENCE_IN;
 		w->eb.rsvd2 = wrk->steps[tgt].emit_fence;
 	}
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT)
+	if (w->eb.flags & I915_EXEC_FENCE_OUT)
 		gem_execbuf_wr(fd, &w->eb);
 	else
 		gem_execbuf(fd, &w->eb);
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT) {
+	if (w->eb.flags & I915_EXEC_FENCE_OUT) {
 		w->emit_fence = w->eb.rsvd2 >> 32;
 		igt_assert(w->emit_fence > 0);
 	}
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [igt-dev] [PATCH i-g-t 07/25] gem_wsim: Use IGT uapi headers
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We are moving towards bumping the uAPI headers more often instead of using
too many local struct/ioctl/param definitions, since the latter are harder
to rebase and maintain.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index afb53f4114d2..b91dbdec2cce 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -41,7 +41,6 @@
 #include <limits.h>
 #include <pthread.h>
 
-
 #include "intel_chipset.h"
 #include "intel_reg.h"
 #include "drm.h"
@@ -57,9 +56,6 @@
 
 #include "ewma.h"
 
-#define LOCAL_I915_EXEC_FENCE_IN              (1<<16)
-#define LOCAL_I915_EXEC_FENCE_OUT             (1<<17)
-
 enum intel_engine_id {
 	RCS,
 	BCS,
@@ -863,7 +859,7 @@ eb_update_flags(struct w_step *w, enum intel_engine_id engine,
 
 	igt_assert(w->emit_fence <= 0);
 	if (w->emit_fence)
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_OUT;
+		w->eb.flags |= I915_EXEC_FENCE_OUT;
 }
 
 static struct drm_i915_gem_exec_object2 *
@@ -2016,16 +2012,16 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 		igt_assert(tgt >= 0 && tgt < w->idx);
 		igt_assert(wrk->steps[tgt].emit_fence > 0);
 
-		w->eb.flags |= LOCAL_I915_EXEC_FENCE_IN;
+		w->eb.flags |= I915_EXEC_FENCE_IN;
 		w->eb.rsvd2 = wrk->steps[tgt].emit_fence;
 	}
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT)
+	if (w->eb.flags & I915_EXEC_FENCE_OUT)
 		gem_execbuf_wr(fd, &w->eb);
 	else
 		gem_execbuf(fd, &w->eb);
 
-	if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT) {
+	if (w->eb.flags & I915_EXEC_FENCE_OUT) {
 		w->emit_fence = w->eb.rsvd2 >> 32;
 		igt_assert(w->emit_fence > 0);
 	}
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 08/25] gem_wsim: Factor out common error handling
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

There is a repeated error handling pattern which can be moved to a macro
for better readability in the command parsing loop.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
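Note for readers: the pattern being factored out is "if the condition holds, print an error when verbose and return NULL". A standalone sketch is below. It assumes a hypothetical sketch_parse_delay() caller, and it wraps the macro in do { } while (0), which is a common hardening variation and not what the patch itself does (the patch uses a bare brace block).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

static int verbose = 1;

/* Print to stderr only when verbose output was requested. */
static void __attribute__((format(printf, 1, 2)))
wsim_err(const char *fmt, ...)
{
	va_list ap;

	if (!verbose)
		return;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/*
 * Bail out of the enclosing function with NULL when cond is true.
 * Variation on the patch: do/while(0) keeps "check_arg(...);" safe inside
 * an unbraced if/else.
 */
#define check_arg(cond, fmt, ...) \
do { \
	if (cond) { \
		wsim_err(fmt, __VA_ARGS__); \
		return NULL; \
	} \
} while (0)

/* Hypothetical caller: parse a positive delay or fail with NULL. */
static const char *
sketch_parse_delay(const char *field, unsigned int nr_steps, int *delay)
{
	int tmp;

	assert(field && delay);

	tmp = atoi(field);
	check_arg(tmp <= 0, "Invalid delay at step %u!\n", nr_steps);

	*delay = tmp;
	return field;
}
```

Since the macro hides a return NULL, it is only usable in functions returning a pointer, which holds for parse_workload().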
 benchmarks/gem_wsim.c | 244 +++++++++++++++---------------------------
 1 file changed, 88 insertions(+), 156 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index b91dbdec2cce..fceb850d0ca0 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -289,6 +289,27 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
 	return 0;
 }
 
+static void __attribute__((format(printf, 1, 2)))
+wsim_err(const char *fmt, ...)
+{
+	va_list ap;
+
+	if (!verbose)
+		return;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+}
+
+#define check_arg(cond, fmt, ...) \
+{ \
+	if (cond) { \
+		wsim_err(fmt, __VA_ARGS__); \
+		return NULL; \
+	} \
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -319,14 +340,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid delay at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp <= 0,
+						  "Invalid delay at step %u!\n",
+						  nr_steps);
 					step.type = DELAY;
 					step.delay = tmp;
 					goto add_step;
@@ -335,14 +351,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid period at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp <= 0,
+						  "Invalid period at step %u!\n",
+						  nr_steps);
 					step.type = PERIOD;
 					step.period = tmp;
 					goto add_step;
@@ -352,25 +363,17 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0 && nr == 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid context at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
-					if (nr == 0) {
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid priority format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
 						step.context = tmp;
-					} else if (nr == 1) {
+					else
 						step.priority = tmp;
-					} else {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid priority format at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
 
 					nr++;
 				}
@@ -381,15 +384,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp >= 0 ||
-					    ((int)nr_steps + tmp) < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid sync target at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp >= 0 ||
+						  ((int)nr_steps + tmp) < 0,
+						  "Invalid sync target at step %u!\n",
+						  nr_steps);
 					step.type = SYNC;
 					step.target = tmp;
 					goto add_step;
@@ -398,14 +396,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid throttle at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp < 0,
+						  "Invalid throttle at step %u!\n",
+						  nr_steps);
 					step.type = THROTTLE;
 					step.throttle = tmp;
 					goto add_step;
@@ -414,14 +407,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid qd throttle at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp < 0,
+						  "Invalid qd throttle at step %u!\n",
+						  nr_steps);
 					step.type = QD_THROTTLE;
 					step.throttle = tmp;
 					goto add_step;
@@ -430,14 +418,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp >= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid sw fence signal at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp >= 0,
+						  "Invalid sw fence signal at step %u!\n",
+						  nr_steps);
 					step.type = SW_FENCE_SIGNAL;
 					step.target = tmp;
 					goto add_step;
@@ -450,31 +433,20 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0 && nr == 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid context at step %u!\n",
-								nr_steps);
-						return NULL;
-					} else if (tmp < 0 && nr == 1) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid preemption period at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
-					if (nr == 0) {
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr == 1 && tmp < 0,
+						  "Invalid preemption period at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid preemption format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
 						step.context = tmp;
-					} else if (nr == 1) {
+					else
 						step.period = tmp;
-					} else {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid preemption format at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
 
 					nr++;
 				}
@@ -492,13 +464,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			}
 
 			tmp = atoi(field);
-			if (tmp < 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid ctx id at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmp < 0, "Invalid ctx id at step %u!\n",
+				  nr_steps);
 			step.context = tmp;
 
 			valid++;
@@ -519,13 +486,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			}
 
-			if (old_valid == valid) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid engine id at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(old_valid == valid,
+				  "Invalid engine id at step %u!\n", nr_steps);
 		}
 
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
@@ -535,25 +497,19 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			tmpl = strtol(field, &sep, 10);
-			if (tmpl <= 0 || tmpl == LONG_MIN || tmpl == LONG_MAX) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid duration at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
+				  tmpl == LONG_MAX,
+				  "Invalid duration at step %u!\n", nr_steps);
 			step.duration.min = tmpl;
 
 			if (sep && *sep == '-') {
 				tmpl = strtol(sep + 1, NULL, 10);
-				if (tmpl <= 0 || tmpl <= step.duration.min ||
-				    tmpl == LONG_MIN || tmpl == LONG_MAX) {
-					if (verbose)
-						fprintf(stderr,
-							"Invalid duration range at step %u!\n",
-							nr_steps);
-					return NULL;
-				}
+				check_arg(tmpl <= 0 ||
+					  tmpl <= step.duration.min ||
+					  tmpl == LONG_MIN ||
+					  tmpl == LONG_MAX,
+					  "Invalid duration range at step %u!\n",
+					  nr_steps);
 				step.duration.max = tmpl;
 			} else {
 				step.duration.max = step.duration.min;
@@ -566,13 +522,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
-			if (tmp < 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid dependency at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmp < 0,
+				  "Invalid dependency at step %u!\n", nr_steps);
 
 			valid++;
 		}
@@ -580,25 +531,16 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
 			fstart = NULL;
 
-			if (strlen(field) != 1 ||
-			    (field[0] != '0' && field[0] != '1')) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid wait boolean at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(strlen(field) != 1 ||
+				  (field[0] != '0' && field[0] != '1'),
+				  "Invalid wait boolean at step %u!\n",
+				  nr_steps);
 			step.sync = field[0] - '0';
 
 			valid++;
 		}
 
-		if (valid != 5) {
-			if (verbose)
-				fprintf(stderr, "Invalid record at step %u!\n",
-					nr_steps);
-			return NULL;
-		}
+		check_arg(valid != 5, "Invalid record at step %u!\n", nr_steps);
 
 		step.type = BATCH;
 
@@ -643,15 +585,10 @@ add_step:
 	for (i = 0; i < nr_steps; i++) {
 		for (j = 0; j < steps[i].fence_deps.nr; j++) {
 			tmp = steps[i].idx + steps[i].fence_deps.list[j];
-			if (tmp < 0 || tmp >= i ||
-			    (steps[tmp].type != BATCH &&
-			     steps[tmp].type != SW_FENCE)) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid dependency target %u!\n",
-						i);
-				return NULL;
-			}
+			check_arg(tmp < 0 || tmp >= i ||
+				  (steps[tmp].type != BATCH &&
+				   steps[tmp].type != SW_FENCE),
+				  "Invalid dependency target %u!\n", i);
 			steps[tmp].emit_fence = -1;
 		}
 	}
@@ -660,14 +597,9 @@ add_step:
 	for (i = 0; i < nr_steps; i++) {
 		if (steps[i].type == SW_FENCE_SIGNAL) {
 			tmp = steps[i].idx + steps[i].target;
-			if (tmp < 0 || tmp >= i ||
-			    steps[tmp].type != SW_FENCE) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid sw fence target %u!\n",
-						i);
-				return NULL;
-			}
+			check_arg(tmp < 0 || tmp >= i ||
+				  steps[tmp].type != SW_FENCE,
+				  "Invalid sw fence target %u!\n", i);
 		}
 	}
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [igt-dev] [PATCH i-g-t 08/25] gem_wsim: Factor out common error handling
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

There is a repeated error handling pattern which can be moved to a macro
for better readability in the command parsing loop.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 244 +++++++++++++++---------------------------
 1 file changed, 88 insertions(+), 156 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index b91dbdec2cce..fceb850d0ca0 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -289,6 +289,27 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
 	return 0;
 }
 
+static void __attribute__((format(printf, 1, 2)))
+wsim_err(const char *fmt, ...)
+{
+	va_list ap;
+
+	if (!verbose)
+		return;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+}
+
+#define check_arg(cond, fmt, ...) \
+{ \
+	if (cond) { \
+		wsim_err(fmt, __VA_ARGS__); \
+		return NULL; \
+	} \
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -319,14 +340,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid delay at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp <= 0,
+						  "Invalid delay at step %u!\n",
+						  nr_steps);
 					step.type = DELAY;
 					step.delay = tmp;
 					goto add_step;
@@ -335,14 +351,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid period at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp <= 0,
+						  "Invalid period at step %u!\n",
+						  nr_steps);
 					step.type = PERIOD;
 					step.period = tmp;
 					goto add_step;
@@ -352,25 +363,17 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0 && nr == 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid context at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
-					if (nr == 0) {
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid priority format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
 						step.context = tmp;
-					} else if (nr == 1) {
+					else
 						step.priority = tmp;
-					} else {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid priority format at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
 
 					nr++;
 				}
@@ -381,15 +384,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp >= 0 ||
-					    ((int)nr_steps + tmp) < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid sync target at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp >= 0 ||
+						  ((int)nr_steps + tmp) < 0,
+						  "Invalid sync target at step %u!\n",
+						  nr_steps);
 					step.type = SYNC;
 					step.target = tmp;
 					goto add_step;
@@ -398,14 +396,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid throttle at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp < 0,
+						  "Invalid throttle at step %u!\n",
+						  nr_steps);
 					step.type = THROTTLE;
 					step.throttle = tmp;
 					goto add_step;
@@ -414,14 +407,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp < 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid qd throttle at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp < 0,
+						  "Invalid qd throttle at step %u!\n",
+						  nr_steps);
 					step.type = QD_THROTTLE;
 					step.throttle = tmp;
 					goto add_step;
@@ -430,14 +418,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				if ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp >= 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid sw fence signal at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
+					check_arg(tmp >= 0,
+						  "Invalid sw fence signal at step %u!\n",
+						  nr_steps);
 					step.type = SW_FENCE_SIGNAL;
 					step.target = tmp;
 					goto add_step;
@@ -450,31 +433,20 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
 				    NULL) {
 					tmp = atoi(field);
-					if (tmp <= 0 && nr == 0) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid context at step %u!\n",
-								nr_steps);
-						return NULL;
-					} else if (tmp < 0 && nr == 1) {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid preemption period at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
-
-					if (nr == 0) {
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr == 1 && tmp < 0,
+						  "Invalid preemption period at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid preemption format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
 						step.context = tmp;
-					} else if (nr == 1) {
+					else
 						step.period = tmp;
-					} else {
-						if (verbose)
-							fprintf(stderr,
-								"Invalid preemption format at step %u!\n",
-								nr_steps);
-						return NULL;
-					}
 
 					nr++;
 				}
@@ -492,13 +464,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			}
 
 			tmp = atoi(field);
-			if (tmp < 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid ctx id at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmp < 0, "Invalid ctx id at step %u!\n",
+				  nr_steps);
 			step.context = tmp;
 
 			valid++;
@@ -519,13 +486,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			}
 
-			if (old_valid == valid) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid engine id at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(old_valid == valid,
+				  "Invalid engine id at step %u!\n", nr_steps);
 		}
 
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
@@ -535,25 +497,19 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			tmpl = strtol(field, &sep, 10);
-			if (tmpl <= 0 || tmpl == LONG_MIN || tmpl == LONG_MAX) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid duration at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
+				  tmpl == LONG_MAX,
+				  "Invalid duration at step %u!\n", nr_steps);
 			step.duration.min = tmpl;
 
 			if (sep && *sep == '-') {
 				tmpl = strtol(sep + 1, NULL, 10);
-				if (tmpl <= 0 || tmpl <= step.duration.min ||
-				    tmpl == LONG_MIN || tmpl == LONG_MAX) {
-					if (verbose)
-						fprintf(stderr,
-							"Invalid duration range at step %u!\n",
-							nr_steps);
-					return NULL;
-				}
+				check_arg(tmpl <= 0 ||
+					  tmpl <= step.duration.min ||
+					  tmpl == LONG_MIN ||
+					  tmpl == LONG_MAX,
+					  "Invalid duration range at step %u!\n",
+					  nr_steps);
 				step.duration.max = tmpl;
 			} else {
 				step.duration.max = step.duration.min;
@@ -566,13 +522,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
-			if (tmp < 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid dependency at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(tmp < 0,
+				  "Invalid dependency at step %u!\n", nr_steps);
 
 			valid++;
 		}
@@ -580,25 +531,16 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
 			fstart = NULL;
 
-			if (strlen(field) != 1 ||
-			    (field[0] != '0' && field[0] != '1')) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid wait boolean at step %u!\n",
-						nr_steps);
-				return NULL;
-			}
+			check_arg(strlen(field) != 1 ||
+				  (field[0] != '0' && field[0] != '1'),
+				  "Invalid wait boolean at step %u!\n",
+				  nr_steps);
 			step.sync = field[0] - '0';
 
 			valid++;
 		}
 
-		if (valid != 5) {
-			if (verbose)
-				fprintf(stderr, "Invalid record at step %u!\n",
-					nr_steps);
-			return NULL;
-		}
+		check_arg(valid != 5, "Invalid record at step %u!\n", nr_steps);
 
 		step.type = BATCH;
 
@@ -643,15 +585,10 @@ add_step:
 	for (i = 0; i < nr_steps; i++) {
 		for (j = 0; j < steps[i].fence_deps.nr; j++) {
 			tmp = steps[i].idx + steps[i].fence_deps.list[j];
-			if (tmp < 0 || tmp >= i ||
-			    (steps[tmp].type != BATCH &&
-			     steps[tmp].type != SW_FENCE)) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid dependency target %u!\n",
-						i);
-				return NULL;
-			}
+			check_arg(tmp < 0 || tmp >= i ||
+				  (steps[tmp].type != BATCH &&
+				   steps[tmp].type != SW_FENCE),
+				  "Invalid dependency target %u!\n", i);
 			steps[tmp].emit_fence = -1;
 		}
 	}
@@ -660,14 +597,9 @@ add_step:
 	for (i = 0; i < nr_steps; i++) {
 		if (steps[i].type == SW_FENCE_SIGNAL) {
 			tmp = steps[i].idx + steps[i].target;
-			if (tmp < 0 || tmp >= i ||
-			    steps[tmp].type != SW_FENCE) {
-				if (verbose)
-					fprintf(stderr,
-						"Invalid sw fence target %u!\n",
-						i);
-				return NULL;
-			}
+			check_arg(tmp < 0 || tmp >= i ||
+				  steps[tmp].type != SW_FENCE,
+				  "Invalid sw fence target %u!\n", i);
 		}
 	}
 
-- 
2.20.1
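
A note on the pattern used above, as a minimal sketch rather than the code from
the patch: the error helper prints only when verbose is set, and the macro
wraps "report and return NULL" into one line. The sketch assumes a global
verbose flag and uses a do/while (0) wrapper, which is a swapped-in variation
on the plain braces in the patch; parse_delay() is a hypothetical caller made
up for illustration.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

static int verbose = 1;	/* assumption: a global verbosity flag */

/* Print to stderr only when verbose output was requested. */
static void wsim_err(const char *fmt, ...)
{
	va_list ap;

	if (!verbose)
		return;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/*
 * Report and bail out of a pointer-returning function.
 * do/while (0) makes the expansion behave as a single statement,
 * so it stays safe inside an unbraced if/else.
 */
#define check_arg(cond, fmt, ...) \
do { \
	if (cond) { \
		wsim_err(fmt, __VA_ARGS__); \
		return NULL; \
	} \
} while (0)

/* Hypothetical caller: parse one delay field, NULL on invalid input. */
static int *parse_delay(const char *field, unsigned int step)
{
	static int delay;
	int tmp = atoi(field);

	check_arg(tmp <= 0, "Invalid delay at step %u!\n", step);

	delay = tmp;
	return &delay;
}
```

As in the patch, the macro needs at least one argument after the format
string, and it can only be used in functions that return a pointer.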

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

* [PATCH i-g-t 09/25] gem_wsim: More wsim_err
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A few more opportunities to compact the code by using the error logging
helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 54 ++++++++++++-------------------------------
 1 file changed, 15 insertions(+), 39 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index fceb850d0ca0..6c9eb1e20efc 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -2419,9 +2419,7 @@ int main(int argc, char **argv)
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
-				if (verbose)
-					fprintf(stderr,
-						"Only one master workload can be given!\n");
+				wsim_err("Only one master workload can be given!\n");
 				return 1;
 			}
 			master_workload = nr_w_args;
@@ -2434,9 +2432,7 @@ int main(int argc, char **argv)
 			break;
 		case 'a':
 			if (append_workload_arg) {
-				if (verbose)
-					fprintf(stderr,
-						"Only one append workload can be given!\n");
+				wsim_err("Only one append workload can be given!\n");
 				return 1;
 			}
 			append_workload_arg = optarg;
@@ -2497,10 +2493,8 @@ int main(int argc, char **argv)
 			}
 
 			if (!balancer) {
-				if (verbose)
-					fprintf(stderr,
-						"Unknown balancing mode '%s'!\n",
-						optarg);
+				wsim_err("Unknown balancing mode '%s'!\n",
+					 optarg);
 				return 1;
 			}
 			break;
@@ -2513,14 +2507,12 @@ int main(int argc, char **argv)
 	}
 
 	if ((flags & HEARTBEAT) && !(flags & SEQNO)) {
-		if (verbose)
-			fprintf(stderr, "Heartbeat needs a seqno based balancer!\n");
+		wsim_err("Heartbeat needs a seqno based balancer!\n");
 		return 1;
 	}
 
 	if ((flags & VCS2REMAP) && (flags & I915)) {
-		if (verbose)
-			fprintf(stderr, "VCS remapping not supported with i915 balancing!\n");
+		wsim_err("VCS remapping not supported with i915 balancing!\n");
 		return 1;
 	}
 
@@ -2537,31 +2529,24 @@ int main(int argc, char **argv)
 	}
 
 	if (!nr_w_args) {
-		if (verbose)
-			fprintf(stderr, "No workload descriptor(s)!\n");
+		wsim_err("No workload descriptor(s)!\n");
 		return 1;
 	}
 
 	if (nr_w_args > 1 && clients > 1) {
-		if (verbose)
-			fprintf(stderr,
-				"Cloned clients cannot be combined with multiple workloads!\n");
+		wsim_err("Cloned clients cannot be combined with multiple workloads!\n");
 		return 1;
 	}
 
 	if ((flags & GLOBAL_BALANCE) && !balancer) {
-		if (verbose)
-			fprintf(stderr,
-				"Balancer not specified in global balancing mode!\n");
+		wsim_err("Balancer not specified in global balancing mode!\n");
 		return 1;
 	}
 
 	if (append_workload_arg) {
 		append_workload_arg = load_workload_descriptor(append_workload_arg);
 		if (!append_workload_arg) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to load append workload descriptor!\n");
+			wsim_err("Failed to load append workload descriptor!\n");
 			return 1;
 		}
 	}
@@ -2570,9 +2555,7 @@ int main(int argc, char **argv)
 		struct w_arg arg = { NULL, append_workload_arg, 0 };
 		app_w = parse_workload(&arg, flags, NULL);
 		if (!app_w) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to parse append workload!\n");
+			wsim_err("Failed to parse append workload!\n");
 			return 1;
 		}
 	}
@@ -2584,18 +2567,13 @@ int main(int argc, char **argv)
 		w_args[i].desc = load_workload_descriptor(w_args[i].filename);
 
 		if (!w_args[i].desc) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to load workload descriptor %u!\n",
-					i);
+			wsim_err("Failed to load workload descriptor %u!\n", i);
 			return 1;
 		}
 
 		wrk[i] = parse_workload(&w_args[i], flags, app_w);
 		if (!wrk[i]) {
-			if (verbose)
-				fprintf(stderr,
-					"Failed to parse workload %u!\n", i);
+			wsim_err("Failed to parse workload %u!\n", i);
 			return 1;
 		}
 	}
@@ -2655,10 +2633,8 @@ int main(int argc, char **argv)
 		if (balancer && balancer->init) {
 			int ret = balancer->init(balancer, w[i]);
 			if (ret) {
-				if (verbose)
-					fprintf(stderr,
-						"Failed to initialize balancing! (%u=%d)\n",
-						i, ret);
+				wsim_err("Failed to initialize balancing! (%u=%d)\n",
+					 i, ret);
 				return 1;
 			}
 		}
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH i-g-t 10/25] gem_wsim: Submit fence support
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add support for submit fences, handled in much the same way as normal input
fences. E.g.:

  1.RCS.500-1000.0.0
  1.VCS1.3000.s-1.0
  1.VCS2.3000.s-2.0

Submit fences are signalled when the originating request enters the
submission backend.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
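For readers following the parser change, here is a minimal standalone sketch of
how one '/'-separated dependency token is classified after this patch: a
leading digit or minus sign is a data dependency, 'f' a sync fence, 's' a
submit fence, anything else an error. The enum and the function name
classify_dep() are assumptions made for illustration and do not exist in
gem_wsim.

```c
#include <stdlib.h>

/* Illustrative classification, not a type from gem_wsim. */
enum dep_kind { DEP_DATA, DEP_FENCE, DEP_SUBMIT_FENCE, DEP_INVALID };

/*
 * Classify one dependency token and extract its relative step offset.
 * "-1" is a data dependency, "f-1" a sync fence, "s-1" a submit fence.
 */
static enum dep_kind classify_dep(const char *token, int *offset)
{
	enum dep_kind kind;

	if (token[0] == '-' || (token[0] >= '0' && token[0] <= '9')) {
		kind = DEP_DATA;
	} else if (token[0] == 'f') {
		kind = DEP_FENCE;
		token++;	/* skip the prefix letter */
	} else if (token[0] == 's') {
		kind = DEP_SUBMIT_FENCE;
		token++;	/* skip the prefix letter */
	} else {
		return DEP_INVALID;
	}

	*offset = atoi(token);
	return kind;
}
```

Going by the hunk below, the submit_fence flag is stored once per dependency
list rather than per entry, so a step mixing 'f' and 's' tokens would appear to
take the kind of the last token parsed.
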
 benchmarks/gem_wsim.c  | 20 ++++++++++++++++----
 benchmarks/wsim/README | 17 +++++++++++++++++
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 6c9eb1e20efc..464314c05697 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 struct deps
 {
 	int nr;
+	bool submit_fence;
 	int *list;
 };
 
@@ -253,17 +254,23 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
 		   w->data_deps.list == w->fence_deps.list);
 
 	while ((token = strtok_r(tstart, "/", &tctx)) != NULL) {
+		bool submit_fence = false;
 		char *str = token;
 		struct deps *deps;
 		int dep;
 
 		tstart = NULL;
 
-		if (strlen(token) > 1 && token[0] == 'f') {
+		if (str[0] == '-' || (str[0] >= '0' && str[0] <= '9')) {
+			deps = &w->data_deps;
+		} else {
+			if (str[0] == 's')
+				submit_fence = true;
+			else if (str[0] != 'f')
+				return -1;
+
 			deps = &w->fence_deps;
 			str++;
-		} else {
-			deps = &w->data_deps;
 		}
 
 		dep = atoi(str);
@@ -281,6 +288,7 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc)
 					     sizeof(*deps->list) * deps->nr);
 			igt_assert(deps->list);
 			deps->list[deps->nr - 1] = dep;
+			deps->submit_fence = submit_fence;
 		}
 	}
 
@@ -1944,7 +1952,11 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 		igt_assert(tgt >= 0 && tgt < w->idx);
 		igt_assert(wrk->steps[tgt].emit_fence > 0);
 
-		w->eb.flags |= I915_EXEC_FENCE_IN;
+		if (w->fence_deps.submit_fence)
+			w->eb.flags |= I915_EXEC_FENCE_SUBMIT;
+		else
+			w->eb.flags |= I915_EXEC_FENCE_IN;
+
 		w->eb.rsvd2 = wrk->steps[tgt].emit_fence;
 	}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 205cd6c93afb..4786f116b4ac 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -114,6 +114,23 @@ runnable. When the second RCS batch completes the standalone fence is signaled
 which allows the two VCS batches to be executed. Finally we wait until the both
 VCS batches have completed before starting the (optional) next iteration.
 
+Submit fences
+-------------
+
+Submit fences are a type of input fence which is signalled when the originating
+batch buffer is submitted to the GPU. (In contrast to normal sync fences, which
+are signalled when the batch completes.)
+
+Submit fences use the same syntax as sync fences, with a lower-case 's' used to
+select them. E.g.:
+
+  1.RCS.500-1000.0.0
+  1.VCS1.3000.s-1.0
+  1.VCS2.3000.s-2.0
+
+Here VCS1 and VCS2 batches will only be submitted for execution once the RCS
+batch enters the GPU.
+
 Context priority
 ----------------
 
-- 
2.20.1

* [PATCH i-g-t 11/25] gem_wsim: Extract str to engine lookup
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

v2:
 * Remove redundant check. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
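Since the commit message carries no description, here is a minimal standalone
sketch of the extracted lookup: a case-insensitive scan of a name table that
returns the index, or -1 when nothing matches. The shortened enum and table
are assumptions for illustration; the real ones live in gem_wsim.c.

```c
#include <strings.h>

/* Hypothetical, shortened engine list used only for this sketch. */
enum engine_id { RCS, BCS, VCS, VCS1, VCS2, VECS, NUM_ENGINES };

static const char *ring_str_map[NUM_ENGINES] = {
	[RCS] = "RCS",
	[BCS] = "BCS",
	[VCS] = "VCS",
	[VCS1] = "VCS1",
	[VCS2] = "VCS2",
	[VECS] = "VECS",
};

/* Return the table index matching str (ignoring case), or -1. */
static int str_to_engine(const char *str)
{
	unsigned int i;

	for (i = 0; i < NUM_ENGINES; i++) {
		if (!strcasecmp(str, ring_str_map[i]))
			return i;
	}

	return -1;
}
```

The -1 return is only detectable if the caller keeps the result in a signed
variable; stored in an unsigned one, a "< 0" check is always false.
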
 benchmarks/gem_wsim.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 464314c05697..60b7d32e22d4 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -318,6 +318,18 @@ wsim_err(const char *fmt, ...)
 	} \
 }
 
+static int str_to_engine(const char *str)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
+		if (!strcasecmp(str, ring_str_map[i]))
+			return i;
+	}
+
+	return -1;
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -480,22 +492,18 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		}
 
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
-			unsigned int old_valid = valid;
-
 			fstart = NULL;
 
-			for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) {
-				if (!strcasecmp(field, ring_str_map[i])) {
-					step.engine = i;
-					if (step.engine == BCS)
-						bcs_used = true;
-					valid++;
-					break;
-				}
-			}
-
-			check_arg(old_valid == valid,
+			i = str_to_engine(field);
+			check_arg(i < 0,
 				  "Invalid engine id at step %u!\n", nr_steps);
+
+			valid++;
+
+			step.engine = i;
+
+			if (step.engine == BCS)
+				bcs_used = true;
 		}
 
 		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
-- 
2.20.1

* [PATCH i-g-t 12/25] gem_wsim: Engine map support
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Support the new i915 uAPI for configuring contexts with engine maps.

Please refer to the README file for a more detailed explanation.

v2:
 * Allow defining engine maps by class.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 211 +++++++++++++++++++++++++++++++++++------
 benchmarks/wsim/README |  25 ++++-
 2 files changed, 204 insertions(+), 32 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 60b7d32e22d4..e5b12e37490e 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -57,6 +57,7 @@
 #include "ewma.h"
 
 enum intel_engine_id {
+	DEFAULT,
 	RCS,
 	BCS,
 	VCS,
@@ -81,7 +82,8 @@ enum w_type
 	SW_FENCE,
 	SW_FENCE_SIGNAL,
 	CTX_PRIORITY,
-	PREEMPTION
+	PREEMPTION,
+	ENGINE_MAP
 };
 
 struct deps
@@ -115,6 +117,10 @@ struct w_step
 		int throttle;
 		int fence_signal;
 		int priority;
+		struct {
+			unsigned int engine_map_count;
+			enum intel_engine_id *engine_map;
+		};
 	};
 
 	/* Implementation details */
@@ -142,6 +148,8 @@ DECLARE_EWMA(uint64_t, rt, 4, 2)
 struct ctx {
 	uint32_t id;
 	int priority;
+	unsigned int engine_map_count;
+	enum intel_engine_id *engine_map;
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
@@ -200,10 +208,10 @@ struct workload
 		int fd;
 		bool first;
 		unsigned int num_engines;
-		unsigned int engine_map[5];
+		unsigned int engine_map[NUM_ENGINES];
 		uint64_t t_prev;
-		uint64_t prev[5];
-		double busy[5];
+		uint64_t prev[NUM_ENGINES];
+		double busy[NUM_ENGINES];
 	} busy_balancer;
 };
 
@@ -234,6 +242,7 @@ static int fd;
 #define REG(x) (volatile uint32_t *)((volatile char *)igt_global_mmio + x)
 
 static const char *ring_str_map[NUM_ENGINES] = {
+	[DEFAULT] = "DEFAULT",
 	[RCS] = "RCS",
 	[BCS] = "BCS",
 	[VCS] = "VCS",
@@ -330,6 +339,43 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static int parse_engine_map(struct w_step *step, const char *_str)
+{
+	char *token, *tctx = NULL, *tstart = (char *)_str;
+
+	while ((token = strtok_r(tstart, "|", &tctx))) {
+		enum intel_engine_id engine;
+		unsigned int add;
+
+		tstart = NULL;
+
+		if (!strcmp(token, "DEFAULT"))
+			return -1;
+
+		engine = str_to_engine(token);
+		if ((int)engine < 0)
+			return -1;
+
+		if (engine != VCS && engine != VCS1 && engine != VCS2)
+			return -1; /* TODO */
+
+		add = engine == VCS ? 2 : 1;
+		step->engine_map_count += add;
+		step->engine_map = realloc(step->engine_map,
+					   step->engine_map_count *
+					   sizeof(step->engine_map[0]));
+
+		if (engine != VCS) {
+			step->engine_map[step->engine_map_count - 1] = engine;
+		} else {
+			step->engine_map[step->engine_map_count - 2] = VCS1;
+			step->engine_map[step->engine_map_count - 1] = VCS2;
+		}
+	}
+
+	return 0;
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -448,6 +494,33 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			} else if (!strcmp(field, "f")) {
 				step.type = SW_FENCE;
 				goto add_step;
+			} else if (!strcmp(field, "M")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx)) !=
+				    NULL) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid engine map format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0) {
+						step.context = tmp;
+					} else {
+						tmp = parse_engine_map(&step,
+								       field);
+						check_arg(tmp < 0,
+							  "Invalid engine map list at step %u!\n",
+							  nr_steps);
+					}
+
+					nr++;
+				}
+
+				step.type = ENGINE_MAP;
+				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
@@ -774,6 +847,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 }
 
 static const unsigned int eb_engine_map[NUM_ENGINES] = {
+	[DEFAULT] = I915_EXEC_DEFAULT,
 	[RCS] = I915_EXEC_RENDER,
 	[BCS] = I915_EXEC_BLT,
 	[VCS] = I915_EXEC_BSD,
@@ -796,11 +870,36 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
 		eb->flags = eb_engine_map[engine];
 }
 
+static unsigned int
+find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
+{
+	unsigned int i;
+
+	for (i = 0; i < ctx->engine_map_count; i++) {
+		if (ctx->engine_map[i] == engine)
+			return i + 1;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
+static struct ctx *
+__get_ctx(struct workload *wrk, struct w_step *w)
+{
+	return &wrk->ctx_list[w->context * 2];
+}
+
 static void
-eb_update_flags(struct w_step *w, enum intel_engine_id engine,
-		unsigned int flags)
+eb_update_flags(struct workload *wrk, struct w_step *w,
+		enum intel_engine_id engine, unsigned int flags)
 {
-	eb_set_engine(&w->eb, engine, flags);
+	struct ctx *ctx = __get_ctx(wrk, w);
+
+	if (ctx->engine_map)
+		w->eb.flags = find_engine_in_map(ctx, engine);
+	else
+		eb_set_engine(&w->eb, engine, flags);
 
 	w->eb.flags |= I915_EXEC_HANDLE_LUT;
 	w->eb.flags |= I915_EXEC_NO_RELOC;
@@ -819,12 +918,6 @@ get_status_objects(struct workload *wrk)
 		return wrk->status_object;
 }
 
-static struct ctx *
-__get_ctx(struct workload *wrk, struct w_step *w)
-{
-	return &wrk->ctx_list[w->context * 2];
-}
-
 static uint32_t
 get_ctxid(struct workload *wrk, struct w_step *w)
 {
@@ -894,7 +987,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		engine = VCS2;
 	else if (flags & SWAPVCS && engine == VCS2)
 		engine = VCS1;
-	eb_update_flags(w, engine, flags);
+	eb_update_flags(wrk, w, engine, flags);
 #ifdef DEBUG
 	printf("%u: %u:|", w->idx, w->eb.buffer_count);
 	for (i = 0; i <= j; i++)
@@ -936,7 +1029,7 @@ static void vm_destroy(int i915, uint32_t vm_id)
 	igt_assert_eq(__vm_destroy(i915, vm_id), 0);
 }
 
-static void
+static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
 	unsigned int ctx_vcs;
@@ -999,30 +1092,53 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	/*
 	 * Identify if contexts target specific engine instances and if they
 	 * want to be balanced.
+	 *
+	 * Transfer over engine map configuration from the workload step.
 	 */
 	for (j = 0; j < wrk->nr_ctxs; j += 2) {
 		bool targets = false;
 		bool balance = false;
 
 		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
-			if (w->type != BATCH)
-				continue;
-
 			if (w->context != (j / 2))
 				continue;
 
-			if (w->engine == VCS)
-				balance = true;
-			else
-				targets = true;
+			if (w->type == BATCH) {
+				if (w->engine == VCS)
+					balance = true;
+				else
+					targets = true;
+			} else if (w->type == ENGINE_MAP) {
+				wrk->ctx_list[j].engine_map = w->engine_map;
+				wrk->ctx_list[j].engine_map_count =
+					w->engine_map_count;
+			}
 		}
 
-		if (flags & I915) {
-			wrk->ctx_list[j].targets_instance = targets;
+		wrk->ctx_list[j].targets_instance = targets;
+		if (flags & I915)
 			wrk->ctx_list[j].wants_balance = balance;
+	}
+
+	/*
+	 * Ensure VCS is not allowed with engine map contexts.
+	 */
+	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+			if (w->context != (j / 2))
+				continue;
+
+			if (w->type != BATCH)
+				continue;
+
+			if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
+				wsim_err("Batches targeting engine maps must use explicit engines!\n");
+				return -1;
+			}
 		}
 	}
 
+
 	/*
 	 * Create and configure contexts.
 	 */
@@ -1033,7 +1149,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		if (ctx->id)
 			continue;
 
-		if (flags & I915) {
+		if ((flags & I915) || ctx->engine_map) {
 			struct drm_i915_gem_context_create_ext_setparam ext = {
 				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
 				.param.param = I915_CONTEXT_PARAM_VM,
@@ -1063,7 +1179,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				break;
 			}
 
-			if (!ctx->targets_instance)
+			if (!ctx->engine_map && !ctx->targets_instance)
 				args.flags |=
 				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
 
@@ -1096,7 +1212,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		 * both want to target specific engines and be balanced by i915?
 		 */
 		if ((flags & I915) && ctx->wants_balance &&
-		    ctx->targets_instance) {
+		    ctx->targets_instance && !ctx->engine_map) {
 			struct drm_i915_gem_context_create_ext_setparam ext = {
 				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
 				.param.param = I915_CONTEXT_PARAM_VM,
@@ -1121,7 +1237,33 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			__ctx_set_prio(ctx_id, wrk->prio);
 		}
 
-		if (ctx->wants_balance) {
+		if (ctx->engine_map) {
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
+							  ctx->engine_map_count + 1);
+			struct drm_i915_gem_context_param param = {
+				.ctx_id = ctx_id,
+				.param = I915_CONTEXT_PARAM_ENGINES,
+				.size = sizeof(set_engines),
+				.value = to_user_pointer(&set_engines),
+			};
+
+			set_engines.extensions = 0;
+
+			/* Reserve slot for virtual engine. */
+			set_engines.engines[0].engine_class =
+				I915_ENGINE_CLASS_INVALID;
+			set_engines.engines[0].engine_instance =
+				I915_ENGINE_CLASS_INVALID_NONE;
+
+			for (j = 1; j <= ctx->engine_map_count; j++) {
+				set_engines.engines[j].engine_class =
+					I915_ENGINE_CLASS_VIDEO; /* FIXME */
+				set_engines.engines[j].engine_instance =
+					ctx->engine_map[j - 1] - VCS1; /* FIXME */
+			}
+
+			gem_context_set_param(fd, &param);
+		} else if (ctx->wants_balance) {
 			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
 				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
 				.num_siblings = 2,
@@ -1204,6 +1346,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		alloc_step_batch(wrk, w, _flags);
 	}
+
+	return 0;
 }
 
 static double elapsed(const struct timespec *start, const struct timespec *end)
@@ -1941,7 +2085,7 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	uint32_t seqno = new_seqno(wrk, engine);
 	unsigned int i;
 
-	eb_update_flags(w, engine, flags);
+	eb_update_flags(wrk, w, engine, flags);
 
 	if (flags & SEQNO)
 		update_bb_seqno(w, engine, seqno);
@@ -2090,7 +2234,8 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
-			} else if (w->type == PREEMPTION) {
+			} else if (w->type == PREEMPTION ||
+				   w->type == ENGINE_MAP) {
 				continue;
 			}
 
@@ -2648,7 +2793,11 @@ int main(int argc, char **argv)
 		w[i]->print_stats = verbose > 1 ||
 				    (verbose > 0 && master_workload == i);
 
-		prepare_workload(i, w[i], flags_);
+		if (prepare_workload(i, w[i], flags_)) {
+			wsim_err("Failed to prepare workload %u!\n", i);
+			return 1;
+		}
+
 
 		if (balancer && balancer->init) {
 			int ret = balancer->init(balancer, w[i]);
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 4786f116b4ac..53f814a73c73 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -3,6 +3,7 @@ Workload descriptor format
 
 ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
 f
@@ -23,10 +24,11 @@ Additional workload steps are also supported:
  'q' - Throttle to n max queue depth.
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
+ 'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
 
-Engine ids: RCS, BCS, VCS, VCS1, VCS2, VECS
+Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
 
 Example (leading spaces must not be present in the actual file):
 ----------------------------------------------------------------
@@ -161,3 +163,24 @@ The same context is then marked to have batches which can be preempted every
 
 Same as with context priority, context preemption commands are valid until
 optionally overriden by another preemption control change on the same context.
+
+Engine maps
+-----------
+
+Engine maps are a per context feature which changes the way engine selection is
+done in the driver.
+
+Example:
+
+  M.1.VCS1|VCS2
+
+This sets up context 1 with an engine map containing the VCS1 and VCS2 engines.
+Submission to this context can now only reference these two engines.
+
+Engine maps can also be defined based on class like VCS.
+
+Example:
+
+  M.1.VCS
+
+This sets up the engine map to all available VCS class engines.
-- 
2.20.1
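
For readers following the README hunk above, the engine list part of an `M.<uint>.<str>[|<str>]...` step can be sketched in isolation. This is a minimal illustration and not the gem_wsim code: `enum engine_id`, `struct engine_map`, `name_to_engine()`, `parse_map()` and `map_index()` are hypothetical names made up for this example. It assumes, as the patch does, that only VCS class engines are accepted, that the class name VCS expands to VCS1 and VCS2, and that index 0 is kept for the virtual engine slot so physical engines are numbered from 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down engine ids; the real enum lives in gem_wsim.c. */
enum engine_id { ENG_VCS, ENG_VCS1, ENG_VCS2, ENG_INVALID };

struct engine_map {
	unsigned int count;
	enum engine_id *engines;
};

static enum engine_id name_to_engine(const char *name)
{
	if (!strcmp(name, "VCS"))
		return ENG_VCS;
	if (!strcmp(name, "VCS1"))
		return ENG_VCS1;
	if (!strcmp(name, "VCS2"))
		return ENG_VCS2;
	return ENG_INVALID;
}

/*
 * Parse a '|' separated engine list such as "VCS1|VCS2" or "VCS".
 * The class name "VCS" expands to both instances. Returns 0 on success
 * and -1 on an unknown name or allocation failure. The input is copied
 * first because strtok_r() modifies the string it tokenises.
 */
static int parse_map(struct engine_map *map, const char *str)
{
	char *copy, *token, *ctx = NULL, *start;
	int ret = 0;

	assert(map && str);

	copy = strdup(str);
	if (!copy)
		return -1;
	start = copy;

	while ((token = strtok_r(start, "|", &ctx))) {
		enum engine_id engine = name_to_engine(token);
		unsigned int add = engine == ENG_VCS ? 2 : 1;
		enum engine_id *tmp;

		start = NULL;

		if (engine == ENG_INVALID) {
			ret = -1;
			break;
		}

		/* Keep the old array valid if realloc() fails. */
		tmp = realloc(map->engines, (map->count + add) * sizeof(*tmp));
		if (!tmp) {
			ret = -1;
			break;
		}
		map->engines = tmp;

		if (engine == ENG_VCS) {
			map->engines[map->count++] = ENG_VCS1;
			map->engines[map->count++] = ENG_VCS2;
		} else {
			map->engines[map->count++] = engine;
		}
	}

	free(copy);
	return ret;
}

/*
 * Return the submission index of an engine in the map. Index 0 is left
 * for the virtual engine slot, so physical engines start at 1. Returns 0
 * if the engine is not part of the map.
 */
static unsigned int map_index(const struct engine_map *map,
			      enum engine_id engine)
{
	unsigned int i;

	for (i = 0; i < map->count; i++) {
		if (map->engines[i] == engine)
			return i + 1;
	}

	return 0;
}
```

With a zero-initialised `struct engine_map`, calling `parse_map(&map, "VCS")` should leave two entries (VCS1, VCS2), and `map_index(&map, ENG_VCS2)` should then give 2. Unlike the patch, this sketch reports a missing engine by returning 0 instead of asserting, which is a choice made for the example only.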


* [PATCH i-g-t 13/25] gem_wsim: Save some lines by changing to implicit NULL checking
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can improve the parsing loop readability a bit more by avoiding some
line breaks caused by explicit NULL checks.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
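
As a standalone illustration of the pattern this patch switches to, the sketch below tokenises a step descriptor with the implicit NULL check in the loop condition. It is not taken from gem_wsim: `split_fields()` is a hypothetical helper written for this note. The only assumption is the POSIX behaviour of `strtok_r()`, which returns NULL once no tokens remain, so the assignment itself can serve as the loop condition.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split buf on '.' and store up to max field pointers in fields[].
 * Returns the number of fields stored. buf is modified in place by
 * strtok_r(), so it must be writable.
 */
static unsigned int split_fields(char *buf, char **fields, unsigned int max)
{
	char *field, *ctx = NULL, *start = buf;
	unsigned int nr = 0;

	assert(buf && fields);

	/* Same meaning as: (field = strtok_r(...)) != NULL */
	while (nr < max && (field = strtok_r(start, ".", &ctx))) {
		start = NULL; /* later calls continue from ctx */
		fields[nr++] = field;
	}

	return nr;
}
```

Given a writable buffer holding `M.1.VCS1|VCS2` and room for four pointers, this should return three fields: `M`, `1` and `VCS1|VCS2`. The extra parentheses around the assignment are kept so compilers do not warn about an assignment used as a truth value.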
 benchmarks/gem_wsim.c | 39 +++++++++++++++------------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index e5b12e37490e..baa389c3f0e7 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -391,7 +391,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 	igt_assert(desc);
 
-	while ((_token = strtok_r(tstart, ",", &tctx)) != NULL) {
+	while ((_token = strtok_r(tstart, ",", &tctx))) {
 		tstart = NULL;
 		token = strdup(_token);
 		igt_assert(token);
@@ -399,12 +399,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		valid = 0;
 		memset(&step, 0, sizeof(step));
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid delay at step %u!\n",
@@ -414,8 +413,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid period at step %u!\n",
@@ -426,8 +424,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -447,8 +444,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0 ||
 						  ((int)nr_steps + tmp) < 0,
@@ -459,8 +455,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid throttle at step %u!\n",
@@ -470,8 +465,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid qd throttle at step %u!\n",
@@ -481,8 +475,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0,
 						  "Invalid sw fence signal at step %u!\n",
@@ -496,8 +489,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "M")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -523,8 +515,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -564,7 +555,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			i = str_to_engine(field);
@@ -579,7 +570,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				bcs_used = true;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			char *sep = NULL;
 			long int tmpl;
 
@@ -607,7 +598,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
@@ -617,7 +608,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			check_arg(strlen(field) != 1 ||
-- 
2.20.1


* [igt-dev] [PATCH i-g-t 13/25] gem_wsim: Save some lines by changing to implicit NULL checking
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can improve the parsing loop readability a bit more by avoiding some
line breaks caused by explicit NULL checks.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 39 +++++++++++++++------------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index e5b12e37490e..baa389c3f0e7 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -391,7 +391,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 	igt_assert(desc);
 
-	while ((_token = strtok_r(tstart, ",", &tctx)) != NULL) {
+	while ((_token = strtok_r(tstart, ",", &tctx))) {
 		tstart = NULL;
 		token = strdup(_token);
 		igt_assert(token);
@@ -399,12 +399,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		valid = 0;
 		memset(&step, 0, sizeof(step));
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid delay at step %u!\n",
@@ -414,8 +413,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid period at step %u!\n",
@@ -426,8 +424,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -447,8 +444,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0 ||
 						  ((int)nr_steps + tmp) < 0,
@@ -459,8 +455,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid throttle at step %u!\n",
@@ -470,8 +465,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid qd throttle at step %u!\n",
@@ -481,8 +475,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0,
 						  "Invalid sw fence signal at step %u!\n",
@@ -496,8 +489,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "M")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -523,8 +515,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -564,7 +555,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			i = str_to_engine(field);
@@ -579,7 +570,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				bcs_used = true;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			char *sep = NULL;
 			long int tmpl;
 
@@ -607,7 +598,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
@@ -617,7 +608,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			check_arg(strlen(field) != 1 ||
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 14/25] gem_wsim: Compact int command parsing with a macro
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Parsing an integer workload descriptor field is a common pattern which we
can extract to a helper macro and by doing so further improve the
readability of the main parsing loop.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
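Note for readers, not part of the commit: a minimal standalone sketch of the
pattern the macro factors out, namely fetching the next '.'-separated field,
converting it with atoi() and rejecting it when a validity check fails. The
names parse_int_field, int_ok_fn, positive and non_negative are illustrative
assumptions and do not exist in gem_wsim.c. Unlike the macro, the sketch returns
an error code instead of jumping to add_step.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical validity predicate: non-zero means the value is acceptable. */
typedef int (*int_ok_fn)(int value);

int positive(int value) { return value > 0; }
int non_negative(int value) { return value >= 0; }

/*
 * Fetch the next '.'-separated field and parse it as an integer.
 * Returns 0 and stores the value in *out on success, -1 if the field
 * is missing or rejected by the predicate.
 */
int parse_int_field(char **start, char **ctx, int_ok_fn ok, int *out)
{
	char *field = strtok_r(*start, ".", ctx);
	int tmp;

	*start = NULL; /* Later strtok_r calls continue from the saved context. */
	if (!field)
		return -1;

	tmp = atoi(field);
	if (!ok(tmp))
		return -1;

	*out = tmp;
	return 0;
}
```

With a descriptor such as "d.100", the caller would first consume the "d" token
and then call parse_int_field() with the positive predicate to obtain the delay.
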
 benchmarks/gem_wsim.c | 80 ++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 55 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index baa389c3f0e7..66832f74e34a 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -376,6 +376,15 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 	return 0;
 }
 
+#define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
+	if ((field = strtok_r(fstart, ".", &fctx))) { \
+		tmp = atoi(field); \
+		check_arg(_COND_, _ERR_, nr_steps); \
+		step.type = _STEP_; \
+		step._FIELD_ = tmp; \
+		goto add_step; \
+	} \
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -403,25 +412,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp <= 0,
-						  "Invalid delay at step %u!\n",
-						  nr_steps);
-					step.type = DELAY;
-					step.delay = tmp;
-					goto add_step;
-				}
+				int_field(DELAY, delay, tmp <= 0,
+					  "Invalid delay at step %u!\n");
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp <= 0,
-						  "Invalid period at step %u!\n",
-						  nr_steps);
-					step.type = PERIOD;
-					step.period = tmp;
-					goto add_step;
-				}
+				int_field(PERIOD, period, tmp <= 0,
+					  "Invalid period at step %u!\n");
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -444,46 +439,21 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp >= 0 ||
-						  ((int)nr_steps + tmp) < 0,
-						  "Invalid sync target at step %u!\n",
-						  nr_steps);
-					step.type = SYNC;
-					step.target = tmp;
-					goto add_step;
-				}
+				int_field(SYNC, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid sync target at step %u!\n");
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp < 0,
-						  "Invalid throttle at step %u!\n",
-						  nr_steps);
-					step.type = THROTTLE;
-					step.throttle = tmp;
-					goto add_step;
-				}
+				int_field(THROTTLE, throttle,
+					  tmp < 0,
+					  "Invalid throttle at step %u!\n");
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp < 0,
-						  "Invalid qd throttle at step %u!\n",
-						  nr_steps);
-					step.type = QD_THROTTLE;
-					step.throttle = tmp;
-					goto add_step;
-				}
+				int_field(QD_THROTTLE, throttle,
+					  tmp < 0,
+					  "Invalid qd throttle at step %u!\n");
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp >= 0,
-						  "Invalid sw fence signal at step %u!\n",
-						  nr_steps);
-					step.type = SW_FENCE_SIGNAL;
-					step.target = tmp;
-					goto add_step;
-				}
+				int_field(SW_FENCE_SIGNAL, target,
+					  tmp >= 0,
+					  "Invalid sw fence signal at step %u!\n");
 			} else if (!strcmp(field, "f")) {
 				step.type = SW_FENCE;
 				goto add_step;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Add a new workload command for enabling a load balanced context map (aka
Virtual Engine). Example usage:

  B.1

This turns on load balancing for context one, assuming it has already been
configured with an engine map. Only the DEFAULT engine specifier can be used
with load balanced engine maps.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
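Note for readers, not part of the commit: a minimal sketch of the engine
specifier rules this patch enforces for batches. Contexts with a plain engine
map must name an explicit engine, while contexts with a load balanced map must
use DEFAULT. The enum and the function name check_batch_engine are illustrative
assumptions, not code from gem_wsim.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the engine ids used by gem_wsim. */
enum engine_id { DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS };

/*
 * Returns NULL when a batch may target the given engine on a context,
 * or a short reason string when it may not.
 */
const char *check_batch_engine(bool has_map, bool balanced,
			       enum engine_id engine)
{
	if (!has_map)
		return NULL; /* No engine map: these rules do not apply. */

	if (!balanced && (engine == VCS || engine == DEFAULT))
		return "engine maps need explicit engines";

	if (balanced && engine != DEFAULT)
		return "load balanced maps need the DEFAULT engine";

	return NULL;
}
```

For example, after "M.1.VCS1|VCS2" and "B.1", a batch on context one written
with the DEFAULT specifier passes, while one naming VCS1 is rejected.
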
 benchmarks/gem_wsim.c  | 73 ++++++++++++++++++++++++++++++++++++++----
 benchmarks/wsim/README | 18 +++++++++++
 2 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 66832f74e34a..f7f84d05010a 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -83,7 +83,8 @@ enum w_type
 	SW_FENCE_SIGNAL,
 	CTX_PRIORITY,
 	PREEMPTION,
-	ENGINE_MAP
+	ENGINE_MAP,
+	LOAD_BALANCE,
 };
 
 struct deps
@@ -121,6 +122,7 @@ struct w_step
 			unsigned int engine_map_count;
 			enum intel_engine_id *engine_map;
 		};
+		bool load_balance;
 	};
 
 	/* Implementation details */
@@ -507,6 +509,25 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = PREEMPTION;
 				goto add_step;
+			} else if (!strcmp(field, "B")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 0,
+						  "Invalid load balance format at step %u!\n",
+						  nr_steps);
+
+					step.context = tmp;
+					step.load_balance = true;
+
+					nr++;
+				}
+
+				step.type = LOAD_BALANCE;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -841,7 +862,7 @@ find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
 			return i + 1;
 	}
 
-	igt_assert(0);
+	igt_assert(ctx->wants_balance);
 	return 0;
 }
 
@@ -1073,12 +1094,19 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				wrk->ctx_list[j].engine_map = w->engine_map;
 				wrk->ctx_list[j].engine_map_count =
 					w->engine_map_count;
+			} else if (w->type == LOAD_BALANCE) {
+				if (!wrk->ctx_list[j].engine_map) {
+					wsim_err("Load balancing needs an engine map!\n");
+					return 1;
+				}
+				wrk->ctx_list[j].wants_balance =
+					w->load_balance;
 			}
 		}
 
 		wrk->ctx_list[j].targets_instance = targets;
 		if (flags & I915)
-			wrk->ctx_list[j].wants_balance = balance;
+			wrk->ctx_list[j].wants_balance |= balance;
 	}
 
 	/*
@@ -1092,10 +1120,19 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			if (w->type != BATCH)
 				continue;
 
-			if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
+			if (wrk->ctx_list[j].engine_map &&
+			    !wrk->ctx_list[j].wants_balance &&
+			    (w->engine == VCS || w->engine == DEFAULT)) {
 				wsim_err("Batches targetting engine maps must use explicit engines!\n");
 				return -1;
 			}
+
+			if (wrk->ctx_list[j].engine_map &&
+			    wrk->ctx_list[j].wants_balance &&
+			    w->engine != DEFAULT) {
+				wsim_err("Batches targetting load balanced maps must not use explicit engines!\n");
+				return -1;
+			}
 		}
 	}
 
@@ -1140,7 +1177,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				break;
 			}
 
-			if (!ctx->engine_map && !ctx->targets_instance)
+			if ((!ctx->engine_map && !ctx->targets_instance) ||
+			    (ctx->engine_map && ctx->wants_balance))
 				args.flags |=
 				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
 
@@ -1201,6 +1239,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		if (ctx->engine_map) {
 			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
 							  ctx->engine_map_count + 1);
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
+								 ctx->engine_map_count);
 			struct drm_i915_gem_context_param param = {
 				.ctx_id = ctx_id,
 				.param = I915_CONTEXT_PARAM_ENGINES,
@@ -1208,7 +1248,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				.value = to_user_pointer(&set_engines),
 			};
 
-			set_engines.extensions = 0;
+			if (ctx->wants_balance) {
+				set_engines.extensions =
+					to_user_pointer(&load_balance);
+
+				memset(&load_balance, 0, sizeof(load_balance));
+				load_balance.base.name =
+					I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+				load_balance.num_siblings =
+					ctx->engine_map_count;
+
+				for (j = 0; j < ctx->engine_map_count; j++) {
+					load_balance.engines[j].engine_class =
+						I915_ENGINE_CLASS_VIDEO; /* FIXME */
+					load_balance.engines[j].engine_instance =
+						ctx->engine_map[j] - VCS1; /* FIXME */
+				}
+			} else {
+				set_engines.extensions = 0;
+			}
 
 			/* Reserve slot for virtual engine. */
 			set_engines.engines[0].engine_class =
@@ -2196,7 +2254,8 @@ static void *run_workload(void *data)
 				}
 				continue;
 			} else if (w->type == PREEMPTION ||
-				   w->type == ENGINE_MAP) {
+				   w->type == ENGINE_MAP ||
+				   w->type == LOAD_BALANCE) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 53f814a73c73..7adb3b89ffcc 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -3,6 +3,7 @@ Workload descriptor format
 
 ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
@@ -24,6 +25,7 @@ Additional workload steps are also supported:
  'q' - Throttle to n max queue depth.
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
+ 'B' - Turn on context load balancing.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -184,3 +186,19 @@ Example:
 M.1.VCS
 
 This sets up the engine map to all available VCS class engines.
+
+Context load balancing
+----------------------
+
+Context load balancing (aka Virtual Engine) is an i915 feature where the driver
+picks the best (most idle) engine to submit to from a previously configured
+engine map.
+
+Example:
+
+  B.1
+
+This enables load balancing for context number one.
+
+Submissions to load balanced contexts are only allowed to use the DEFAULT engine
+specifier.
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 16/25] gem_wsim: Engine bond command
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Engine bonds are an i915 uAPI applicable to load balanced contexts with an
engine map. They allow expressing rules of engine selection between two
contexts when submissions are also tied with submit fences.

Please refer to the README for a more detailed description.

v2:
 * Use list of symbolic engine names instead of the mask. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
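Note for readers, not part of the commit: a minimal standalone sketch of turning
a '|'-separated list of symbolic engine names into a sibling bitmask, as the
bond command does. The name table and the functions engine_index and
engine_list_to_mask are illustrative assumptions; gem_wsim uses its own
str_to_engine() lookup and engine numbering. The sketch works on a copy of the
input and shifts a 64-bit constant so the result matches the uint64_t mask.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of physical engine names; bit N maps to entry N. */
static const char * const engine_names[] = {
	"RCS", "BCS", "VCS1", "VCS2", "VECS"
};

/* Returns the table index of a physical engine name, or -1 if unknown. */
int engine_index(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(engine_names) / sizeof(engine_names[0]); i++) {
		if (!strcmp(name, engine_names[i]))
			return (int)i;
	}

	return -1;
}

/*
 * Convert a list such as "VCS1|VCS2" into a bitmask with one bit per engine.
 * Returns 0 on an unknown name or allocation failure.
 */
uint64_t engine_list_to_mask(const char *list)
{
	char *copy = strdup(list), *tok, *ctx = NULL, *start;
	uint64_t mask = 0;

	if (!copy)
		return 0;

	for (start = copy; (tok = strtok_r(start, "|", &ctx)); start = NULL) {
		int idx = engine_index(tok);

		if (idx < 0) {
			mask = 0; /* Reject the whole list on any bad name. */
			break;
		}

		mask |= UINT64_C(1) << idx; /* 64-bit shift, not int. */
	}

	free(copy);
	return mask;
}
```

Class names such as VCS are deliberately absent from the table, so a list
containing them yields a zero mask, which the caller can treat as invalid.
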
 benchmarks/gem_wsim.c  | 159 +++++++++++++++++++++++++++++++++++++++--
 benchmarks/wsim/README |  50 +++++++++++++
 2 files changed, 202 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index f7f84d05010a..bd9201c2928b 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -85,6 +85,7 @@ enum w_type
 	PREEMPTION,
 	ENGINE_MAP,
 	LOAD_BALANCE,
+	BOND,
 };
 
 struct deps
@@ -100,6 +101,11 @@ struct w_arg {
 	int prio;
 };
 
+struct bond {
+	uint64_t mask;
+	enum intel_engine_id master;
+};
+
 struct w_step
 {
 	/* Workload step metadata */
@@ -123,6 +129,10 @@ struct w_step
 			enum intel_engine_id *engine_map;
 		};
 		bool load_balance;
+		struct {
+			uint64_t bond_mask;
+			enum intel_engine_id bond_master;
+		};
 	};
 
 	/* Implementation details */
@@ -152,6 +162,8 @@ struct ctx {
 	int priority;
 	unsigned int engine_map_count;
 	enum intel_engine_id *engine_map;
+	unsigned int bond_count;
+	struct bond *bonds;
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
@@ -378,6 +390,26 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 	return 0;
 }
 
+static uint64_t engine_list_mask(const char *_str)
+{
+	uint64_t mask = 0;
+
+	char *token, *tctx = NULL, *tstart = (char *)_str;
+
+	while ((token = strtok_r(tstart, "|", &tctx))) {
+		enum intel_engine_id engine = str_to_engine(token);
+
+		if ((int)engine < 0 || engine == DEFAULT || engine == VCS)
+			return 0;
+
+		mask |= 1ULL << engine;
+
+		tstart = NULL;
+	}
+
+	return mask;
+}
+
 #define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
 	if ((field = strtok_r(fstart, ".", &fctx))) { \
 		tmp = atoi(field); \
@@ -528,6 +560,39 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = LOAD_BALANCE;
 				goto add_step;
+			} else if (!strcmp(field, "b")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					check_arg(nr > 2,
+						  "Invalid bond format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0) {
+						tmp = atoi(field);
+						step.context = tmp;
+						check_arg(tmp <= 0,
+							  "Invalid context at step %u!\n",
+							  nr_steps);
+					} else if (nr == 1) {
+						step.bond_mask = engine_list_mask(field);
+						check_arg(step.bond_mask == 0,
+							"Invalid siblings list at step %u!\n",
+							nr_steps);
+					} else if (nr == 2) {
+						tmp = str_to_engine(field);
+						check_arg(tmp <= 0 ||
+							  tmp == VCS ||
+							  tmp == DEFAULT,
+							  "Invalid master engine at step %u!\n",
+							  nr_steps);
+						step.bond_master = tmp;
+					}
+
+					nr++;
+				}
+
+				step.type = BOND;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -1011,6 +1076,31 @@ static void vm_destroy(int i915, uint32_t vm_id)
 	igt_assert_eq(__vm_destroy(i915, vm_id), 0);
 }
 
+static unsigned int
+find_engine(struct i915_engine_class_instance *ci, unsigned int count,
+	    enum intel_engine_id engine)
+{
+	static struct i915_engine_class_instance map[] = {
+		[RCS] = { I915_ENGINE_CLASS_RENDER, 0 },
+		[BCS] = { I915_ENGINE_CLASS_COPY, 0 },
+		[VCS1] = { I915_ENGINE_CLASS_VIDEO, 0 },
+		[VCS2] = { I915_ENGINE_CLASS_VIDEO, 1 },
+		[VECS] = { I915_ENGINE_CLASS_VIDEO_ENHANCE, 0 },
+	};
+	unsigned int i;
+
+	igt_assert(engine < ARRAY_SIZE(map));
+	igt_assert(engine == RCS || map[engine].engine_class);
+
+	for (i = 0; i < count; i++, ci++) {
+		if (!memcmp(&map[engine], ci, sizeof(*ci)))
+			return i;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
 static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
@@ -1078,6 +1168,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 * Transfer over engine map configuration from the workload step.
 	 */
 	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		struct ctx *ctx = &wrk->ctx_list[j];
+
 		bool targets = false;
 		bool balance = false;
 
@@ -1091,16 +1183,28 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				else
 					targets = true;
 			} else if (w->type == ENGINE_MAP) {
-				wrk->ctx_list[j].engine_map = w->engine_map;
-				wrk->ctx_list[j].engine_map_count =
-					w->engine_map_count;
+				ctx->engine_map = w->engine_map;
+				ctx->engine_map_count = w->engine_map_count;
 			} else if (w->type == LOAD_BALANCE) {
-				if (!wrk->ctx_list[j].engine_map) {
+				if (!ctx->engine_map) {
 					wsim_err("Load balancing needs an engine map!\n");
 					return 1;
 				}
-				wrk->ctx_list[j].wants_balance =
-					w->load_balance;
+				ctx->wants_balance = w->load_balance;
+			} else if (w->type == BOND) {
+				if (!ctx->wants_balance) {
+					wsim_err("Engine bonds need load balancing engine map!\n");
+					return 1;
+				}
+				ctx->bond_count++;
+				ctx->bonds = realloc(ctx->bonds,
+						     ctx->bond_count *
+						     sizeof(struct bond));
+				igt_assert(ctx->bonds);
+				ctx->bonds[ctx->bond_count - 1].mask =
+					w->bond_mask;
+				ctx->bonds[ctx->bond_count - 1].master =
+					w->bond_master;
 			}
 		}
 
@@ -1281,6 +1385,46 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 					ctx->engine_map[j - 1] - VCS1; /* FIXME */
 			}
 
+			for (j = 0; j < ctx->bond_count; j++) {
+				unsigned long mask = ctx->bonds[j].mask;
+				I915_DEFINE_CONTEXT_ENGINES_BOND(bond,
+								 __builtin_popcount(mask));
+				struct i915_context_engines_bond *p = NULL, *prev;
+				unsigned int b, e;
+
+				prev = p;
+				p = alloca(sizeof(bond));
+				assert(p);
+				memset(p, 0, sizeof(bond));
+
+				if (j == 0)
+					load_balance.base.next_extension =
+						to_user_pointer(p);
+				else if (j < (ctx->bond_count - 1))
+					prev->base.next_extension =
+						to_user_pointer(p);
+
+				p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
+				p->virtual_index = 0;
+				p->master.engine_class =
+					I915_ENGINE_CLASS_VIDEO;
+				p->master.engine_instance =
+					ctx->bonds[j].master - VCS1;
+
+				for (b = 0, e = 0; mask; e++, mask >>= 1) {
+					unsigned int idx;
+
+					if (!(mask & 1))
+						continue;
+
+					idx = find_engine(&set_engines.engines[1],
+							  ctx->engine_map_count,
+							  e);
+					p->engines[b++] =
+						set_engines.engines[1 + idx];
+				}
+			}
+
 			gem_context_set_param(fd, &param);
 		} else if (ctx->wants_balance) {
 			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
@@ -2255,7 +2399,8 @@ static void *run_workload(void *data)
 				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
-				   w->type == LOAD_BALANCE) {
+				   w->type == LOAD_BALANCE ||
+				   w->type == BOND) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 7adb3b89ffcc..e5dcf929519e 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -7,6 +7,7 @@ B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
+b.<uint>.<str>[|<str>].<str>
 f
 
 For duration a range can be given from which a random value will be picked
@@ -26,6 +27,7 @@ Additional workload steps are also supported:
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
  'B' - Turn on context load balancing.
+ 'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -202,3 +204,51 @@ This enables load balancing for context number one.
 
 Submissions to load balanced contexts are only allowed to use the DEFAULT engine
 specifier.
+
+Engine bonds
+------------
+
+Engine bonds are extensions on load balanced contexts. They allow expressing
+rules of engine selection between two co-operating contexts tied with submit
+fences. In other words, the rule expression is telling the driver: "If you pick
+this engine for context one, then you have to pick that engine for context two".
+
+Syntax is:
+  b.<context>.<engine_list>.<master_engine>
+
+Engine list is a list of one or more sibling engines separated by a pipe
+character (e.g. "VCS1|VCS2").
+
+There can be multiple bonds tied to the same context.
+
+Example:
+
+  M.1.RCS|VECS
+  B.1
+  M.2.VCS1|VCS2
+  B.2
+  b.2.VCS1.RCS
+  b.2.VCS2.VECS
+
+This tells the driver that if it picked RCS for context one, it has to pick VCS1
+for context two. And if it picked VECS for context one, it has to pick VCS2 for
+context two.
+
+If we extend the above example with more workload directives:
+
+  1.DEFAULT.1000.0.0
+  2.DEFAULT.1000.s-1.0
+
+We get to a fully functional example where two batch buffers are submitted in a
+load balanced fashion, telling the driver they should run simultaneously and
+that valid engine pairs are either RCS + VCS1 (for two contexts respectively),
+or VECS + VCS2.
+
+This can also be extended using sync fences to improve chances of the first
+submission not getting on the hardware after the second one. Second block would
+then look like:
+
+  f
+  1.DEFAULT.1000.f-1.0
+  2.DEFAULT.1000.s-1.0
+  a.-3
-- 
2.20.1


* [Intel-gfx] [PATCH i-g-t 16/25] gem_wsim: Engine bond command
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Engine bonds are an i915 uAPI applicable to load balanced contexts with an
engine map. They allow expressing rules of engine selection between two
contexts when submissions are also tied with submit fences.

Please refer to the README for a more detailed description.
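
As a rough illustration of the v2 change (this is not the code from the patch),
a pipe-separated list of symbolic engine names can be folded into a bitmask
along the following lines. The enum values, the name table and the helper names
sketch_str_to_engine() and sketch_engine_list_mask() are assumptions made for
this sketch only; the real lookup is done by str_to_engine() in gem_wsim.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical engine ids for this sketch; gem_wsim.c has its own enum. */
enum sketch_engine {
	SK_DEFAULT, SK_RCS, SK_BCS, SK_VCS, SK_VCS1, SK_VCS2, SK_VECS, SK_NUM
};

static const char *sketch_names[SK_NUM] = {
	"DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS",
};

/* Return the engine id for a name, or -1 if the name is unknown. */
static int sketch_str_to_engine(const char *name)
{
	int i;

	for (i = 0; i < SK_NUM; i++) {
		if (!strcmp(name, sketch_names[i]))
			return i;
	}

	return -1;
}

/*
 * Fold a "VCS1|VCS2" style list into a mask with one bit per engine id.
 * Returns 0 if any entry is unknown, empty, DEFAULT or the generic VCS,
 * since a bond needs concrete physical engines.
 */
static uint64_t sketch_engine_list_mask(const char *str)
{
	uint64_t mask = 0;

	while (*str) {
		size_t len = strcspn(str, "|");
		char name[16];
		int engine;

		if (len == 0 || len >= sizeof(name))
			return 0;

		memcpy(name, str, len);
		name[len] = '\0';

		engine = sketch_str_to_engine(name);
		if (engine < 0 || engine == SK_DEFAULT || engine == SK_VCS)
			return 0;

		/* 64-bit constant so the shift matches the mask width. */
		mask |= UINT64_C(1) << engine;

		str += len;
		if (*str == '|')
			str++;
	}

	return mask;
}
```

With this sketch, "VCS1|VCS2" yields a mask with the two VCS instance bits set,
while "VCS" or an unknown name yields 0, which the caller can treat as an
invalid siblings list.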

v2:
 * Use list of symbolic engine names instead of the mask. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 159 +++++++++++++++++++++++++++++++++++++++--
 benchmarks/wsim/README |  50 +++++++++++++
 2 files changed, 202 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index f7f84d05010a..bd9201c2928b 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -85,6 +85,7 @@ enum w_type
 	PREEMPTION,
 	ENGINE_MAP,
 	LOAD_BALANCE,
+	BOND,
 };
 
 struct deps
@@ -100,6 +101,11 @@ struct w_arg {
 	int prio;
 };
 
+struct bond {
+	uint64_t mask;
+	enum intel_engine_id master;
+};
+
 struct w_step
 {
 	/* Workload step metadata */
@@ -123,6 +129,10 @@ struct w_step
 			enum intel_engine_id *engine_map;
 		};
 		bool load_balance;
+		struct {
+			uint64_t bond_mask;
+			enum intel_engine_id bond_master;
+		};
 	};
 
 	/* Implementation details */
@@ -152,6 +162,8 @@ struct ctx {
 	int priority;
 	unsigned int engine_map_count;
 	enum intel_engine_id *engine_map;
+	unsigned int bond_count;
+	struct bond *bonds;
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
@@ -378,6 +390,26 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 	return 0;
 }
 
+static uint64_t engine_list_mask(const char *_str)
+{
+	uint64_t mask = 0;
+
+	char *token, *tctx = NULL, *tstart = (char *)_str;
+
+	while ((token = strtok_r(tstart, "|", &tctx))) {
+		enum intel_engine_id engine = str_to_engine(token);
+
+		if ((int)engine < 0 || engine == DEFAULT || engine == VCS)
+			return 0;
+
+		mask |= 1 << engine;
+
+		tstart = NULL;
+	}
+
+	return mask;
+}
+
 #define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
 	if ((field = strtok_r(fstart, ".", &fctx))) { \
 		tmp = atoi(field); \
@@ -528,6 +560,39 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = LOAD_BALANCE;
 				goto add_step;
+			} else if (!strcmp(field, "b")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					check_arg(nr > 2,
+						  "Invalid bond format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0) {
+						tmp = atoi(field);
+						step.context = tmp;
+						check_arg(tmp <= 0,
+							  "Invalid context at step %u!\n",
+							  nr_steps);
+					} else if (nr == 1) {
+						step.bond_mask = engine_list_mask(field);
+						check_arg(step.bond_mask == 0,
+							"Invalid siblings list at step %u!\n",
+							nr_steps);
+					} else if (nr == 2) {
+						tmp = str_to_engine(field);
+						check_arg(tmp <= 0 ||
+							  tmp == VCS ||
+							  tmp == DEFAULT,
+							  "Invalid master engine at step %u!\n",
+							  nr_steps);
+						step.bond_master = tmp;
+					}
+
+					nr++;
+				}
+
+				step.type = BOND;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -1011,6 +1076,31 @@ static void vm_destroy(int i915, uint32_t vm_id)
 	igt_assert_eq(__vm_destroy(i915, vm_id), 0);
 }
 
+static unsigned int
+find_engine(struct i915_engine_class_instance *ci, unsigned int count,
+	    enum intel_engine_id engine)
+{
+	static struct i915_engine_class_instance map[] = {
+		[RCS] = { I915_ENGINE_CLASS_RENDER, 0 },
+		[BCS] = { I915_ENGINE_CLASS_COPY, 0 },
+		[VCS1] = { I915_ENGINE_CLASS_VIDEO, 0 },
+		[VCS2] = { I915_ENGINE_CLASS_VIDEO, 1 },
+		[VECS] = { I915_ENGINE_CLASS_VIDEO_ENHANCE, 0 },
+	};
+	unsigned int i;
+
+	igt_assert(engine < ARRAY_SIZE(map));
+	igt_assert(engine == RCS || map[engine].engine_class);
+
+	for (i = 0; i < count; i++, ci++) {
+		if (!memcmp(&map[engine], ci, sizeof(*ci)))
+			return i;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
 static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
@@ -1078,6 +1168,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 * Transfer over engine map configuration from the workload step.
 	 */
 	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		struct ctx *ctx = &wrk->ctx_list[j];
+
 		bool targets = false;
 		bool balance = false;
 
@@ -1091,16 +1183,28 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				else
 					targets = true;
 			} else if (w->type == ENGINE_MAP) {
-				wrk->ctx_list[j].engine_map = w->engine_map;
-				wrk->ctx_list[j].engine_map_count =
-					w->engine_map_count;
+				ctx->engine_map = w->engine_map;
+				ctx->engine_map_count = w->engine_map_count;
 			} else if (w->type == LOAD_BALANCE) {
-				if (!wrk->ctx_list[j].engine_map) {
+				if (!ctx->engine_map) {
 					wsim_err("Load balancing needs an engine map!\n");
 					return 1;
 				}
-				wrk->ctx_list[j].wants_balance =
-					w->load_balance;
+				ctx->wants_balance = w->load_balance;
+			} else if (w->type == BOND) {
+				if (!ctx->wants_balance) {
+					wsim_err("Engine bonds need load balancing engine map!\n");
+					return 1;
+				}
+				ctx->bond_count++;
+				ctx->bonds = realloc(ctx->bonds,
+						     ctx->bond_count *
+						     sizeof(struct bond));
+				igt_assert(ctx->bonds);
+				ctx->bonds[ctx->bond_count - 1].mask =
+					w->bond_mask;
+				ctx->bonds[ctx->bond_count - 1].master =
+					w->bond_master;
 			}
 		}
 
@@ -1281,6 +1385,46 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 					ctx->engine_map[j - 1] - VCS1; /* FIXME */
 			}
 
+			for (j = 0; j < ctx->bond_count; j++) {
+				unsigned long mask = ctx->bonds[j].mask;
+				I915_DEFINE_CONTEXT_ENGINES_BOND(bond,
+								 __builtin_popcount(mask));
+				struct i915_context_engines_bond *p = NULL, *prev;
+				unsigned int b, e;
+
+				prev = p;
+				p = alloca(sizeof(bond));
+				assert(p);
+				memset(p, 0, sizeof(bond));
+
+				if (j == 0)
+					load_balance.base.next_extension =
+						to_user_pointer(p);
+				else if (j < (ctx->bond_count - 1))
+					prev->base.next_extension =
+						to_user_pointer(p);
+
+				p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
+				p->virtual_index = 0;
+				p->master.engine_class =
+					I915_ENGINE_CLASS_VIDEO;
+				p->master.engine_instance =
+					ctx->bonds[j].master - VCS1;
+
+				for (b = 0, e = 0; mask; e++, mask >>= 1) {
+					unsigned int idx;
+
+					if (!(mask & 1))
+						continue;
+
+					idx = find_engine(&set_engines.engines[1],
+							  ctx->engine_map_count,
+							  e);
+					p->engines[b++] =
+						set_engines.engines[1 + idx];
+				}
+			}
+
 			gem_context_set_param(fd, &param);
 		} else if (ctx->wants_balance) {
 			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
@@ -2255,7 +2399,8 @@ static void *run_workload(void *data)
 				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
-				   w->type == LOAD_BALANCE) {
+				   w->type == LOAD_BALANCE ||
+				   w->type == BOND) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 7adb3b89ffcc..e5dcf929519e 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -7,6 +7,7 @@ B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
+b.<uint>.<str>[|<str>].<str>
 f
 
 For duration a range can be given from which a random value will be picked
@@ -26,6 +27,7 @@ Additional workload steps are also supported:
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
  'B' - Turn on context load balancing.
+ 'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -202,3 +204,51 @@ This enables load balancing for context number one.
 
 Submissions to load balanced contexts are only allowed to use the DEFAULT engine
 specifier.
+
+Engine bonds
+------------
+
+Engine bonds are extensions on load balanced contexts. They allow expressing
+rules of engine selection between two co-operating contexts tied with submit
+fences. In other words, the rule expression is telling the driver: "If you pick
+this engine for context one, then you have to pick that engine for context two".
+
+Syntax is:
+  b.<context>.<engine_list>.<master_engine>
+
+Engine list is a list of one or more sibling engines separated by a pipe
+character (e.g. "VCS1|VCS2").
+
+There can be multiple bonds tied to the same context.
+
+Example:
+
+  M.1.RCS|VECS
+  B.1
+  M.2.VCS1|VCS2
+  B.2
+  b.2.VCS1.RCS
+  b.2.VCS2.VECS
+
+This tells the driver that if it picked RCS for context one, it has to pick VCS1
+for context two. And if it picked VECS for context one, it has to pick VCS2 for
+context two.
+
+If we extend the above example with more workload directives:
+
+  1.DEFAULT.1000.0.0
+  2.DEFAULT.1000.s-1.0
+
+We get to a fully functional example where two batch buffers are submitted in a
+load balanced fashion, telling the driver they should run simultaneously and
+that valid engine pairs are either RCS + VCS1 (for two contexts respectively),
+or VECS + VCS2.
+
+This can also be extended using sync fences to improve chances of the first
+submission not getting on the hardware after the second one. Second block would
+then look like:
+
+  f
+  1.DEFAULT.1000.f-1.0
+  2.DEFAULT.1000.s-1.0
+  a.-3
-- 
2.20.1


* [PATCH i-g-t 17/25] gem_wsim: Some more example workloads
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A few additional workloads useful for experimenting with scheduling.
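
For readers new to the descriptor format: the duration field in these workloads
is either a single value or a min-max range in microseconds (for example
"2000" or "4000-6000"). A minimal sketch of how such a field could be parsed is
shown below. The struct and function names are hypothetical and the checks only
approximate what gem_wsim.c does; treat it as an illustration, not the tool's
parser.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical container for this sketch only. */
struct sketch_duration {
	unsigned int min;
	unsigned int max;
};

/*
 * Parse "N" or "N-M" (microseconds) into d.
 * Returns 0 on success and -1 if a value is not positive, is out of
 * range, or if the upper bound is not greater than the lower bound.
 */
static int sketch_parse_duration(const char *field, struct sketch_duration *d)
{
	char *sep = NULL;
	long v;

	v = strtol(field, &sep, 10);
	if (v <= 0 || v > INT_MAX)
		return -1;
	d->min = (unsigned int)v;

	if (sep && *sep == '-') {
		v = strtol(sep + 1, NULL, 10);
		if (v <= (long)d->min || v > INT_MAX)
			return -1;
		d->max = (unsigned int)v;
	} else {
		/* A single value means a fixed duration. */
		d->max = d->min;
	}

	return 0;
}
```

A fixed duration is represented as min == max, so code that later picks a
random value from the range needs no special case for it.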

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/wsim/frame-split-60fps.wsim      | 16 ++++++++++++++++
 benchmarks/wsim/high-composited-game.wsim   | 11 +++++++++++
 benchmarks/wsim/media-1080p-player.wsim     |  5 +++++
 benchmarks/wsim/medium-composited-game.wsim |  9 +++++++++
 4 files changed, 41 insertions(+)
 create mode 100644 benchmarks/wsim/frame-split-60fps.wsim
 create mode 100644 benchmarks/wsim/high-composited-game.wsim
 create mode 100644 benchmarks/wsim/media-1080p-player.wsim
 create mode 100644 benchmarks/wsim/medium-composited-game.wsim

diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
new file mode 100644
index 000000000000..20fdcf8c8b4a
--- /dev/null
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -0,0 +1,16 @@
+X.1.0
+M.1.VCS1
+B.1
+X.2.0
+M.2.VCS2
+B.2
+b.2.VCS2.VCS1
+f
+1.DEFAULT.4000-6000.f-1.0
+2.DEFAULT.4000-6000.s-1.0
+a.-3
+3.RCS.2000-4000.-3/-2.0
+3.VECS.2000.-1.0
+4.BCS.1000.-1.0
+s.-2
+p.16667
diff --git a/benchmarks/wsim/high-composited-game.wsim b/benchmarks/wsim/high-composited-game.wsim
new file mode 100644
index 000000000000..a90a2b2be95b
--- /dev/null
+++ b/benchmarks/wsim/high-composited-game.wsim
@@ -0,0 +1,11 @@
+1.RCS.500.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
diff --git a/benchmarks/wsim/media-1080p-player.wsim b/benchmarks/wsim/media-1080p-player.wsim
new file mode 100644
index 000000000000..bcbb0cfd2ad3
--- /dev/null
+++ b/benchmarks/wsim/media-1080p-player.wsim
@@ -0,0 +1,5 @@
+1.VCS.5000-10000.0.0
+2.RCS.1000-2000.-1.0
+P.3.1
+3.BCS.1000.-2.0
+p.16667
diff --git a/benchmarks/wsim/medium-composited-game.wsim b/benchmarks/wsim/medium-composited-game.wsim
new file mode 100644
index 000000000000..580883516168
--- /dev/null
+++ b/benchmarks/wsim/medium-composited-game.wsim
@@ -0,0 +1,9 @@
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
-- 
2.20.1


* [igt-dev] [PATCH i-g-t 17/25] gem_wsim: Some more example workloads
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A few additional workloads useful for experimenting with scheduling.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/wsim/frame-split-60fps.wsim      | 16 ++++++++++++++++
 benchmarks/wsim/high-composited-game.wsim   | 11 +++++++++++
 benchmarks/wsim/media-1080p-player.wsim     |  5 +++++
 benchmarks/wsim/medium-composited-game.wsim |  9 +++++++++
 4 files changed, 41 insertions(+)
 create mode 100644 benchmarks/wsim/frame-split-60fps.wsim
 create mode 100644 benchmarks/wsim/high-composited-game.wsim
 create mode 100644 benchmarks/wsim/media-1080p-player.wsim
 create mode 100644 benchmarks/wsim/medium-composited-game.wsim

diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
new file mode 100644
index 000000000000..20fdcf8c8b4a
--- /dev/null
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -0,0 +1,16 @@
+X.1.0
+M.1.VCS1
+B.1
+X.2.0
+M.2.VCS2
+B.2
+b.2.VCS2.VCS1
+f
+1.DEFAULT.4000-6000.f-1.0
+2.DEFAULT.4000-6000.s-1.0
+a.-3
+3.RCS.2000-4000.-3/-2.0
+3.VECS.2000.-1.0
+4.BCS.1000.-1.0
+s.-2
+p.16667
diff --git a/benchmarks/wsim/high-composited-game.wsim b/benchmarks/wsim/high-composited-game.wsim
new file mode 100644
index 000000000000..a90a2b2be95b
--- /dev/null
+++ b/benchmarks/wsim/high-composited-game.wsim
@@ -0,0 +1,11 @@
+1.RCS.500.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
diff --git a/benchmarks/wsim/media-1080p-player.wsim b/benchmarks/wsim/media-1080p-player.wsim
new file mode 100644
index 000000000000..bcbb0cfd2ad3
--- /dev/null
+++ b/benchmarks/wsim/media-1080p-player.wsim
@@ -0,0 +1,5 @@
+1.VCS.5000-10000.0.0
+2.RCS.1000-2000.-1.0
+P.3.1
+3.BCS.1000.-2.0
+p.16667
diff --git a/benchmarks/wsim/medium-composited-game.wsim b/benchmarks/wsim/medium-composited-game.wsim
new file mode 100644
index 000000000000..580883516168
--- /dev/null
+++ b/benchmarks/wsim/medium-composited-game.wsim
@@ -0,0 +1,9 @@
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
-- 
2.20.1


* [PATCH i-g-t 18/25] gem_wsim: Infinite batch support
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

For simulating frame split workloads it is useful to express a batch which
ends at the same time as the parallel submission on the respective bonded
engine. For this we add support for infinite batch durations and the batch
terminate command ('T'). Syntax looks like this:

  1.RCS.*.0.0
  T.-1

First step starts an infinite batch, and second command terminates the
infinite batch with the usual relative workload step addressing.
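
To make the relative addressing concrete, here is a minimal sketch of how a
'T' target could be resolved and validated. The trimmed-down step record and
the function name are assumptions for illustration; the checks mirror the
intent of the patch (the target must be negative, must not point before the
first step, and must refer to a batch with an infinite duration) rather than
reproduce its code.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down step record for this sketch only. */
struct sketch_step {
	bool is_batch;	/* step submits a batch buffer */
	bool infinite;	/* duration was given as '*' */
};

/*
 * Resolve the relative target of a 'T' step located at index cur.
 * Returns the index of the infinite batch to end, or -1 if the target
 * is not strictly earlier than cur or is not an infinite batch.
 */
static int sketch_terminate_target(const struct sketch_step *steps,
				   unsigned int cur, int rel)
{
	int idx;

	/* Only earlier steps can be addressed, so rel must be negative. */
	if (rel >= 0)
		return -1;

	idx = (int)cur + rel;
	if (idx < 0)
		return -1;

	if (!steps[idx].is_batch || !steps[idx].infinite)
		return -1;

	return idx;
}
```

In the updated frame-split-60fps.wsim the "T.-4" line sits four steps after
"1.DEFAULT.*.f-1.0", so it resolves to that infinite batch.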

v2: (Chris)
 * Relax the recursive batch with 4096 nops between BB_START.
 * Check for at least gen8.
 * Simplify relocation entry building.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
---
 benchmarks/gem_wsim.c                  | 124 ++++++++++++++++++-------
 benchmarks/wsim/README                 |   9 +-
 benchmarks/wsim/frame-split-60fps.wsim |   6 +-
 3 files changed, 104 insertions(+), 35 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index bd9201c2928b..a36873640b24 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -86,6 +86,7 @@ enum w_type
 	ENGINE_MAP,
 	LOAD_BALANCE,
 	BOND,
+	TERMINATE,
 };
 
 struct deps
@@ -113,6 +114,7 @@ struct w_step
 	unsigned int context;
 	unsigned int engine;
 	struct duration duration;
+	bool unbound_duration;
 	struct deps data_deps;
 	struct deps fence_deps;
 	int emit_fence;
@@ -143,7 +145,7 @@ struct w_step
 
 	struct drm_i915_gem_execbuffer2 eb;
 	struct drm_i915_gem_exec_object2 *obj;
-	struct drm_i915_gem_relocation_entry reloc[4];
+	struct drm_i915_gem_relocation_entry reloc[5];
 	unsigned long bb_sz;
 	uint32_t bb_handle;
 	uint32_t *seqno_value;
@@ -153,6 +155,7 @@ struct w_step
 	uint32_t *rt1_address;
 	uint32_t *latch_value;
 	uint32_t *latch_address;
+	uint32_t *recursive_bb_start;
 };
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
@@ -517,6 +520,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = ENGINE_MAP;
 				goto add_step;
+			} else if (!strcmp(field, "T")) {
+				int_field(TERMINATE, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid terminate target at step %u!\n");
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -632,23 +639,31 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 			fstart = NULL;
 
-			tmpl = strtol(field, &sep, 10);
-			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
-				  tmpl == LONG_MAX,
-				  "Invalid duration at step %u!\n", nr_steps);
-			step.duration.min = tmpl;
-
-			if (sep && *sep == '-') {
-				tmpl = strtol(sep + 1, NULL, 10);
-				check_arg(tmpl <= 0 ||
-					  tmpl <= step.duration.min ||
-					  tmpl == LONG_MIN ||
-					  tmpl == LONG_MAX,
-					  "Invalid duration range at step %u!\n",
+			if (field[0] == '*') {
+				check_arg(intel_gen(intel_get_drm_devid(fd)) < 8,
+					  "Infinite batch at step %u needs Gen8+!\n",
 					  nr_steps);
-				step.duration.max = tmpl;
+				step.unbound_duration = true;
 			} else {
-				step.duration.max = step.duration.min;
+				tmpl = strtol(field, &sep, 10);
+				check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
+					  tmpl == LONG_MAX,
+					  "Invalid duration at step %u!\n",
+					  nr_steps);
+				step.duration.min = tmpl;
+
+				if (sep && *sep == '-') {
+					tmpl = strtol(sep + 1, NULL, 10);
+					check_arg(tmpl <= 0 ||
+						tmpl <= step.duration.min ||
+						tmpl == LONG_MIN ||
+						tmpl == LONG_MAX,
+						"Invalid duration range at step %u!\n",
+						nr_steps);
+					step.duration.max = tmpl;
+				} else {
+					step.duration.max = step.duration.min;
+				}
 			}
 
 			valid++;
@@ -808,7 +823,7 @@ init_bb(struct w_step *w, unsigned int flags)
 	unsigned int i;
 	uint32_t *ptr;
 
-	if (!arb_period)
+	if (w->unbound_duration || !arb_period)
 		return;
 
 	gem_set_domain(fd, w->bb_handle,
@@ -822,12 +837,13 @@ init_bb(struct w_step *w, unsigned int flags)
 	munmap(ptr, mmap_len);
 }
 
-static void
+static unsigned int
 terminate_bb(struct w_step *w, unsigned int flags)
 {
 	const uint32_t bbe = 0xa << 23;
 	unsigned long mmap_start, mmap_len;
 	unsigned long batch_start = w->bb_sz;
+	unsigned int r = 0;
 	uint32_t *ptr, *cs;
 
 	igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
@@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	if (flags & RT)
 		batch_start -= 12 * sizeof(uint32_t);
 
+	if (w->unbound_duration)
+		batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
+
 	mmap_start = rounddown(batch_start, PAGE_SIZE);
 	mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
 
@@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
 	cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
 
+	if (w->unbound_duration) {
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
+		batch_start += 4 * sizeof(uint32_t);
+
+		*cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
+		w->recursive_bb_start = cs;
+		*cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
+		*cs++ = 0;
+		*cs++ = 0;
+	}
+
 	if (flags & SEQNO) {
-		w->reloc[0].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -860,7 +890,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	if (flags & RT) {
-		w->reloc[1].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -870,7 +900,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		w->rt0_value = cs;
 		*cs++ = 0;
 
-		w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
@@ -879,7 +909,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		*cs++ = 0;
 		*cs++ = 0;
 
-		w->reloc[3].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -891,6 +921,8 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	*cs = bbe;
+
+	return r;
 }
 
 static const unsigned int eb_engine_map[NUM_ENGINES] = {
@@ -1011,19 +1043,22 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		}
 	}
 
-	w->bb_sz = get_bb_sz(w->duration.max);
-	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
+	if (w->unbound_duration)
+		/* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
+		w->bb_sz = max(PAGE_SIZE, get_bb_sz(w->preempt_us)) +
+			   (1 + 3) * sizeof(uint32_t);
+	else
+		w->bb_sz = get_bb_sz(w->duration.max);
+	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
 	init_bb(w, flags);
-	terminate_bb(w, flags);
+	w->obj[j].relocation_count = terminate_bb(w, flags);
 
-	if (flags & SEQNO) {
+	if (w->obj[j].relocation_count) {
 		w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
-		if (flags & RT)
-			w->obj[j].relocation_count = 4;
-		else
-			w->obj[j].relocation_count = 1;
 		for (i = 0; i < w->obj[j].relocation_count; i++)
 			w->reloc[i].target_handle = 1;
+		if (w->unbound_duration)
+			w->reloc[0].target_handle = j;
 	}
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
@@ -2120,6 +2155,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
 	}
 }
 
+static void
+update_bb_start(struct w_step *w)
+{
+	if (!w->unbound_duration)
+		return;
+
+	gem_set_domain(fd, w->bb_handle,
+		       I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
+
+	*w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
+}
+
 static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
 {
 	if (target < 0)
@@ -2255,9 +2302,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	if (flags & RT)
 		update_bb_rt(w, engine, seqno);
 
+	update_bb_start(w);
+
 	w->eb.batch_start_offset =
+		w->unbound_duration ?
+		0 :
 		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
-			2 * sizeof(uint32_t));
+		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
 		int tgt = w->idx + w->fence_deps.list[i];
@@ -2397,6 +2448,17 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
+			} else if (w->type == TERMINATE) {
+				unsigned int t_idx = i + w->target;
+
+				igt_assert(t_idx >= 0 && t_idx < i);
+				igt_assert(wrk->steps[t_idx].type == BATCH);
+				igt_assert(wrk->steps[t_idx].unbound_duration);
+
+				*wrk->steps[t_idx].recursive_bb_start =
+					MI_BATCH_BUFFER_END;
+				__sync_synchronize();
+				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
 				   w->type == LOAD_BALANCE ||
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index e5dcf929519e..552d8882010b 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -2,11 +2,11 @@ Workload descriptor format
 ==========================
 
 ctx.engine.duration_us.dependency.wait,...
-<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
-d|p|s|t|q|a.<int>,...
+d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
 
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
 Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
@@ -77,6 +78,10 @@ Example:
 
 I this case the last step has a data dependency on both first and second steps.
 
+Batch durations can also be specified as infinite by using the '*' in the
+duration field. Such batches must be ended by the terminate command ('T')
+otherwise they will cause a GPU hang to be reported.
+
 Sync (fd) fences
 ----------------
 
diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
index 20fdcf8c8b4a..17490ddfaddd 100644
--- a/benchmarks/wsim/frame-split-60fps.wsim
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -6,10 +6,12 @@ M.2.VCS2
 B.2
 b.2.VCS2.VCS1
 f
-1.DEFAULT.4000-6000.f-1.0
+1.DEFAULT.*.f-1.0
 2.DEFAULT.4000-6000.s-1.0
 a.-3
-3.RCS.2000-4000.-3/-2.0
+s.-2
+T.-4
+3.RCS.2000-4000.-5/-4.0
 3.VECS.2000.-1.0
 4.BCS.1000.-1.0
 s.-2
-- 
2.20.1


* [igt-dev] [PATCH i-g-t 18/25] gem_wsim: Infinite batch support
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

For simulating frame split workloads it is useful to express a batch which
ends at the same time as the parallel submission on the respective bonded
engine. For this we add support for infinite batch durations and the batch
terminate command ('T'). Syntax looks like this:

  1.RCS.*.0.0
  T.-1

First step starts an infinite batch, and second command terminates the
infinite batch with the usual relative workload step addressing.

v2: (Chris)
 * Relax the recursive batch with 4096 nops between BB_START.
 * Check for at least gen8.
 * Simplify relocation entry building.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
---
 benchmarks/gem_wsim.c                  | 124 ++++++++++++++++++-------
 benchmarks/wsim/README                 |   9 +-
 benchmarks/wsim/frame-split-60fps.wsim |   6 +-
 3 files changed, 104 insertions(+), 35 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index bd9201c2928b..a36873640b24 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -86,6 +86,7 @@ enum w_type
 	ENGINE_MAP,
 	LOAD_BALANCE,
 	BOND,
+	TERMINATE,
 };
 
 struct deps
@@ -113,6 +114,7 @@ struct w_step
 	unsigned int context;
 	unsigned int engine;
 	struct duration duration;
+	bool unbound_duration;
 	struct deps data_deps;
 	struct deps fence_deps;
 	int emit_fence;
@@ -143,7 +145,7 @@ struct w_step
 
 	struct drm_i915_gem_execbuffer2 eb;
 	struct drm_i915_gem_exec_object2 *obj;
-	struct drm_i915_gem_relocation_entry reloc[4];
+	struct drm_i915_gem_relocation_entry reloc[5];
 	unsigned long bb_sz;
 	uint32_t bb_handle;
 	uint32_t *seqno_value;
@@ -153,6 +155,7 @@ struct w_step
 	uint32_t *rt1_address;
 	uint32_t *latch_value;
 	uint32_t *latch_address;
+	uint32_t *recursive_bb_start;
 };
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
@@ -517,6 +520,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = ENGINE_MAP;
 				goto add_step;
+			} else if (!strcmp(field, "T")) {
+				int_field(TERMINATE, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid terminate target at step %u!\n");
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -632,23 +639,31 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 			fstart = NULL;
 
-			tmpl = strtol(field, &sep, 10);
-			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
-				  tmpl == LONG_MAX,
-				  "Invalid duration at step %u!\n", nr_steps);
-			step.duration.min = tmpl;
-
-			if (sep && *sep == '-') {
-				tmpl = strtol(sep + 1, NULL, 10);
-				check_arg(tmpl <= 0 ||
-					  tmpl <= step.duration.min ||
-					  tmpl == LONG_MIN ||
-					  tmpl == LONG_MAX,
-					  "Invalid duration range at step %u!\n",
+			if (field[0] == '*') {
+				check_arg(intel_gen(intel_get_drm_devid(fd)) < 8,
+					  "Infinite batch at step %u needs Gen8+!\n",
 					  nr_steps);
-				step.duration.max = tmpl;
+				step.unbound_duration = true;
 			} else {
-				step.duration.max = step.duration.min;
+				tmpl = strtol(field, &sep, 10);
+				check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
+					  tmpl == LONG_MAX,
+					  "Invalid duration at step %u!\n",
+					  nr_steps);
+				step.duration.min = tmpl;
+
+				if (sep && *sep == '-') {
+					tmpl = strtol(sep + 1, NULL, 10);
+					check_arg(tmpl <= 0 ||
+						tmpl <= step.duration.min ||
+						tmpl == LONG_MIN ||
+						tmpl == LONG_MAX,
+						"Invalid duration range at step %u!\n",
+						nr_steps);
+					step.duration.max = tmpl;
+				} else {
+					step.duration.max = step.duration.min;
+				}
 			}
 
 			valid++;
@@ -808,7 +823,7 @@ init_bb(struct w_step *w, unsigned int flags)
 	unsigned int i;
 	uint32_t *ptr;
 
-	if (!arb_period)
+	if (w->unbound_duration || !arb_period)
 		return;
 
 	gem_set_domain(fd, w->bb_handle,
@@ -822,12 +837,13 @@ init_bb(struct w_step *w, unsigned int flags)
 	munmap(ptr, mmap_len);
 }
 
-static void
+static unsigned int
 terminate_bb(struct w_step *w, unsigned int flags)
 {
 	const uint32_t bbe = 0xa << 23;
 	unsigned long mmap_start, mmap_len;
 	unsigned long batch_start = w->bb_sz;
+	unsigned int r = 0;
 	uint32_t *ptr, *cs;
 
 	igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
@@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	if (flags & RT)
 		batch_start -= 12 * sizeof(uint32_t);
 
+	if (w->unbound_duration)
+		batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
+
 	mmap_start = rounddown(batch_start, PAGE_SIZE);
 	mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
 
@@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
 	cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
 
+	if (w->unbound_duration) {
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
+		batch_start += 4 * sizeof(uint32_t);
+
+		*cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
+		w->recursive_bb_start = cs;
+		*cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
+		*cs++ = 0;
+		*cs++ = 0;
+	}
+
 	if (flags & SEQNO) {
-		w->reloc[0].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -860,7 +890,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	if (flags & RT) {
-		w->reloc[1].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -870,7 +900,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		w->rt0_value = cs;
 		*cs++ = 0;
 
-		w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
@@ -879,7 +909,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		*cs++ = 0;
 		*cs++ = 0;
 
-		w->reloc[3].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -891,6 +921,8 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	*cs = bbe;
+
+	return r;
 }
 
 static const unsigned int eb_engine_map[NUM_ENGINES] = {
@@ -1011,19 +1043,22 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		}
 	}
 
-	w->bb_sz = get_bb_sz(w->duration.max);
-	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
+	if (w->unbound_duration)
+		/* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
+		w->bb_sz = max(PAGE_SIZE, get_bb_sz(w->preempt_us)) +
+			   (1 + 3) * sizeof(uint32_t);
+	else
+		w->bb_sz = get_bb_sz(w->duration.max);
+	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
 	init_bb(w, flags);
-	terminate_bb(w, flags);
+	w->obj[j].relocation_count = terminate_bb(w, flags);
 
-	if (flags & SEQNO) {
+	if (w->obj[j].relocation_count) {
 		w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
-		if (flags & RT)
-			w->obj[j].relocation_count = 4;
-		else
-			w->obj[j].relocation_count = 1;
 		for (i = 0; i < w->obj[j].relocation_count; i++)
 			w->reloc[i].target_handle = 1;
+		if (w->unbound_duration)
+			w->reloc[0].target_handle = j;
 	}
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
@@ -2120,6 +2155,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
 	}
 }
 
+static void
+update_bb_start(struct w_step *w)
+{
+	if (!w->unbound_duration)
+		return;
+
+	gem_set_domain(fd, w->bb_handle,
+		       I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
+
+	*w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
+}
+
 static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
 {
 	if (target < 0)
@@ -2255,9 +2302,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	if (flags & RT)
 		update_bb_rt(w, engine, seqno);
 
+	update_bb_start(w);
+
 	w->eb.batch_start_offset =
+		w->unbound_duration ?
+		0 :
 		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
-			2 * sizeof(uint32_t));
+		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
 		int tgt = w->idx + w->fence_deps.list[i];
@@ -2397,6 +2448,17 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
+			} else if (w->type == TERMINATE) {
+				unsigned int t_idx = i + w->target;
+
+				igt_assert(t_idx >= 0 && t_idx < i);
+				igt_assert(wrk->steps[t_idx].type == BATCH);
+				igt_assert(wrk->steps[t_idx].unbound_duration);
+
+				*wrk->steps[t_idx].recursive_bb_start =
+					MI_BATCH_BUFFER_END;
+				__sync_synchronize();
+				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
 				   w->type == LOAD_BALANCE ||
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index e5dcf929519e..552d8882010b 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -2,11 +2,11 @@ Workload descriptor format
 ==========================
 
 ctx.engine.duration_us.dependency.wait,...
-<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
-d|p|s|t|q|a.<int>,...
+d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
 
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
 Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
@@ -77,6 +78,10 @@ Example:
 
 I this case the last step has a data dependency on both first and second steps.
 
+Batch durations can also be specified as infinite by using the '*' in the
+duration field. Such batches must be ended by the terminate command ('T')
+otherwise they will cause a GPU hang to be reported.
+
 Sync (fd) fences
 ----------------
 
diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
index 20fdcf8c8b4a..17490ddfaddd 100644
--- a/benchmarks/wsim/frame-split-60fps.wsim
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -6,10 +6,12 @@ M.2.VCS2
 B.2
 b.2.VCS2.VCS1
 f
-1.DEFAULT.4000-6000.f-1.0
+1.DEFAULT.*.f-1.0
 2.DEFAULT.4000-6000.s-1.0
 a.-3
-3.RCS.2000-4000.-3/-2.0
+s.-2
+T.-4
+3.RCS.2000-4000.-5/-4.0
 3.VECS.2000.-1.0
 4.BCS.1000.-1.0
 s.-2
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 19/25] gem_wsim: Command line switch for specifying low slice count workloads
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new command line switch ('-s') is added which toggles the low slice
count mode for workloads following on the command line.

This enables easy benchmarking of the effect of running the existing media
workloads in parallel against another client. For example:

  ./gem_wsim -n ... -v -r 600 -W master.wsim -s -w media_nn480.wsim

Adding or removing the '-s' switch before the second workload enables
analyzing the cost which dynamic SSEU switching imposes on the first
(master) workload.
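
To make the toggle semantics concrete, here is a small sketch of how each
workload picks up the state of the switch at the point where it appears on
the command line. It is a simplification: the patch uses getopt(), while this
sketch walks argv by hand, and TOY_SSEU and scan_sseu_switches() are
hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_SSEU (1u << 11) /* same bit position as the SSEU flag in the patch */

/*
 * Walk the command line and record, for every -w/-W workload, whether
 * the low slice count mode was active when the workload was seen.
 * Each -s flips the mode for the workloads that follow it.
 *
 * Returns the number of workloads recorded (at most max).
 */
static unsigned int scan_sseu_switches(int argc, char *const argv[],
				       bool *sseu, unsigned int max)
{
	unsigned int flags = 0, nr = 0;
	int i;

	for (i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "-s")) {
			flags ^= TOY_SSEU;
		} else if (!strcmp(argv[i], "-w") || !strcmp(argv[i], "-W")) {
			if (i + 1 < argc && nr < max)
				sseu[nr++] = (flags & TOY_SSEU) != 0;
			i++; /* skip the workload file argument */
		}
	}

	return nr;
}
```

For the example invocation above, master.wsim is recorded with the mode off
and media_nn480.wsim with the mode on. A second '-s' later on the command
line would switch it off again for any workloads after it.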

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 44 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index a36873640b24..875838f65128 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -100,6 +100,7 @@ struct w_arg {
 	char *filename;
 	char *desc;
 	int prio;
+	bool sseu;
 };
 
 struct bond {
@@ -179,6 +180,7 @@ struct workload
 	unsigned int nr_steps;
 	struct w_step *steps;
 	int prio;
+	bool sseu;
 
 	pthread_t thread;
 	bool run;
@@ -251,6 +253,7 @@ static int fd;
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
 #define I915		(1<<10)
+#define SSEU		(1<<11)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -726,6 +729,7 @@ add_step:
 	wrk->nr_steps = nr_steps;
 	wrk->steps = steps;
 	wrk->prio = arg->prio;
+	wrk->sseu = arg->sseu;
 
 	free(desc);
 
@@ -771,6 +775,7 @@ clone_workload(struct workload *_wrk)
 	memset(wrk, 0, sizeof(*wrk));
 
 	wrk->prio = _wrk->prio;
+	wrk->sseu = _wrk->sseu;
 	wrk->nr_steps = _wrk->nr_steps;
 	wrk->steps = calloc(wrk->nr_steps, sizeof(struct w_step));
 	igt_assert(wrk->steps);
@@ -1136,6 +1141,26 @@ find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	return 0;
 }
 
+static void
+set_ctx_sseu(uint32_t ctx)
+{
+	struct drm_i915_gem_context_param_sseu sseu = { };
+	struct drm_i915_gem_context_param param = { };
+
+	sseu.class = I915_ENGINE_CLASS_RENDER;
+	sseu.instance = 0;
+
+	param.ctx_id = ctx;
+	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.value = (uintptr_t)&sseu;
+
+	gem_context_get_param(fd, &param);
+
+	sseu.slice_mask = 1;
+
+	gem_context_set_param(fd, &param);
+}
+
 static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
@@ -1494,6 +1519,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
+		if (wrk->sseu)
+			set_ctx_sseu(arg.ctx_id);
+
 		if (share_vm)
 			vm_destroy(fd, share_vm);
 	}
@@ -2668,6 +2696,8 @@ static void print_help(void)
 "  -R              Round-robin initial VCS assignment per client.\n"
 "  -H              Send heartbeat on synchronisation points with seqno based\n"
 "                  balancers. Gives better engine busyness view in some cases.\n"
+"  -s              Turn on small SSEU config for the next workload on the\n"
+"                  command line. Subsequent -s switches it off.\n"
 "  -S              Synchronize the sequence of random batch durations between\n"
 "                  clients.\n"
 "  -G              Global load balancing - a single load balancer will be shared\n"
@@ -2710,11 +2740,12 @@ static char *load_workload_descriptor(char *filename)
 }
 
 static struct w_arg *
-add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg, int prio)
+add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg,
+		 int prio, bool sseu)
 {
 	w_args = realloc(w_args, sizeof(*w_args) * nr_args);
 	igt_assert(w_args);
-	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio };
+	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio, sseu };
 
 	return w_args;
 }
@@ -2807,7 +2838,8 @@ int main(int argc, char **argv)
 
 	init_clocks();
 
-	while ((c = getopt(argc, argv, "hqv2RSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
+	while ((c = getopt(argc, argv,
+			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
@@ -2817,7 +2849,8 @@ int main(int argc, char **argv)
 			master_workload = nr_w_args;
 			/* Fall through */
 		case 'w':
-			w_args = add_workload_arg(w_args, ++nr_w_args, optarg, prio);
+			w_args = add_workload_arg(w_args, ++nr_w_args, optarg,
+						  prio, flags & SSEU);
 			break;
 		case 'p':
 			prio = atoi(optarg);
@@ -2859,6 +2892,9 @@ int main(int argc, char **argv)
 		case 'S':
 			flags |= SYNCEDCLIENTS;
 			break;
+		case 's':
+			flags ^= SSEU;
+			break;
 		case 'H':
 			flags |= HEARTBEAT;
 			break;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 20/25] gem_wsim: Per context SSEU control
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command ('S') is added which allows per context slice
(re-)configuration.

v2:
 * Only query device SSEU on first use. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 83 ++++++++++++++++++++++++++++++++++++------
 benchmarks/wsim/README | 23 +++++++++++-
 2 files changed, 94 insertions(+), 12 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 875838f65128..feb9650588a1 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 	LOAD_BALANCE,
 	BOND,
 	TERMINATE,
+	SSEU
 };
 
 struct deps
@@ -136,6 +137,7 @@ struct w_step
 			uint64_t bond_mask;
 			enum intel_engine_id bond_master;
 		};
+		int sseu;
 	};
 
 	/* Implementation details */
@@ -171,6 +173,7 @@ struct ctx {
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
+	uint64_t sseu;
 };
 
 struct workload
@@ -241,6 +244,9 @@ static unsigned int context_vcs_rr;
 
 static int verbose = 1;
 static int fd;
+static struct drm_i915_gem_context_param_sseu device_sseu = {
+	.slice_mask = -1 /* Force read on first use. */
+};
 
 #define SWAPVCS		(1<<0)
 #define SEQNO		(1<<1)
@@ -482,6 +488,27 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				int_field(SYNC, target,
 					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
 					  "Invalid sync target at step %u!\n");
+			} else if (!strcmp(field, "S")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(tmp <= 0 && nr == 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid SSEU format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
+						step.context = tmp;
+					else if (nr == 1)
+						step.sseu = tmp;
+
+					nr++;
+				}
+
+				step.type = SSEU;
+				goto add_step;
 			} else if (!strcmp(field, "t")) {
 				int_field(THROTTLE, throttle,
 					  tmp < 0,
@@ -1141,24 +1168,38 @@ find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	return 0;
 }
 
-static void
-set_ctx_sseu(uint32_t ctx)
+static struct drm_i915_gem_context_param_sseu get_device_sseu(void)
 {
-	struct drm_i915_gem_context_param_sseu sseu = { };
 	struct drm_i915_gem_context_param param = { };
 
-	sseu.class = I915_ENGINE_CLASS_RENDER;
-	sseu.instance = 0;
+	if (device_sseu.slice_mask == -1) {
+		param.param = I915_CONTEXT_PARAM_SSEU;
+		param.value = (uintptr_t)&device_sseu;
+
+		gem_context_get_param(fd, &param);
+	}
+
+	return device_sseu;
+}
+
+static uint64_t
+set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
+{
+	struct drm_i915_gem_context_param_sseu sseu = get_device_sseu();
+	struct drm_i915_gem_context_param param = { };
+
+	if (slice_mask == -1)
+		slice_mask = device_sseu.slice_mask;
+
+	sseu.slice_mask = slice_mask;
 
 	param.ctx_id = ctx;
 	param.param = I915_CONTEXT_PARAM_SSEU;
 	param.value = (uintptr_t)&sseu;
 
-	gem_context_get_param(fd, &param);
-
-	sseu.slice_mask = 1;
-
 	gem_context_set_param(fd, &param);
+
+	return slice_mask;
 }
 
 static int
@@ -1359,6 +1400,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		igt_assert(ctx_id);
 		ctx->id = ctx_id;
+		ctx->sseu = device_sseu.slice_mask;
 
 		if (flags & GLOBAL_BALANCE) {
 			ctx->static_vcs = context_vcs_rr;
@@ -1519,8 +1561,10 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
-		if (wrk->sseu)
-			set_ctx_sseu(arg.ctx_id);
+		if (wrk->sseu) {
+			/* Set to slice 0 only, one slice. */
+			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+		}
 
 		if (share_vm)
 			vm_destroy(fd, share_vm);
@@ -1557,6 +1601,16 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Scan for SSEU control steps.
+	 */
+	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+		if (w->type == SSEU) {
+			get_device_sseu();
+			break;
+		}
+	}
+
 	/*
 	 * Allocate batch buffers.
 	 */
@@ -2492,6 +2546,13 @@ static void *run_workload(void *data)
 				   w->type == LOAD_BALANCE ||
 				   w->type == BOND) {
 				continue;
+			} else if (w->type == SSEU) {
+				if (w->sseu != wrk->ctx_list[w->context].sseu) {
+					wrk->ctx_list[w->context].sseu =
+						set_ctx_sseu(wrk->ctx_list[w->context].id,
+							     w->sseu);
+				}
+				continue;
 			}
 
 			if (do_sleep || w->type == PERIOD) {
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 552d8882010b..eea111ab7704 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -5,7 +5,7 @@ ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
-P|X.<uint>.<int>
+P|S|X.<uint>.<int>
 d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'S' - Context SSEU configuration.
  'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
@@ -257,3 +258,23 @@ then look like:
   1.DEFAULT.1000.f-1.0
   2.DEFAULT.1000.s-1.0
   a.-3
+
+Context SSEU configuration
+--------------------------
+
+  S.1.1
+  1.RCS.1000.0.0
+  S.2.-1
+  2.RCS.1000.0.0
+
+Context 1 is configured to run with one enabled slice (slice mask 1) and a batch
+is submitted against it. Context 2 is configured to run with all slices (this is
+the default so the command could also be omitted) and a batch submitted against
+it.
+
+This shows the dynamic SSEU reconfiguration cost between two contexts competing
+for the render engine.
+
+Slice mask of -1 has a special meaning of "all slices". Otherwise any integer
+can be specified as the slice mask, but beware that values other than 1 and -1
+can make the workload non-portable between different GPUs.
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [igt-dev] [PATCH i-g-t 20/25] gem_wsim: Per context SSEU control
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command ('S') is added which allows per context slice
(re-)configuration.

v2:
 * Only query device SSEU on first use. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 83 ++++++++++++++++++++++++++++++++++++------
 benchmarks/wsim/README | 23 +++++++++++-
 2 files changed, 94 insertions(+), 12 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 875838f65128..feb9650588a1 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 	LOAD_BALANCE,
 	BOND,
 	TERMINATE,
+	SSEU
 };
 
 struct deps
@@ -136,6 +137,7 @@ struct w_step
 			uint64_t bond_mask;
 			enum intel_engine_id bond_master;
 		};
+		int sseu;
 	};
 
 	/* Implementation details */
@@ -171,6 +173,7 @@ struct ctx {
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
+	uint64_t sseu;
 };
 
 struct workload
@@ -241,6 +244,9 @@ static unsigned int context_vcs_rr;
 
 static int verbose = 1;
 static int fd;
+static struct drm_i915_gem_context_param_sseu device_sseu = {
+	.slice_mask = -1 /* Force read on first use. */
+};
 
 #define SWAPVCS		(1<<0)
 #define SEQNO		(1<<1)
@@ -482,6 +488,27 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				int_field(SYNC, target,
 					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
 					  "Invalid sync target at step %u!\n");
+			} else if (!strcmp(field, "S")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(tmp <= 0 && nr == 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid SSEU format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
+						step.context = tmp;
+					else if (nr == 1)
+						step.sseu = tmp;
+
+					nr++;
+				}
+
+				step.type = SSEU;
+				goto add_step;
 			} else if (!strcmp(field, "t")) {
 				int_field(THROTTLE, throttle,
 					  tmp < 0,
@@ -1141,24 +1168,38 @@ find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	return 0;
 }
 
-static void
-set_ctx_sseu(uint32_t ctx)
+static struct drm_i915_gem_context_param_sseu get_device_sseu(void)
 {
-	struct drm_i915_gem_context_param_sseu sseu = { };
 	struct drm_i915_gem_context_param param = { };
 
-	sseu.class = I915_ENGINE_CLASS_RENDER;
-	sseu.instance = 0;
+	if (device_sseu.slice_mask == -1) {
+		param.param = I915_CONTEXT_PARAM_SSEU;
+		param.value = (uintptr_t)&device_sseu;
+
+		gem_context_get_param(fd, &param);
+	}
+
+	return device_sseu;
+}
+
+static uint64_t
+set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
+{
+	struct drm_i915_gem_context_param_sseu sseu = get_device_sseu();
+	struct drm_i915_gem_context_param param = { };
+
+	if (slice_mask == -1)
+		slice_mask = device_sseu.slice_mask;
+
+	sseu.slice_mask = slice_mask;
 
 	param.ctx_id = ctx;
 	param.param = I915_CONTEXT_PARAM_SSEU;
 	param.value = (uintptr_t)&sseu;
 
-	gem_context_get_param(fd, &param);
-
-	sseu.slice_mask = 1;
-
 	gem_context_set_param(fd, &param);
+
+	return slice_mask;
 }
 
 static int
@@ -1359,6 +1400,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		igt_assert(ctx_id);
 		ctx->id = ctx_id;
+		ctx->sseu = device_sseu.slice_mask;
 
 		if (flags & GLOBAL_BALANCE) {
 			ctx->static_vcs = context_vcs_rr;
@@ -1519,8 +1561,10 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
-		if (wrk->sseu)
-			set_ctx_sseu(arg.ctx_id);
+		if (wrk->sseu) {
+			/* Set to slice 0 only, one slice. */
+			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+		}
 
 		if (share_vm)
 			vm_destroy(fd, share_vm);
@@ -1557,6 +1601,16 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Scan for SSEU control steps.
+	 */
+	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+		if (w->type == SSEU) {
+			get_device_sseu();
+			break;
+		}
+	}
+
 	/*
 	 * Allocate batch buffers.
 	 */
@@ -2492,6 +2546,13 @@ static void *run_workload(void *data)
 				   w->type == LOAD_BALANCE ||
 				   w->type == BOND) {
 				continue;
+			} else if (w->type == SSEU) {
+				if (w->sseu != wrk->ctx_list[w->context].sseu) {
+					wrk->ctx_list[w->context].sseu =
+						set_ctx_sseu(wrk->ctx_list[w->context].id,
+							     w->sseu);
+				}
+				continue;
 			}
 
 			if (do_sleep || w->type == PERIOD) {
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 552d8882010b..eea111ab7704 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -5,7 +5,7 @@ ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
-P|X.<uint>.<int>
+P|S|X.<uint>.<int>
 d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'S' - Context SSEU configuration.
  'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
@@ -257,3 +258,23 @@ then look like:
   1.DEFAULT.1000.f-1.0
   2.DEFAULT.1000.s-1.0
   a.-3
+
+Context SSEU configuration
+--------------------------
+
+  S.1.1
+  1.RCS.1000.0.0
+  S.2.-1
+  2.RCS.1000.0.0
+
+Context 1 is configured to run with one enabled slice (slice mask 1) and a batch
+is submitted against it. Context 2 is configured to run with all slices (this is
+the default so the command could also be omitted) and a batch is submitted
+against it.
+
+This shows the dynamic SSEU reconfiguration cost between two contexts competing
+for the render engine.
+
+Slice mask of -1 has a special meaning of "all slices". Otherwise any integer
+can be specified as the slice mask, but beware any value apart from 1 and -1
+can make the workload not portable between different GPUs.
-- 
2.20.1
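
As a rough illustration of the slice mask rules described in the README
text above, the sketch below resolves a workload slice mask against a
device mask, the same way set_ctx_sseu() substitutes the device slice mask
for -1. The helper names are hypothetical and not part of gem_wsim; the
device mask used in the usage note is an arbitrary example value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in gem_wsim): -1 means "all slices", so it
 * resolves to whatever the device reports. Any other value is used as is.
 */
static uint64_t resolve_slice_mask(int64_t requested, uint64_t device_mask)
{
	if (requested == -1)
		return device_mask;

	return (uint64_t)requested;
}

/*
 * Hypothetical helper: per the README, only 1 and -1 are expected to
 * behave the same way across different GPUs.
 */
static int slice_mask_is_portable(int64_t requested)
{
	return requested == 1 || requested == -1;
}
```

With an assumed device mask of 0x7, resolve_slice_mask(-1, 0x7) yields 0x7
and resolve_slice_mask(1, 0x7) yields 1.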


* [PATCH i-g-t 21/25] gem_wsim: Allow RCS virtual engine with SSEU control
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

To allow exercising the SSEU configuration in combination with Virtual
Engine, allow RCS to be specified in the engine map and use appropriate
index based addressing when applying SSEU configuration to it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 51 ++++++++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index feb9650588a1..af042d71c1d3 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -382,7 +382,8 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 		if ((int)engine < 0)
 			return -1;
 
-		if (engine != VCS && engine != VCS1 && engine != VCS2)
+		if (engine != VCS && engine != VCS1 && engine != VCS2 &&
+		    engine != RCS)
 			return -1; /* TODO */
 
 		add = engine == VCS ? 2 : 1;
@@ -1183,7 +1184,7 @@ static struct drm_i915_gem_context_param_sseu get_device_sseu(void)
 }
 
 static uint64_t
-set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
+set_ctx_sseu(struct ctx *ctx, uint64_t slice_mask)
 {
 	struct drm_i915_gem_context_param_sseu sseu = get_device_sseu();
 	struct drm_i915_gem_context_param param = { };
@@ -1191,10 +1192,17 @@ set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
 	if (slice_mask == -1)
 		slice_mask = device_sseu.slice_mask;
 
+	if (ctx->engine_map && ctx->wants_balance) {
+		sseu.flags = I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX;
+		sseu.engine.engine_class = I915_ENGINE_CLASS_INVALID;
+		sseu.engine.engine_instance = 0;
+	}
+
 	sseu.slice_mask = slice_mask;
 
-	param.ctx_id = ctx;
+	param.ctx_id = ctx->id;
 	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.size = sizeof(sseu);
 	param.value = (uintptr_t)&sseu;
 
 	gem_context_set_param(fd, &param);
@@ -1465,10 +1473,17 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 					ctx->engine_map_count;
 
 				for (j = 0; j < ctx->engine_map_count; j++) {
-					load_balance.engines[j].engine_class =
-						I915_ENGINE_CLASS_VIDEO; /* FIXME */
-					load_balance.engines[j].engine_instance =
-						ctx->engine_map[j] - VCS1; /* FIXME */
+					if (ctx->engine_map[j] == RCS) {
+						load_balance.engines[j].engine_class =
+							I915_ENGINE_CLASS_RENDER;
+						load_balance.engines[j].engine_instance =
+							0; /* FIXME */
+					} else {
+						load_balance.engines[j].engine_class =
+							I915_ENGINE_CLASS_VIDEO; /* FIXME */
+						load_balance.engines[j].engine_instance =
+							ctx->engine_map[j] - VCS1; /* FIXME */
+					}
 				}
 			} else {
 				set_engines.extensions = 0;
@@ -1481,10 +1496,16 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				I915_ENGINE_CLASS_INVALID_NONE;
 
 			for (j = 1; j <= ctx->engine_map_count; j++) {
-				set_engines.engines[j].engine_class =
-					I915_ENGINE_CLASS_VIDEO; /* FIXME */
-				set_engines.engines[j].engine_instance =
-					ctx->engine_map[j - 1] - VCS1; /* FIXME */
+				if (ctx->engine_map[j - 1] == RCS) {
+					set_engines.engines[j].engine_class =
+						I915_ENGINE_CLASS_RENDER;
+					set_engines.engines[j].engine_instance = 0; /* FIXME */
+				} else {
+					set_engines.engines[j].engine_class =
+						I915_ENGINE_CLASS_VIDEO; /* FIXME */
+					set_engines.engines[j].engine_instance =
+						ctx->engine_map[j - 1] - VCS1; /* FIXME */
+				}
 			}
 
 			for (j = 0; j < ctx->bond_count; j++) {
@@ -1563,7 +1584,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		if (wrk->sseu) {
 			/* Set to slice 0 only, one slice. */
-			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+			ctx->sseu = set_ctx_sseu(ctx, 1);
 		}
 
 		if (share_vm)
@@ -2547,9 +2568,9 @@ static void *run_workload(void *data)
 				   w->type == BOND) {
 				continue;
 			} else if (w->type == SSEU) {
-				if (w->sseu != wrk->ctx_list[w->context].sseu) {
-					wrk->ctx_list[w->context].sseu =
-						set_ctx_sseu(wrk->ctx_list[w->context].id,
+				if (w->sseu != wrk->ctx_list[w->context * 2].sseu) {
+					wrk->ctx_list[w->context * 2].sseu =
+						set_ctx_sseu(&wrk->ctx_list[w->context * 2],
 							     w->sseu);
 				}
 				continue;
-- 
2.20.1
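
The index based addressing mentioned in the commit message can be sketched
as below. This is a standalone illustration only: the struct layouts and
SKETCH_* constants are stand-ins invented for the example rather than the
real uapi definitions, and the default of addressing the render engine by
class and instance is an assumption. The one thing taken from the patch is
the rule itself: when a context has an engine map and wants balancing, the
SSEU parameter is flagged as index based and points at slot 0 of the map,
which is where the virtual engine placeholder is installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the uapi definitions; values are illustrative only. */
#define SKETCH_SSEU_FLAG_ENGINE_INDEX	(1u << 0)
#define SKETCH_ENGINE_CLASS_INVALID	0xffffu
#define SKETCH_ENGINE_CLASS_RENDER	0u

struct sketch_sseu {
	uint16_t engine_class;
	uint16_t engine_instance;	/* instance, or map index if flagged */
	uint32_t flags;
	uint64_t slice_mask;
};

struct sketch_ctx {
	bool has_engine_map;
	bool wants_balance;
};

static struct sketch_sseu
sketch_build_sseu(const struct sketch_ctx *ctx, uint64_t slice_mask)
{
	/* Assumed default: address the render engine by class:instance. */
	struct sketch_sseu sseu = {
		.engine_class = SKETCH_ENGINE_CLASS_RENDER,
		.engine_instance = 0,
		.slice_mask = slice_mask,
	};

	if (ctx->has_engine_map && ctx->wants_balance) {
		/* Address slot 0 of the map, where the virtual engine sits. */
		sseu.flags = SKETCH_SSEU_FLAG_ENGINE_INDEX;
		sseu.engine_class = SKETCH_ENGINE_CLASS_INVALID;
		sseu.engine_instance = 0;
	}

	return sseu;
}
```

A context without an engine map keeps flags at zero, so the request is
interpreted as a plain class and instance pair.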

* [PATCH i-g-t 22/25] tests/i915_query: Engine discovery tests
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Test the new engine discovery query.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 tests/i915/i915_query.c | 247 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 247 insertions(+)

diff --git a/tests/i915/i915_query.c b/tests/i915/i915_query.c
index 7d0c0e3a061c..ecbec3ae141d 100644
--- a/tests/i915/i915_query.c
+++ b/tests/i915/i915_query.c
@@ -483,6 +483,241 @@ test_query_topology_known_pci_ids(int fd, int devid)
 	free(topo_info);
 }
 
+static bool query_engine_info_supported(int fd)
+{
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_INFO,
+	};
+
+	return __i915_query_items(fd, &item, 1) == 0 && item.length > 0;
+}
+
+static void engines_invalid(int fd)
+{
+	struct drm_i915_query_engine_info *engines;
+	struct drm_i915_query_item item;
+	unsigned int len;
+
+	/* Flags is MBZ. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.flags = 1;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	/* Length is non-zero but smaller than the required size. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = 1;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	/* Query correct length. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	i915_query_items(fd, &item, 1);
+	igt_assert(item.length >= 0);
+	len = item.length;
+
+	engines = malloc(len);
+	igt_assert(engines);
+
+	/* Invalid pointer. */
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EFAULT);
+
+	/* All fields in engines query are MBZ and only filled by the kernel. */
+
+	memset(engines, 0, len);
+	engines->num_engines = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	memset(engines, 0, len);
+	engines->rsvd[0] = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	memset(engines, 0, len);
+	engines->rsvd[1] = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	memset(engines, 0, len);
+	engines->rsvd[2] = 1;
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EINVAL);
+
+	free(engines);
+
+	igt_assert(len <= 4096);
+	engines = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON,
+		       -1, 0);
+	igt_assert(engines != MAP_FAILED);
+
+	/* PROT_NONE is similar to unmapped area. */
+	memset(engines, 0, len);
+	igt_assert_eq(mprotect(engines, len, PROT_NONE), 0);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EFAULT);
+	igt_assert_eq(mprotect(engines, len, PROT_WRITE), 0);
+
+	/* Read-only so kernel cannot fill the data back. */
+	memset(engines, 0, len);
+	igt_assert_eq(mprotect(engines, len, PROT_READ), 0);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, -EFAULT);
+
+	munmap(engines, 4096);
+}
+
+static bool
+has_engine(struct drm_i915_query_engine_info *engines,
+	   unsigned class, unsigned instance)
+{
+	unsigned int i;
+
+	for (i = 0; i < engines->num_engines; i++) {
+		struct drm_i915_engine_info *engine =
+			(struct drm_i915_engine_info *)&engines->engines[i];
+
+		if (engine->engine.engine_class == class &&
+		    engine->engine.engine_instance == instance)
+			return true;
+	}
+
+	return false;
+}
+
+static void engines(int fd)
+{
+	struct drm_i915_query_engine_info *engines;
+	struct drm_i915_query_item item;
+	unsigned int len, i;
+
+	engines = malloc(4096);
+	igt_assert(engines);
+
+	/* Query required buffer length. */
+	memset(engines, 0, 4096);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert(item.length >= 0);
+	igt_assert(item.length <= 4096);
+	len = item.length;
+
+	/* Check length larger than required works and reports same length. */
+	memset(engines, 0, 4096);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = 4096;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, len);
+
+	/* Actual query. */
+	memset(engines, 0, 4096);
+	memset(&item, 0, sizeof(item));
+	item.query_id = DRM_I915_QUERY_ENGINE_INFO;
+	item.length = len;
+	item.data_ptr = to_user_pointer(engines);
+	i915_query_items(fd, &item, 1);
+	igt_assert_eq(item.length, len);
+
+	/* Every GPU has at least one engine. */
+	igt_assert(engines->num_engines > 0);
+
+	/* MBZ fields. */
+	igt_assert_eq(engines->rsvd[0], 0);
+	igt_assert_eq(engines->rsvd[1], 0);
+	igt_assert_eq(engines->rsvd[2], 0);
+
+	/* Check results match the legacy GET_PARAM (where we can). */
+	for (i = 0; i < engines->num_engines; i++) {
+		struct drm_i915_engine_info *engine =
+			(struct drm_i915_engine_info *)&engines->engines[i];
+
+		igt_debug("%u: class=%u instance=%u flags=%llx capabilities=%llx\n",
+			  i,
+			  engine->engine.engine_class,
+			  engine->engine.engine_instance,
+			  engine->flags,
+			  engine->capabilities);
+
+		/* MBZ fields. */
+		igt_assert_eq(engine->rsvd0, 0);
+		igt_assert_eq(engine->rsvd1[0], 0);
+		igt_assert_eq(engine->rsvd1[1], 0);
+
+		switch (engine->engine.engine_class) {
+		case I915_ENGINE_CLASS_RENDER:
+			/* Will be tested later. */
+			break;
+		case I915_ENGINE_CLASS_COPY:
+			igt_assert(gem_has_blt(fd));
+			break;
+		case I915_ENGINE_CLASS_VIDEO:
+			switch (engine->engine.engine_instance) {
+			case 0:
+				igt_assert(gem_has_bsd(fd));
+				break;
+			case 1:
+				igt_assert(gem_has_bsd2(fd));
+				break;
+			}
+			break;
+		case I915_ENGINE_CLASS_VIDEO_ENHANCE:
+			igt_assert(gem_has_vebox(fd));
+			break;
+		default:
+			igt_assert(0);
+		}
+	}
+
+	/* Reverse check to the above - all GET_PARAM engines are present. */
+	igt_assert(has_engine(engines, I915_ENGINE_CLASS_RENDER, 0));
+	if (gem_has_blt(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_COPY, 0));
+	if (gem_has_bsd(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_VIDEO, 0));
+	if (gem_has_bsd2(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_VIDEO, 1));
+	if (gem_has_vebox(fd))
+		igt_assert(has_engine(engines, I915_ENGINE_CLASS_VIDEO_ENHANCE,
+				       0));
+
+	free(engines);
+}
+
 igt_main
 {
 	int fd = -1;
@@ -530,6 +765,18 @@ igt_main
 		test_query_topology_known_pci_ids(fd, devid);
 	}
 
+	igt_subtest_group {
+		igt_fixture {
+			igt_require(query_engine_info_supported(fd));
+		}
+
+		igt_subtest("engine-info-invalid")
+			engines_invalid(fd);
+
+		igt_subtest("engine-info")
+			engines(fd);
+	}
+
 	igt_fixture {
 		close(fd);
 	}
-- 
2.20.1
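
The subtests above rely on a two-step pattern: query with a zero length to
learn the required size, then query again with a buffer of at least that
size. The sketch below mocks those semantics in plain C so the pattern can
be followed without a device. Everything in it (struct mock_query_item,
mock_query(), MOCK_PAYLOAD_SIZE) is hypothetical; it mirrors the return
values the subtests assert on, not the kernel implementation, and it does
not model the read-only or PROT_NONE mapping cases.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a query item; not the real uapi struct. */
struct mock_query_item {
	int32_t length;	/* in: buffer size, out: required size or -errno */
	uint32_t flags;	/* must be zero */
	void *data;	/* destination buffer */
};

/* Payload size chosen arbitrarily for the sketch. */
#define MOCK_PAYLOAD_SIZE 64

/* Mock of the semantics the subtests expect. */
static void mock_query(struct mock_query_item *item)
{
	if (item->flags) {
		item->length = -EINVAL;
	} else if (item->length == 0) {
		item->length = MOCK_PAYLOAD_SIZE;	/* size probe */
	} else if (item->length < MOCK_PAYLOAD_SIZE) {
		item->length = -EINVAL;
	} else if (!item->data) {
		item->length = -EFAULT;
	} else {
		memset(item->data, 0xa5, MOCK_PAYLOAD_SIZE);
		item->length = MOCK_PAYLOAD_SIZE;
	}
}

/* Two-step pattern: probe the size, allocate, fetch. Caller frees. */
static void *mock_query_alloc(int32_t *len_out)
{
	struct mock_query_item item = { 0 };
	void *buf;

	mock_query(&item);
	if (item.length <= 0)
		return NULL;

	buf = calloc(1, (size_t)item.length);
	if (!buf)
		return NULL;

	/* item.length still holds the probed size at this point. */
	item.data = buf;
	mock_query(&item);
	if (item.length <= 0) {
		free(buf);
		return NULL;
	}

	*len_out = item.length;
	return buf;
}
```

Note the ordering in the mock: the size check comes before the pointer
check, which matches the subtest that passes length 1 with no data pointer
and expects -EINVAL rather than -EFAULT.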

* [PATCH i-g-t 23/25] gem_wsim: Consolidate engine assignments into helpers
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This will allow applying the discovered engine configuration from a single
place.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 145 +++++++++++++++++++++++++-----------------
 1 file changed, 87 insertions(+), 58 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index af042d71c1d3..d43e7c767801 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -365,6 +365,61 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static unsigned int num_engines_in_class(enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	return 2;
+}
+
+static void
+fill_engines_class(struct i915_engine_class_instance *ci,
+		   enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	ci[0].engine_class = I915_ENGINE_CLASS_VIDEO;
+	ci[0].engine_instance = 0;
+
+	ci[1].engine_class = I915_ENGINE_CLASS_VIDEO;
+	ci[1].engine_instance = 1;
+}
+
+static void
+fill_engines_id_class(enum intel_engine_id *list,
+		      enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	list[0] = VCS1;
+	list[1] = VCS2;
+}
+
+static struct i915_engine_class_instance
+get_engine(enum intel_engine_id engine)
+{
+	struct i915_engine_class_instance ci;
+
+	switch (engine) {
+	case RCS:
+		ci.engine_class = I915_ENGINE_CLASS_RENDER;
+		ci.engine_instance = 0;
+		break;
+	case VCS1:
+		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
+		ci.engine_instance = 0;
+		break;
+	case VCS2:
+		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
+		ci.engine_instance = 1;
+		break;
+	default:
+		igt_assert(0);
+	}
+
+	return ci;
+}
+
 static int parse_engine_map(struct w_step *step, const char *_str)
 {
 	char *token, *tctx = NULL, *tstart = (char *)_str;
@@ -386,18 +441,16 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 		    engine != RCS)
 			return -1; /* TODO */
 
-		add = engine == VCS ? 2 : 1;
+		add = engine == VCS ? num_engines_in_class(VCS) : 1;
 		step->engine_map_count += add;
 		step->engine_map = realloc(step->engine_map,
 					   step->engine_map_count *
 					   sizeof(step->engine_map[0]));
 
-		if (engine != VCS) {
-			step->engine_map[step->engine_map_count - 1] = engine;
-		} else {
-			step->engine_map[step->engine_map_count - 2] = VCS1;
-			step->engine_map[step->engine_map_count - 1] = VCS2;
-		}
+		if (engine != VCS)
+			step->engine_map[step->engine_map_count - add] = engine;
+		else
+			fill_engines_id_class(&step->engine_map[step->engine_map_count - add], VCS);
 	}
 
 	return 0;
@@ -1472,19 +1525,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				load_balance.num_siblings =
 					ctx->engine_map_count;
 
-				for (j = 0; j < ctx->engine_map_count; j++) {
-					if (ctx->engine_map[j] == RCS) {
-						load_balance.engines[j].engine_class =
-							I915_ENGINE_CLASS_RENDER;
-						load_balance.engines[j].engine_instance =
-							0; /* FIXME */
-					} else {
-						load_balance.engines[j].engine_class =
-							I915_ENGINE_CLASS_VIDEO; /* FIXME */
-						load_balance.engines[j].engine_instance =
-							ctx->engine_map[j] - VCS1; /* FIXME */
-					}
-				}
+				for (j = 0; j < ctx->engine_map_count; j++)
+					load_balance.engines[j] =
+						get_engine(ctx->engine_map[j]);
 			} else {
 				set_engines.extensions = 0;
 			}
@@ -1495,18 +1538,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			set_engines.engines[0].engine_instance =
 				I915_ENGINE_CLASS_INVALID_NONE;
 
-			for (j = 1; j <= ctx->engine_map_count; j++) {
-				if (ctx->engine_map[j - 1] == RCS) {
-					set_engines.engines[j].engine_class =
-						I915_ENGINE_CLASS_RENDER;
-					set_engines.engines[j].engine_instance = 0; /* FIXME */
-				} else {
-					set_engines.engines[j].engine_class =
-						I915_ENGINE_CLASS_VIDEO; /* FIXME */
-					set_engines.engines[j].engine_instance =
-						ctx->engine_map[j - 1] - VCS1; /* FIXME */
-				}
-			}
+			for (j = 1; j <= ctx->engine_map_count; j++)
+				set_engines.engines[j] =
+					get_engine(ctx->engine_map[j - 1]);
 
 			for (j = 0; j < ctx->bond_count; j++) {
 				unsigned long mask = ctx->bonds[j].mask;
@@ -1529,10 +1563,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 				p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
 				p->virtual_index = 0;
-				p->master.engine_class =
-					I915_ENGINE_CLASS_VIDEO;
-				p->master.engine_instance =
-					ctx->bonds[j].master - VCS1;
+				p->master = get_engine(ctx->bonds[j].master);
 
 				for (b = 0, e = 0; mask; e++, mask >>= 1) {
 					unsigned int idx;
@@ -1550,28 +1581,11 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 			gem_context_set_param(fd, &param);
 		} else if (ctx->wants_balance) {
-			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
-				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
-				.num_siblings = 2,
-				.engines = {
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 0 },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 1 },
-				},
-			};
-			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
-				.extensions = to_user_pointer(&load_balance),
-				.engines = {
-					{ .engine_class = I915_ENGINE_CLASS_INVALID,
-					  .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 0 },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 1 },
-				},
-			};
-
+			const unsigned int count = num_engines_in_class(VCS);
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
+								 count);
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
+							  count + 1);
 			struct drm_i915_gem_context_param param = {
 				.ctx_id = ctx_id,
 				.param = I915_CONTEXT_PARAM_ENGINES,
@@ -1579,6 +1593,21 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				.value = to_user_pointer(&set_engines),
 			};
 
+			set_engines.extensions = to_user_pointer(&load_balance);
+
+			set_engines.engines[0].engine_class =
+				I915_ENGINE_CLASS_INVALID;
+			set_engines.engines[0].engine_instance =
+				I915_ENGINE_CLASS_INVALID_NONE;
+			fill_engines_class(&set_engines.engines[1], VCS);
+
+			memset(&load_balance, 0, sizeof(load_balance));
+			load_balance.base.name =
+				I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+			load_balance.num_siblings = count;
+
+			fill_engines_class(&load_balance.engines[0], VCS);
+
 			gem_context_set_param(fd, &param);
 		}
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [Intel-gfx] [PATCH i-g-t 23/25] gem_wsim: Consolidate engine assignments into helpers
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This will allow applying the discovered engine configuration from a single
place.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
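Illustration only, not part of the patch: a minimal sketch of the
consolidation idea, assuming stand-in types rather than the real i915 uapi
ones. Every demo_* name below is hypothetical. One helper owns the id to
class:instance translation, and every call site that fills an engine array
goes through it.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the uapi definitions; values are illustrative only. */
enum demo_engine_class { DEMO_CLASS_RENDER, DEMO_CLASS_VIDEO };
enum demo_engine_id { DEMO_RCS, DEMO_VCS1, DEMO_VCS2 };

struct demo_class_instance {
	unsigned short engine_class;
	unsigned short engine_instance;
};

/* The single place that knows how ids map to class:instance pairs. */
static struct demo_class_instance demo_get_engine(enum demo_engine_id id)
{
	struct demo_class_instance ci = { DEMO_CLASS_RENDER, 0 };

	switch (id) {
	case DEMO_RCS:
		break;
	case DEMO_VCS1:
	case DEMO_VCS2:
		ci.engine_class = DEMO_CLASS_VIDEO;
		ci.engine_instance = id - DEMO_VCS1;
		break;
	}

	return ci;
}

/* Call sites no longer open-code the mapping, they ask the helper. */
static void demo_fill_map(struct demo_class_instance *out,
			  const enum demo_engine_id *map, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		out[i] = demo_get_engine(map[i]);
}
```

With this shape, a later change to how instances are numbered only has to
touch demo_get_engine().
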
 benchmarks/gem_wsim.c | 145 +++++++++++++++++++++++++-----------------
 1 file changed, 87 insertions(+), 58 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index af042d71c1d3..d43e7c767801 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -365,6 +365,61 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static unsigned int num_engines_in_class(enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	return 2;
+}
+
+static void
+fill_engines_class(struct i915_engine_class_instance *ci,
+		   enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	ci[0].engine_class = I915_ENGINE_CLASS_VIDEO;
+	ci[0].engine_instance = 0;
+
+	ci[1].engine_class = I915_ENGINE_CLASS_VIDEO;
+	ci[1].engine_instance = 1;
+}
+
+static void
+fill_engines_id_class(enum intel_engine_id *list,
+		      enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	list[0] = VCS1;
+	list[1] = VCS2;
+}
+
+static struct i915_engine_class_instance
+get_engine(enum intel_engine_id engine)
+{
+	struct i915_engine_class_instance ci;
+
+	switch (engine) {
+	case RCS:
+		ci.engine_class = I915_ENGINE_CLASS_RENDER;
+		ci.engine_instance = 0;
+		break;
+	case VCS1:
+		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
+		ci.engine_instance = 0;
+		break;
+	case VCS2:
+		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
+		ci.engine_instance = 1;
+		break;
+	default:
+		igt_assert(0);
+	};
+
+	return ci;
+}
+
 static int parse_engine_map(struct w_step *step, const char *_str)
 {
 	char *token, *tctx = NULL, *tstart = (char *)_str;
@@ -386,18 +441,16 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 		    engine != RCS)
 			return -1; /* TODO */
 
-		add = engine == VCS ? 2 : 1;
+		add = engine == VCS ? num_engines_in_class(VCS) : 1;
 		step->engine_map_count += add;
 		step->engine_map = realloc(step->engine_map,
 					   step->engine_map_count *
 					   sizeof(step->engine_map[0]));
 
-		if (engine != VCS) {
-			step->engine_map[step->engine_map_count - 1] = engine;
-		} else {
-			step->engine_map[step->engine_map_count - 2] = VCS1;
-			step->engine_map[step->engine_map_count - 1] = VCS2;
-		}
+		if (engine != VCS)
+			step->engine_map[step->engine_map_count - add] = engine;
+		else
+			fill_engines_id_class(&step->engine_map[step->engine_map_count - add], VCS);
 	}
 
 	return 0;
@@ -1472,19 +1525,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				load_balance.num_siblings =
 					ctx->engine_map_count;
 
-				for (j = 0; j < ctx->engine_map_count; j++) {
-					if (ctx->engine_map[j] == RCS) {
-						load_balance.engines[j].engine_class =
-							I915_ENGINE_CLASS_RENDER;
-						load_balance.engines[j].engine_instance =
-							0; /* FIXME */
-					} else {
-						load_balance.engines[j].engine_class =
-							I915_ENGINE_CLASS_VIDEO; /* FIXME */
-						load_balance.engines[j].engine_instance =
-							ctx->engine_map[j] - VCS1; /* FIXME */
-					}
-				}
+				for (j = 0; j < ctx->engine_map_count; j++)
+					load_balance.engines[j] =
+						get_engine(ctx->engine_map[j]);
 			} else {
 				set_engines.extensions = 0;
 			}
@@ -1495,18 +1538,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			set_engines.engines[0].engine_instance =
 				I915_ENGINE_CLASS_INVALID_NONE;
 
-			for (j = 1; j <= ctx->engine_map_count; j++) {
-				if (ctx->engine_map[j - 1] == RCS) {
-					set_engines.engines[j].engine_class =
-						I915_ENGINE_CLASS_RENDER;
-					set_engines.engines[j].engine_instance = 0; /* FIXME */
-				} else {
-					set_engines.engines[j].engine_class =
-						I915_ENGINE_CLASS_VIDEO; /* FIXME */
-					set_engines.engines[j].engine_instance =
-						ctx->engine_map[j - 1] - VCS1; /* FIXME */
-				}
-			}
+			for (j = 1; j <= ctx->engine_map_count; j++)
+				set_engines.engines[j] =
+					get_engine(ctx->engine_map[j - 1]);
 
 			for (j = 0; j < ctx->bond_count; j++) {
 				unsigned long mask = ctx->bonds[j].mask;
@@ -1529,10 +1563,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 				p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
 				p->virtual_index = 0;
-				p->master.engine_class =
-					I915_ENGINE_CLASS_VIDEO;
-				p->master.engine_instance =
-					ctx->bonds[j].master - VCS1;
+				p->master = get_engine(ctx->bonds[j].master);
 
 				for (b = 0, e = 0; mask; e++, mask >>= 1) {
 					unsigned int idx;
@@ -1550,28 +1581,11 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 			gem_context_set_param(fd, &param);
 		} else if (ctx->wants_balance) {
-			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
-				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
-				.num_siblings = 2,
-				.engines = {
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 0 },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 1 },
-				},
-			};
-			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
-				.extensions = to_user_pointer(&load_balance),
-				.engines = {
-					{ .engine_class = I915_ENGINE_CLASS_INVALID,
-					  .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 0 },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 1 },
-				},
-			};
-
+			const unsigned int count = num_engines_in_class(VCS);
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
+								 count);
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
+							  count + 1);
 			struct drm_i915_gem_context_param param = {
 				.ctx_id = ctx_id,
 				.param = I915_CONTEXT_PARAM_ENGINES,
@@ -1579,6 +1593,21 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				.value = to_user_pointer(&set_engines),
 			};
 
+			set_engines.extensions = to_user_pointer(&load_balance);
+
+			set_engines.engines[0].engine_class =
+				I915_ENGINE_CLASS_INVALID;
+			set_engines.engines[0].engine_instance =
+				I915_ENGINE_CLASS_INVALID_NONE;
+			fill_engines_class(&set_engines.engines[1], VCS);
+
+			memset(&load_balance, 0, sizeof(load_balance));
+			load_balance.base.name =
+				I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+			load_balance.num_siblings = count;
+
+			fill_engines_class(&load_balance.engines[0], VCS);
+
 			gem_context_set_param(fd, &param);
 		}
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Instead of hardcoding the VCS balancing engines, discover them, either
with the new engines query or with the legacy get_param as a fallback, so
class based addressing always works.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
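Illustration only, not part of the patch: a minimal sketch of the
discover-with-fallback flow, assuming the query result and the legacy
probes are already available as plain data. The demo_* types and functions
are hypothetical stand-ins and make no ioctl calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_class { DEMO_RENDER, DEMO_COPY, DEMO_VIDEO, DEMO_VIDEO_ENHANCE };

struct demo_engine { unsigned short engine_class, engine_instance; };

/* Hypothetical stand-in for the legacy per-engine capability probes. */
struct demo_legacy_caps { bool blt, bsd, bsd2, vebox; };

#define DEMO_MAX_ENGINES 8

/*
 * Copy the queried list when the query is available (queried != NULL),
 * otherwise build the list from the legacy probes. Returns the number of
 * engines written to out, which must hold DEMO_MAX_ENGINES entries.
 */
static size_t demo_discover(struct demo_engine *out,
			    const struct demo_engine *queried,
			    size_t n_queried,
			    const struct demo_legacy_caps *caps)
{
	size_t i, n = 0;

	if (queried) {
		for (i = 0; i < n_queried && n < DEMO_MAX_ENGINES; i++)
			out[n++] = queried[i];
		return n;
	}

	/* Fallback: render is assumed present, the rest is probed. */
	out[n++] = (struct demo_engine){ DEMO_RENDER, 0 };
	if (caps->blt)
		out[n++] = (struct demo_engine){ DEMO_COPY, 0 };
	if (caps->bsd)
		out[n++] = (struct demo_engine){ DEMO_VIDEO, 0 };
	if (caps->bsd2)
		out[n++] = (struct demo_engine){ DEMO_VIDEO, 1 };
	if (caps->vebox)
		out[n++] = (struct demo_engine){ DEMO_VIDEO_ENHANCE, 0 };

	return n;
}

/* Count the engines of one class in a discovered list. */
static size_t demo_count_class(const struct demo_engine *e, size_t n,
			       unsigned short engine_class)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (e[i].engine_class == engine_class)
			count++;

	return count;
}
```

Either path produces the same kind of list, so class based helpers such as
the counting one do not need to know which path was taken.
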
 benchmarks/gem_wsim.c | 180 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 173 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index d43e7c767801..539de243f6e8 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -365,34 +365,198 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static bool __engines_queried;
+static unsigned int __num_engines;
+static struct i915_engine_class_instance *__engines;
+
+static int
+__i915_query(int i915, struct drm_i915_query *q)
+{
+	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
+		return -errno;
+	return 0;
+}
+
+static int
+__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	struct drm_i915_query q = {
+		.num_items = n_items,
+		.items_ptr = to_user_pointer(items),
+	};
+	return __i915_query(i915, &q);
+}
+
+static void
+i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
+}
+
+static bool has_query(int i915)
+{
+	struct drm_i915_query query = {};
+
+	return __i915_query(i915, &query) == 0;
+}
+
+static bool has_engine_query(int i915)
+{
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_INFO,
+	};
+
+	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
+}
+
+static void query_engines(void)
+{
+	struct i915_engine_class_instance *engines;
+	unsigned int num;
+
+	if (__engines_queried)
+		return;
+
+	__engines_queried = true;
+
+	if (!has_query(fd) || !has_engine_query(fd)) {
+		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
+		unsigned int i = 0;
+
+		igt_assert(num_bsd);
+
+		num = 1 + num_bsd;
+
+		if (gem_has_blt(fd))
+			num++;
+
+		if (gem_has_vebox(fd))
+			num++;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		engines[i].engine_class = I915_ENGINE_CLASS_RENDER;
+		engines[i].engine_instance = 0;
+		i++;
+
+		if (gem_has_blt(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_COPY;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd2(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 1;
+			i++;
+		}
+
+		if (gem_has_vebox(fd)) {
+			engines[i].engine_class =
+				I915_ENGINE_CLASS_VIDEO_ENHANCE;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+	} else {
+		struct drm_i915_query_engine_info *engine_info;
+		struct drm_i915_query_item item = {
+			.query_id = DRM_I915_QUERY_ENGINE_INFO,
+		};
+		const unsigned int sz = 4096;
+		unsigned int i;
+
+		engine_info = malloc(sz);
+		igt_assert(engine_info);
+		memset(engine_info, 0, sz);
+
+		item.data_ptr = to_user_pointer(engine_info);
+		item.length = sz;
+
+		i915_query_items(fd, &item, 1);
+		igt_assert(item.length > 0);
+		igt_assert(item.length <= sz);
+
+		num = engine_info->num_engines;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		for (i = 0; i < num; i++) {
+			struct drm_i915_engine_info *engine =
+				(struct drm_i915_engine_info *)&engine_info->engines[i];
+
+			engines[i] = engine->engine;
+		}
+	}
+
+	__engines = engines;
+	__num_engines = num;
+}
+
 static unsigned int num_engines_in_class(enum intel_engine_id class)
 {
+	unsigned int i, count = 0;
+
 	igt_assert(class == VCS);
 
-	return 2;
+	query_engines();
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class == I915_ENGINE_CLASS_VIDEO)
+			count++;
+	}
+
+	igt_assert(count);
+	return count;
 }
 
 static void
 fill_engines_class(struct i915_engine_class_instance *ci,
 		   enum intel_engine_id class)
 {
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
 
-	ci[0].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[0].engine_instance = 0;
+	query_engines();
 
-	ci[1].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[1].engine_instance = 1;
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		ci[j].engine_class = __engines[i].engine_class;
+		ci[j].engine_instance = __engines[i].engine_instance;
+		j++;
+	}
 }
 
 static void
 fill_engines_id_class(enum intel_engine_id *list,
 		      enum intel_engine_id class)
 {
+	enum intel_engine_id engine = VCS1;
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
+	igt_assert(num_engines_in_class(VCS) <= 2);
+
+	query_engines();
 
-	list[0] = VCS1;
-	list[1] = VCS2;
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		list[j++] = engine++;
+	}
 }
 
 static struct i915_engine_class_instance
@@ -400,6 +564,8 @@ get_engine(enum intel_engine_id engine)
 {
 	struct i915_engine_class_instance ci;
 
+	query_engines();
+
 	switch (engine) {
 	case RCS:
 		ci.engine_class = I915_ENGINE_CLASS_RENDER;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Instead of hardcoding the VCS balancing engines, discover them, either
with the new engines query or with the legacy get_param as a fallback, so
class based addressing always works.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 180 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 173 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index d43e7c767801..539de243f6e8 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -365,34 +365,198 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static bool __engines_queried;
+static unsigned int __num_engines;
+static struct i915_engine_class_instance *__engines;
+
+static int
+__i915_query(int i915, struct drm_i915_query *q)
+{
+	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
+		return -errno;
+	return 0;
+}
+
+static int
+__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	struct drm_i915_query q = {
+		.num_items = n_items,
+		.items_ptr = to_user_pointer(items),
+	};
+	return __i915_query(i915, &q);
+}
+
+static void
+i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
+}
+
+static bool has_query(int i915)
+{
+	struct drm_i915_query query = {};
+
+	return __i915_query(i915, &query) == 0;
+}
+
+static bool has_engine_query(int i915)
+{
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_INFO,
+	};
+
+	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
+}
+
+static void query_engines(void)
+{
+	struct i915_engine_class_instance *engines;
+	unsigned int num;
+
+	if (__engines_queried)
+		return;
+
+	__engines_queried = true;
+
+	if (!has_query(fd) || !has_engine_query(fd)) {
+		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
+		unsigned int i = 0;
+
+		igt_assert(num_bsd);
+
+		num = 1 + num_bsd;
+
+		if (gem_has_blt(fd))
+			num++;
+
+		if (gem_has_vebox(fd))
+			num++;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		engines[i].engine_class = I915_ENGINE_CLASS_RENDER;
+		engines[i].engine_instance = 0;
+		i++;
+
+		if (gem_has_blt(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_COPY;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd2(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 1;
+			i++;
+		}
+
+		if (gem_has_vebox(fd)) {
+			engines[i].engine_class =
+				I915_ENGINE_CLASS_VIDEO_ENHANCE;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+	} else {
+		struct drm_i915_query_engine_info *engine_info;
+		struct drm_i915_query_item item = {
+			.query_id = DRM_I915_QUERY_ENGINE_INFO,
+		};
+		const unsigned int sz = 4096;
+		unsigned int i;
+
+		engine_info = malloc(sz);
+		igt_assert(engine_info);
+		memset(engine_info, 0, sz);
+
+		item.data_ptr = to_user_pointer(engine_info);
+		item.length = sz;
+
+		i915_query_items(fd, &item, 1);
+		igt_assert(item.length > 0);
+		igt_assert(item.length <= sz);
+
+		num = engine_info->num_engines;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		for (i = 0; i < num; i++) {
+			struct drm_i915_engine_info *engine =
+				(struct drm_i915_engine_info *)&engine_info->engines[i];
+
+			engines[i] = engine->engine;
+		}
+	}
+
+	__engines = engines;
+	__num_engines = num;
+}
+
 static unsigned int num_engines_in_class(enum intel_engine_id class)
 {
+	unsigned int i, count = 0;
+
 	igt_assert(class == VCS);
 
-	return 2;
+	query_engines();
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class == I915_ENGINE_CLASS_VIDEO)
+			count++;
+	}
+
+	igt_assert(count);
+	return count;
 }
 
 static void
 fill_engines_class(struct i915_engine_class_instance *ci,
 		   enum intel_engine_id class)
 {
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
 
-	ci[0].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[0].engine_instance = 0;
+	query_engines();
 
-	ci[1].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[1].engine_instance = 1;
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		ci[j].engine_class = __engines[i].engine_class;
+		ci[j].engine_instance = __engines[i].engine_instance;
+		j++;
+	}
 }
 
 static void
 fill_engines_id_class(enum intel_engine_id *list,
 		      enum intel_engine_id class)
 {
+	enum intel_engine_id engine = VCS1;
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
+	igt_assert(num_engines_in_class(VCS) <= 2);
+
+	query_engines();
 
-	list[0] = VCS1;
-	list[1] = VCS2;
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		list[j++] = engine++;
+	}
 }
 
 static struct i915_engine_class_instance
@@ -400,6 +564,8 @@ get_engine(enum intel_engine_id engine)
 {
 	struct i915_engine_class_instance ci;
 
+	query_engines();
+
 	switch (engine) {
 	case RCS:
 		ci.engine_class = I915_ENGINE_CLASS_RENDER;
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [PATCH i-g-t 25/25] gem_wsim: Support Icelake parts
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

On Icelake the second VCS engine is vcs2 instead of vcs1, so add logical
to physical instance remapping based on engine discovery to support it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
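Illustration only, not part of the patch: a minimal sketch of logical to
physical remapping over a discovered engine list, assuming stand-in types.
The demo_* names are hypothetical, and unlike the patch this version
returns -1 instead of asserting when the logical index is out of range.

```c
#include <assert.h>
#include <stddef.h>

enum demo_class { DEMO_RENDER, DEMO_VIDEO };

struct demo_engine { unsigned short engine_class, engine_instance; };

/*
 * Return the physical instance of the logical-th engine of the given
 * class, counting in discovery order, or -1 if there are not that many.
 */
static int demo_physical_instance(const struct demo_engine *e, size_t n,
				  unsigned short engine_class,
				  unsigned int logical)
{
	unsigned int seen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (e[i].engine_class != engine_class)
			continue;

		/* Map logical to physical instances. */
		if (seen++ == logical)
			return e[i].engine_instance;
	}

	return -1;
}
```

For a list containing video instances 0 and 2, as described above for
Icelake, logical index 1 resolves to physical instance 2.
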
 benchmarks/gem_wsim.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 539de243f6e8..aa40c9f0dde5 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -559,6 +559,26 @@ fill_engines_id_class(enum intel_engine_id *list,
 	}
 }
 
+static unsigned int
+find_physical_instance(enum intel_engine_id class, unsigned int logical)
+{
+	unsigned int i, j = 0;
+
+	igt_assert(class == VCS);
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		/* Map logical to physical instances. */
+		if (logical == j++)
+			return __engines[i].engine_instance;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
 static struct i915_engine_class_instance
 get_engine(enum intel_engine_id engine)
 {
@@ -572,12 +592,9 @@ get_engine(enum intel_engine_id engine)
 		ci.engine_instance = 0;
 		break;
 	case VCS1:
-		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
-		ci.engine_instance = 0;
-		break;
 	case VCS2:
 		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
-		ci.engine_instance = 1;
+		ci.engine_instance = find_physical_instance(VCS, engine - VCS1);
 		break;
 	default:
 		igt_assert(0);
@@ -1367,11 +1384,12 @@ static unsigned int
 find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	    enum intel_engine_id engine)
 {
-	static struct i915_engine_class_instance map[] = {
+	unsigned int vcs1 = find_physical_instance(VCS, 1);
+	struct i915_engine_class_instance map[] = {
 		[RCS] = { I915_ENGINE_CLASS_RENDER, 0 },
 		[BCS] = { I915_ENGINE_CLASS_COPY, 0 },
 		[VCS1] = { I915_ENGINE_CLASS_VIDEO, 0 },
-		[VCS2] = { I915_ENGINE_CLASS_VIDEO, 1 },
+		[VCS2] = { I915_ENGINE_CLASS_VIDEO, vcs1 },
 		[VECS] = { I915_ENGINE_CLASS_VIDEO_ENHANCE, 0 },
 	};
 	unsigned int i;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [Intel-gfx] [PATCH i-g-t 25/25] gem_wsim: Support Icelake parts
@ 2019-05-17 11:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:25 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

On Icelake the second VCS engine is vcs2 instead of vcs1, so add logical
to physical instance remapping based on engine discovery to support it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 539de243f6e8..aa40c9f0dde5 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -559,6 +559,26 @@ fill_engines_id_class(enum intel_engine_id *list,
 	}
 }
 
+static unsigned int
+find_physical_instance(enum intel_engine_id class, unsigned int logical)
+{
+	unsigned int i, j = 0;
+
+	igt_assert(class == VCS);
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		/* Map logical to physical instances. */
+		if (logical == j++)
+			return __engines[i].engine_instance;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
 static struct i915_engine_class_instance
 get_engine(enum intel_engine_id engine)
 {
@@ -572,12 +592,9 @@ get_engine(enum intel_engine_id engine)
 		ci.engine_instance = 0;
 		break;
 	case VCS1:
-		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
-		ci.engine_instance = 0;
-		break;
 	case VCS2:
 		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
-		ci.engine_instance = 1;
+		ci.engine_instance = find_physical_instance(VCS, engine - VCS1);
 		break;
 	default:
 		igt_assert(0);
@@ -1367,11 +1384,12 @@ static unsigned int
 find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	    enum intel_engine_id engine)
 {
-	static struct i915_engine_class_instance map[] = {
+	unsigned int vcs1 = find_physical_instance(VCS, 1);
+	struct i915_engine_class_instance map[] = {
 		[RCS] = { I915_ENGINE_CLASS_RENDER, 0 },
 		[BCS] = { I915_ENGINE_CLASS_COPY, 0 },
 		[VCS1] = { I915_ENGINE_CLASS_VIDEO, 0 },
-		[VCS2] = { I915_ENGINE_CLASS_VIDEO, 1 },
+		[VCS2] = { I915_ENGINE_CLASS_VIDEO, vcs1 },
 		[VECS] = { I915_ENGINE_CLASS_VIDEO_ENHANCE, 0 },
 	};
 	unsigned int i;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:38     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 11:38 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:16)
> @@ -184,3 +186,19 @@ Example:
>  M.1.VCS
>  
>  This sets up the engine map to all available VCS class engines.
> +
> +Context load balancing
> +----------------------
> +
> +Context load balancing (aka Virtual Engine) is an i915 feature where the driver
> +will pick the best engine (most idle) to submit to given previously configured

"most idle"? Currently we use first idle, aka greedy balancing.

s/the best engine/an engine/
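
To make the wording difference concrete, here is a toy sketch. It is not
the i915 implementation; the queue-depth array, both function names, and
the choice to fall back to engine 0 when nothing is idle are assumptions
made purely for illustration. It contrasts a greedy first-idle pick with a
full scan for the least loaded engine:

```c
#include <assert.h>
#include <stddef.h>

/* queued[i] is a hypothetical count of requests pending on engine i. */

/* Greedy: take the first engine with nothing queued, else engine 0. */
static size_t demo_pick_first_idle(const unsigned int *queued, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (queued[i] == 0)
			return i;

	return 0;
}

/* "Most idle": look at every engine and take the least loaded one. */
static size_t demo_pick_least_loaded(const unsigned int *queued, size_t n)
{
	size_t i, best = 0;

	for (i = 1; i < n; i++)
		if (queued[i] < queued[best])
			best = i;

	return best;
}
```

The two only agree when some engine is completely idle, which is why
"most idle" overstates what a greedy balancer promises.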
-Chris

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command
@ 2019-05-17 11:38     ` Chris Wilson
  0 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 11:38 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-17 12:25:16)
> @@ -184,3 +186,19 @@ Example:
>  M.1.VCS
>  
>  This sets up the engine map to all available VCS class engines.
> +
> +Context load balancing
> +----------------------
> +
> +Context load balancing (aka Virtual Engine) is an i915 feature where the driver
> +will pick the best engine (most idle) to submit to given previously configured

"most idle"? Currently we use first idle, aka greedy balancing.

s/the best engine/an engine/
-Chris

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 11:39     ` Andi Shyti
  -1 siblings, 0 replies; 109+ messages in thread
From: Andi Shyti @ 2019-05-17 11:39 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev, Intel-gfx

Hi Tvrtko,

> +static int
> +__i915_query(int i915, struct drm_i915_query *q)
> +{
> +	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
> +		return -errno;
> +	return 0;
> +}
> +
> +static int
> +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> +{
> +	struct drm_i915_query q = {
> +		.num_items = n_items,
> +		.items_ptr = to_user_pointer(items),
> +	};
> +	return __i915_query(i915, &q);
> +}
> +
> +static void
> +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> +{
> +	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
> +}
> +
> +static bool has_query(int i915)
> +{
> +	struct drm_i915_query query = {};
> +
> +	return __i915_query(i915, &query) == 0;
> +}
> +
> +static bool has_engine_query(int i915)
> +{
> +	struct drm_i915_query_item item = {
> +		.query_id = DRM_I915_QUERY_ENGINE_INFO,
> +	};
> +
> +	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
> +}
> +
> +static void query_engines(void)
> +{

[...]

> +		struct drm_i915_query_engine_info *engine_info;
> +		struct drm_i915_query_item item = {
> +			.query_id = DRM_I915_QUERY_ENGINE_INFO,
> +		};
> +		const unsigned int sz = 4096;
> +		unsigned int i;
> +
> +		engine_info = malloc(sz);
> +		igt_assert(engine_info);
> +		memset(engine_info, 0, sz);
> +
> +		item.data_ptr = to_user_pointer(engine_info);
> +		item.length = sz;
> +
> +		i915_query_items(fd, &item, 1);
> +		igt_assert(item.length > 0);
> +		igt_assert(item.length <= sz);
> +
> +		num = engine_info->num_engines;
> +
> +		engines = calloc(num,
> +				 sizeof(struct i915_engine_class_instance));
> +		igt_assert(engines);
> +
> +		for (i = 0; i < num; i++) {
> +			struct drm_i915_engine_info *engine =
> +				(struct drm_i915_engine_info *)&engine_info->engines[i];
> +
> +			engines[i] = engine->engine;
> +		}
> +	}
> +
> +	__engines = engines;
> +	__num_engines = num;
> +}

Would it make sense to make a library out of all the above? E.g.
gem_engine_topology does a similar thing (all static functions like
here, though).

Andi

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
@ 2019-05-17 11:39     ` Andi Shyti
  0 siblings, 0 replies; 109+ messages in thread
From: Andi Shyti @ 2019-05-17 11:39 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev, Intel-gfx, Andi Shyti, Tvrtko Ursulin

Hi Tvrtko,

> +static int
> +__i915_query(int i915, struct drm_i915_query *q)
> +{
> +	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
> +		return -errno;
> +	return 0;
> +}
> +
> +static int
> +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> +{
> +	struct drm_i915_query q = {
> +		.num_items = n_items,
> +		.items_ptr = to_user_pointer(items),
> +	};
> +	return __i915_query(i915, &q);
> +}
> +
> +static void
> +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> +{
> +	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
> +}
> +
> +static bool has_query(int i915)
> +{
> +	struct drm_i915_query query = {};
> +
> +	return __i915_query(i915, &query) == 0;
> +}
> +
> +static bool has_engine_query(int i915)
> +{
> +	struct drm_i915_query_item item = {
> +		.query_id = DRM_I915_QUERY_ENGINE_INFO,
> +	};
> +
> +	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
> +}
> +
> +static void query_engines(void)
> +{

[...]

> +		struct drm_i915_query_engine_info *engine_info;
> +		struct drm_i915_query_item item = {
> +			.query_id = DRM_I915_QUERY_ENGINE_INFO,
> +		};
> +		const unsigned int sz = 4096;
> +		unsigned int i;
> +
> +		engine_info = malloc(sz);
> +		igt_assert(engine_info);
> +		memset(engine_info, 0, sz);
> +
> +		item.data_ptr = to_user_pointer(engine_info);
> +		item.length = sz;
> +
> +		i915_query_items(fd, &item, 1);
> +		igt_assert(item.length > 0);
> +		igt_assert(item.length <= sz);
> +
> +		num = engine_info->num_engines;
> +
> +		engines = calloc(num,
> +				 sizeof(struct i915_engine_class_instance));
> +		igt_assert(engines);
> +
> +		for (i = 0; i < num; i++) {
> +			struct drm_i915_engine_info *engine =
> +				(struct drm_i915_engine_info *)&engine_info->engines[i];
> +
> +			engines[i] = engine->engine;
> +		}
> +	}
> +
> +	__engines = engines;
> +	__num_engines = num;
> +}

Would it make sense to make a library out of all the above? E.g.
gem_engine_topology does a similar thing (all static functions like
here, though).

Andi

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 11:39     ` Andi Shyti
@ 2019-05-17 11:51       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:51 UTC (permalink / raw)
  To: Andi Shyti; +Cc: igt-dev, Intel-gfx


On 17/05/2019 12:39, Andi Shyti wrote:
> Hi Tvrtko,
> 
>> +static int
>> +__i915_query(int i915, struct drm_i915_query *q)
>> +{
>> +	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
>> +		return -errno;
>> +	return 0;
>> +}
>> +
>> +static int
>> +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
>> +{
>> +	struct drm_i915_query q = {
>> +		.num_items = n_items,
>> +		.items_ptr = to_user_pointer(items),
>> +	};
>> +	return __i915_query(i915, &q);
>> +}
>> +
>> +static void
>> +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
>> +{
>> +	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
>> +}
>> +
>> +static bool has_query(int i915)
>> +{
>> +	struct drm_i915_query query = {};
>> +
>> +	return __i915_query(i915, &query) == 0;
>> +}
>> +
>> +static bool has_engine_query(int i915)
>> +{
>> +	struct drm_i915_query_item item = {
>> +		.query_id = DRM_I915_QUERY_ENGINE_INFO,
>> +	};
>> +
>> +	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
>> +}
>> +
>> +static void query_engines(void)
>> +{
> 
> [...]
> 
>> +		struct drm_i915_query_engine_info *engine_info;
>> +		struct drm_i915_query_item item = {
>> +			.query_id = DRM_I915_QUERY_ENGINE_INFO,
>> +		};
>> +		const unsigned int sz = 4096;
>> +		unsigned int i;
>> +
>> +		engine_info = malloc(sz);
>> +		igt_assert(engine_info);
>> +		memset(engine_info, 0, sz);
>> +
>> +		item.data_ptr = to_user_pointer(engine_info);
>> +		item.length = sz;
>> +
>> +		i915_query_items(fd, &item, 1);
>> +		igt_assert(item.length > 0);
>> +		igt_assert(item.length <= sz);
>> +
>> +		num = engine_info->num_engines;
>> +
>> +		engines = calloc(num,
>> +				 sizeof(struct i915_engine_class_instance));
>> +		igt_assert(engines);
>> +
>> +		for (i = 0; i < num; i++) {
>> +			struct drm_i915_engine_info *engine =
>> +				(struct drm_i915_engine_info *)&engine_info->engines[i];
>> +
>> +			engines[i] = engine->engine;
>> +		}
>> +	}
>> +
>> +	__engines = engines;
>> +	__num_engines = num;
>> +}
> 
> would it make sense to make a library out of all the above? e.g.
> gem_engine_topology does similar thing (all static functions like
> here, though).

Definitely yes, but coordinating all the series seems tricky. I think it
would be best to consolidate once everything gets merged?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command
  2019-05-17 11:38     ` Chris Wilson
@ 2019-05-17 11:52       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 11:52 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 17/05/2019 12:38, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-17 12:25:16)
>> @@ -184,3 +186,19 @@ Example:
>>   M.1.VCS
>>   
>>   This sets up the engine map to all available VCS class engines.
>> +
>> +Context load balancing
>> +----------------------
>> +
>> +Context load balancing (aka Virtual Engine) is an i915 feature where the driver
>> +will pick the best engine (most idle) to submit to given previously configured
> 
> "most idle"? Currently we use first idle, aka greedy balancing.

What about "most idle" - is it bad English? :)

> s/the best engine/an engine/

Okay. :)
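
To make the terminology above concrete, here is a toy model of the two selection policies being contrasted: "first idle" (greedy) and "most idle". This is only a sketch under stated assumptions: the function names and the per-engine `queued` counters are hypothetical, and nothing here reflects how i915 actually implements balancing or what it does when no engine is idle.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Greedy "first idle" policy: return the index of the first engine with
 * nothing queued, or -1 if every engine is busy.
 */
static int pick_first_idle(const unsigned int *queued, size_t n)
{
	assert(queued && n > 0);

	for (size_t i = 0; i < n; i++)
		if (queued[i] == 0)
			return (int)i;

	return -1;
}

/*
 * "Most idle" policy: return the index of the engine with the least
 * queued work, even if none of them is completely idle.
 */
static int pick_most_idle(const unsigned int *queued, size_t n)
{
	size_t best = 0;

	assert(queued && n > 0);

	for (size_t i = 1; i < n; i++)
		if (queued[i] < queued[best])
			best = i;

	return (int)best;
}
```

The two policies agree whenever at least one engine is idle and differ only when all engines are busy, which is why the README wording matters.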

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 11:51       ` Tvrtko Ursulin
@ 2019-05-17 11:55         ` Andi Shyti
  -1 siblings, 0 replies; 109+ messages in thread
From: Andi Shyti @ 2019-05-17 11:55 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev, Intel-gfx


Hi Tvrtko,

> > > +static int
> > > +__i915_query(int i915, struct drm_i915_query *q)
> > > +{
> > > +	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
> > > +		return -errno;
> > > +	return 0;
> > > +}
> > > +
> > > +static int
> > > +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> > > +{
> > > +	struct drm_i915_query q = {
> > > +		.num_items = n_items,
> > > +		.items_ptr = to_user_pointer(items),
> > > +	};
> > > +	return __i915_query(i915, &q);
> > > +}
> > > +
> > > +static void
> > > +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> > > +{
> > > +	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
> > > +}
> > > +
> > > +static bool has_query(int i915)
> > > +{
> > > +	struct drm_i915_query query = {};
> > > +
> > > +	return __i915_query(i915, &query) == 0;
> > > +}
> > > +
> > > +static bool has_engine_query(int i915)
> > > +{
> > > +	struct drm_i915_query_item item = {
> > > +		.query_id = DRM_I915_QUERY_ENGINE_INFO,
> > > +	};
> > > +
> > > +	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
> > > +}
> > > +
> > > +static void query_engines(void)
> > > +{
> > 
> > [...]
> > 
> > > +		struct drm_i915_query_engine_info *engine_info;
> > > +		struct drm_i915_query_item item = {
> > > +			.query_id = DRM_I915_QUERY_ENGINE_INFO,
> > > +		};
> > > +		const unsigned int sz = 4096;
> > > +		unsigned int i;
> > > +
> > > +		engine_info = malloc(sz);
> > > +		igt_assert(engine_info);
> > > +		memset(engine_info, 0, sz);
> > > +
> > > +		item.data_ptr = to_user_pointer(engine_info);
> > > +		item.length = sz;
> > > +
> > > +		i915_query_items(fd, &item, 1);
> > > +		igt_assert(item.length > 0);
> > > +		igt_assert(item.length <= sz);
> > > +
> > > +		num = engine_info->num_engines;
> > > +
> > > +		engines = calloc(num,
> > > +				 sizeof(struct i915_engine_class_instance));
> > > +		igt_assert(engines);
> > > +
> > > +		for (i = 0; i < num; i++) {
> > > +			struct drm_i915_engine_info *engine =
> > > +				(struct drm_i915_engine_info *)&engine_info->engines[i];
> > > +
> > > +			engines[i] = engine->engine;
> > > +		}
> > > +	}
> > > +
> > > +	__engines = engines;
> > > +	__num_engines = num;
> > > +}
> > 
> > would it make sense to make a library out of all the above? e.g.
> > gem_engine_topology does similar thing (all static functions like
> > here, though).
> 
> Definitely yes, but coordinating all series seems tricky. I think best would
> be to consolidate once everything gets merged?

yes, sure! let's get everything in first :)

Andi

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 12:10     ` Andi Shyti
  -1 siblings, 0 replies; 109+ messages in thread
From: Andi Shyti @ 2019-05-17 12:10 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev, Intel-gfx

On Fri, May 17, 2019 at 12:25:25PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Instead of hardcoding the VCS balancing engines, discover, both with the
> new engines query, or with the legacy get_param in the fallback case, so
> class based addressing always works.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c | 180 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 173 insertions(+), 7 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index d43e7c767801..539de243f6e8 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -365,34 +365,198 @@ static int str_to_engine(const char *str)
>  	return -1;
>  }
>  
> +static bool __engines_queried;
> +static unsigned int __num_engines;
> +static struct i915_engine_class_instance *__engines;
> +
> +static int
> +__i915_query(int i915, struct drm_i915_query *q)
> +{
> +	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
> +		return -errno;
> +	return 0;
> +}
> +
> +static int
> +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> +{
> +	struct drm_i915_query q = {
> +		.num_items = n_items,
> +		.items_ptr = to_user_pointer(items),
> +	};
> +	return __i915_query(i915, &q);
> +}
> +
> +static void
> +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> +{
> +	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
> +}
> +
> +static bool has_query(int i915)
> +{
> +	struct drm_i915_query query = {};
> +
> +	return __i915_query(i915, &query) == 0;
> +}
> +
> +static bool has_engine_query(int i915)
> +{
> +	struct drm_i915_query_item item = {
> +		.query_id = DRM_I915_QUERY_ENGINE_INFO,
> +	};
> +
> +	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
> +}
> +
> +static void query_engines(void)
> +{
> +	struct i915_engine_class_instance *engines;
> +	unsigned int num;
> +
> +	if (__engines_queried)
> +		return;
> +
> +	__engines_queried = true;
> +
> +	if (!has_query(fd) || !has_engine_query(fd)) {

One question, still. What is the real use of this check, and of
'has_query', which is used only here?

I mean... here you want to check whether the "ioctl is not
implemented" or the "ioctl is not implemented or length is 0".

Wouldn't just '!has_engine_query()' be enough in this case? Or
have I missed a case?

> +		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
> +		unsigned int i = 0;
> +
> +		igt_assert(num);
> +
> +		num = 1 + num_bsd;

Did you mean for the above two lines to be swapped?

> +
> +		if (gem_has_blt(fd))
> +			num++;
> +
> +		if (gem_has_vebox(fd))
> +			num++;
> +
> +		engines = calloc(num,
> +				 sizeof(struct i915_engine_class_instance));
> +		igt_assert(engines);
> +
> +		engines[i].engine_class = I915_ENGINE_CLASS_RENDER;
> +		engines[i].engine_instance = 0;
> +		i++;
> +
> +		if (gem_has_blt(fd)) {
> +			engines[i].engine_class = I915_ENGINE_CLASS_COPY;
> +			engines[i].engine_instance = 0;
> +			i++;
> +		}
> +
> +		if (gem_has_bsd(fd)) {
> +			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
> +			engines[i].engine_instance = 0;
> +			i++;
> +		}
> +
> +		if (gem_has_bsd2(fd)) {
> +			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
> +			engines[i].engine_instance = 1;
> +			i++;
> +		}
> +
> +		if (gem_has_vebox(fd)) {
> +			engines[i].engine_class =
> +				I915_ENGINE_CLASS_VIDEO_ENHANCE;
> +			engines[i].engine_instance = 0;
> +			i++;
> +		}

Mmhhh... isn't this intel_execution_engine2[]? It is yet another
way of keeping an engine list... in the long run, it won't be easy
to remember to update this one as well.

Andi

^ permalink raw reply	[flat|nested] 109+ messages in thread

* [igt-dev] ✓ Fi.CI.BAT: success for Media scalability tooling (rev3)
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
                   ` (25 preceding siblings ...)
  (?)
@ 2019-05-17 12:18 ` Patchwork
  -1 siblings, 0 replies; 109+ messages in thread
From: Patchwork @ 2019-05-17 12:18 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Media scalability tooling (rev3)
URL   : https://patchwork.freedesktop.org/series/51193/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6093 -> IGTPW_2998
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/51193/revisions/3/mbox/

Known issues
------------

  Here are the changes found in IGTPW_2998 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live_contexts:
    - fi-bdw-gvtdvm:      [PASS][1] -> [DMESG-FAIL][2] ([fdo#110235])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/fi-bdw-gvtdvm/igt@i915_selftest@live_contexts.html

  * igt@i915_selftest@live_evict:
    - fi-bsw-kefka:       [PASS][3] -> [DMESG-WARN][4] ([fdo#107709])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/fi-bsw-kefka/igt@i915_selftest@live_evict.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/fi-bsw-kefka/igt@i915_selftest@live_evict.html

  
#### Possible fixes ####

  * igt@prime_vgem@basic-fence-flip:
    - fi-ilk-650:         [DMESG-WARN][5] ([fdo#106387]) -> [PASS][6] +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/fi-ilk-650/igt@prime_vgem@basic-fence-flip.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/fi-ilk-650/igt@prime_vgem@basic-fence-flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#106387]: https://bugs.freedesktop.org/show_bug.cgi?id=106387
  [fdo#107709]: https://bugs.freedesktop.org/show_bug.cgi?id=107709
  [fdo#110235]: https://bugs.freedesktop.org/show_bug.cgi?id=110235
  [fdo#110246]: https://bugs.freedesktop.org/show_bug.cgi?id=110246


Participating hosts (53 -> 46)
------------------------------

  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * IGT: IGT_4994 -> IGTPW_2998

  CI_DRM_6093: 3521a84b80042a6ff62b7a29ffb291acbb601d31 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_2998: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/
  IGT_4994: 555019f862c35f1619627761d6da21385be40920 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+igt@i915_query@engine-info
+igt@i915_query@engine-info-invalid

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 12:10     ` Andi Shyti
@ 2019-05-17 12:19       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 12:19 UTC (permalink / raw)
  To: Andi Shyti; +Cc: igt-dev, Intel-gfx


On 17/05/2019 13:10, Andi Shyti wrote:
> On Fri, May 17, 2019 at 12:25:25PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Instead of hardcoding the VCS balancing engines, discover, both with the
>> new engines query, or with the legacy get_param in the fallback case, so
>> class based addressing always works.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c | 180 ++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 173 insertions(+), 7 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index d43e7c767801..539de243f6e8 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -365,34 +365,198 @@ static int str_to_engine(const char *str)
>>   	return -1;
>>   }
>>   
>> +static bool __engines_queried;
>> +static unsigned int __num_engines;
>> +static struct i915_engine_class_instance *__engines;
>> +
>> +static int
>> +__i915_query(int i915, struct drm_i915_query *q)
>> +{
>> +	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
>> +		return -errno;
>> +	return 0;
>> +}
>> +
>> +static int
>> +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
>> +{
>> +	struct drm_i915_query q = {
>> +		.num_items = n_items,
>> +		.items_ptr = to_user_pointer(items),
>> +	};
>> +	return __i915_query(i915, &q);
>> +}
>> +
>> +static void
>> +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
>> +{
>> +	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
>> +}
>> +
>> +static bool has_query(int i915)
>> +{
>> +	struct drm_i915_query query = {};
>> +
>> +	return __i915_query(i915, &query) == 0;
>> +}
>> +
>> +static bool has_engine_query(int i915)
>> +{
>> +	struct drm_i915_query_item item = {
>> +		.query_id = DRM_I915_QUERY_ENGINE_INFO,
>> +	};
>> +
>> +	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
>> +}
>> +
>> +static void query_engines(void)
>> +{
>> +	struct i915_engine_class_instance *engines;
>> +	unsigned int num;
>> +
>> +	if (__engines_queried)
>> +		return;
>> +
>> +	__engines_queried = true;
>> +
>> +	if (!has_query(fd) || !has_engine_query(fd)) {
> 
> One question, still. What is the real use of this check and
> 'has_query' that is used only here.
> 
> I mean... here you want to check whether the "ioctl is not
> implemented" or "ioctl is not implemented and length  is 0".
> 
> Wouldn't in this case just '!has_engine_query()' be enough? or
> have I missed any case?

You haven't missed anything. I have been pointlessly verbose and a bit 
lazy by copy-pasting a lot.

has_engine_query is a superset of has_query for the purpose of ioctl 
detection.
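
The superset relationship can be checked with a small model. The sketch below replaces the real ioctl with a hypothetical `fake_query()` and a hypothetical `struct fake_kernel`, so it runs without a GPU. It assumes that a missing ioctl fails with -EINVAL and that an unsupported query item is reported through a negative `length` while the ioctl itself succeeds. Under those assumptions the verbose condition from the patch and the shortened one suggested in review give the same answer in every case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of what the running kernel supports. */
struct fake_kernel {
	bool has_query_ioctl;	/* DRM_IOCTL_I915_QUERY exists */
	bool has_engine_info;	/* the engine info query item is supported */
};

/*
 * Stand-in for __i915_query_items(): returns 0 or a negative errno for
 * the ioctl itself and, when one item is passed, stores the per-item
 * result in *length (positive size, or negative errno if unsupported).
 */
static int fake_query(const struct fake_kernel *k, unsigned int n_items,
		      int *length)
{
	assert(k);

	if (!k->has_query_ioctl)
		return -EINVAL;	/* assumed errno for an unknown ioctl */

	if (n_items && length)
		*length = k->has_engine_info ? 64 : -EINVAL;

	return 0;
}

static bool has_query(const struct fake_kernel *k)
{
	return fake_query(k, 0, NULL) == 0;
}

static bool has_engine_query(const struct fake_kernel *k)
{
	int length = 0;

	return fake_query(k, 1, &length) == 0 && length > 0;
}

/* Condition as written in the patch. */
static bool needs_fallback_verbose(const struct fake_kernel *k)
{
	return !has_query(k) || !has_engine_query(k);
}

/* Shortened condition suggested in review. */
static bool needs_fallback_short(const struct fake_kernel *k)
{
	return !has_engine_query(k);
}
```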

> 
>> +		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
>> +		unsigned int i = 0;
>> +
>> +		igt_assert(num);
>> +
>> +		num = 1 + num_bsd;
> 
> did you mean the above two lines swapped?

No, I want to avoid running on platforms with no VCS engines since no
one has ever tested gem_wsim there.
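
For reference, the quoted hunk asserts on `num` before it has been assigned. Given the intent stated above (refuse to run without VCS engines), one possible reading is that the assert was meant for `num_bsd`. That is an assumption, not something confirmed in this thread. A minimal standalone sketch of that reading, with a hypothetical `struct legacy_caps` standing in for the `gem_has_*()` probes:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the legacy gem_has_*() probe results. */
struct legacy_caps {
	bool blt;
	bool bsd;
	bool bsd2;
	bool vebox;
};

/* Count engines for the fallback path; requires at least one VCS engine. */
static unsigned int count_fallback_engines(const struct legacy_caps *caps)
{
	unsigned int num_bsd, num;

	assert(caps);

	num_bsd = caps->bsd + caps->bsd2;

	/* Assumed intent: gem_wsim is untested without VCS engines. */
	assert(num_bsd);

	num = 1 + num_bsd;	/* render engine plus the VCS engines */

	if (caps->blt)
		num++;

	if (caps->vebox)
		num++;

	return num;
}
```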

>> +
>> +		if (gem_has_blt(fd))
>> +			num++;
>> +
>> +		if (gem_has_vebox(fd))
>> +			num++;
>> +
>> +		engines = calloc(num,
>> +				 sizeof(struct i915_engine_class_instance));
>> +		igt_assert(engines);
>> +
>> +		engines[i].engine_class = I915_ENGINE_CLASS_RENDER;
>> +		engines[i].engine_instance = 0;
>> +		i++;
>> +
>> +		if (gem_has_blt(fd)) {
>> +			engines[i].engine_class = I915_ENGINE_CLASS_COPY;
>> +			engines[i].engine_instance = 0;
>> +			i++;
>> +		}
>> +
>> +		if (gem_has_bsd(fd)) {
>> +			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
>> +			engines[i].engine_instance = 0;
>> +			i++;
>> +		}
>> +
>> +		if (gem_has_bsd2(fd)) {
>> +			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
>> +			engines[i].engine_instance = 1;
>> +			i++;
>> +		}
>> +
>> +		if (gem_has_vebox(fd)) {
>> +			engines[i].engine_class =
>> +				I915_ENGINE_CLASS_VIDEO_ENHANCE;
>> +			engines[i].engine_instance = 0;
>> +			i++;
>> +		}
> 
> mmhhh... isn't this the intel_execution_engine2[]? Yet another
> way for having engine list... in the long run, updating here (as
> well) won't be easy to remember.

Not here. gem_wsim uses some of the IGT libraries, but should keep that
to a minimum, so I don't think we want to pull in the engine array etc.
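
One way to keep a local list easier to maintain without pulling in the IGT engine array is a small table that is walked once. The sketch below is only an illustration of that idea, not what the patch does: `enum sketch_class`, `struct sketch_engine`, `struct legacy_caps` and `fill_fallback_engines()` are all hypothetical stand-ins for the uapi types and the `gem_has_*()` probes, defined locally so the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the uapi engine classes (hypothetical names). */
enum sketch_class {
	SKETCH_RENDER,
	SKETCH_COPY,
	SKETCH_VIDEO,
	SKETCH_VIDEO_ENHANCE,
};

struct sketch_engine {
	enum sketch_class engine_class;
	unsigned int engine_instance;
};

/* Hypothetical snapshot of the legacy gem_has_*() probe results. */
struct legacy_caps {
	bool blt;
	bool bsd;
	bool bsd2;
	bool vebox;
};

/* Fill out[] with the engines that are present; returns how many. */
static size_t fill_fallback_engines(const struct legacy_caps *caps,
				    struct sketch_engine *out, size_t max)
{
	assert(caps && out);

	/* One row per candidate engine, in the same order as the patch. */
	const struct {
		bool present;
		struct sketch_engine engine;
	} table[] = {
		{ true,        { SKETCH_RENDER, 0 } },
		{ caps->blt,   { SKETCH_COPY, 0 } },
		{ caps->bsd,   { SKETCH_VIDEO, 0 } },
		{ caps->bsd2,  { SKETCH_VIDEO, 1 } },
		{ caps->vebox, { SKETCH_VIDEO_ENHANCE, 0 } },
	};
	size_t n = 0;

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!table[i].present)
			continue;

		assert(n < max);
		out[n++] = table[i].engine;
	}

	return n;
}
```

Adding an engine then means adding one table row instead of touching both the counting and the filling code.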

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
@ 2019-05-17 12:19       ` Tvrtko Ursulin
  0 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 12:19 UTC (permalink / raw)
  To: Andi Shyti; +Cc: igt-dev, Intel-gfx, Tvrtko Ursulin


On 17/05/2019 13:10, Andi Shyti wrote:
> On Fri, May 17, 2019 at 12:25:25PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Instead of hardcoding the VCS balancing engines, discover, both with the
>> new engines query, or with the legacy get_param in the fallback case, so
>> class based addressing always works.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c | 180 ++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 173 insertions(+), 7 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index d43e7c767801..539de243f6e8 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -365,34 +365,198 @@ static int str_to_engine(const char *str)
>>   	return -1;
>>   }
>>   
>> +static bool __engines_queried;
>> +static unsigned int __num_engines;
>> +static struct i915_engine_class_instance *__engines;
>> +
>> +static int
>> +__i915_query(int i915, struct drm_i915_query *q)
>> +{
>> +	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
>> +		return -errno;
>> +	return 0;
>> +}
>> +
>> +static int
>> +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
>> +{
>> +	struct drm_i915_query q = {
>> +		.num_items = n_items,
>> +		.items_ptr = to_user_pointer(items),
>> +	};
>> +	return __i915_query(i915, &q);
>> +}
>> +
>> +static void
>> +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
>> +{
>> +	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
>> +}
>> +
>> +static bool has_query(int i915)
>> +{
>> +	struct drm_i915_query query = {};
>> +
>> +	return __i915_query(i915, &query) == 0;
>> +}
>> +
>> +static bool has_engine_query(int i915)
>> +{
>> +	struct drm_i915_query_item item = {
>> +		.query_id = DRM_I915_QUERY_ENGINE_INFO,
>> +	};
>> +
>> +	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
>> +}
>> +
>> +static void query_engines(void)
>> +{
>> +	struct i915_engine_class_instance *engines;
>> +	unsigned int num;
>> +
>> +	if (__engines_queried)
>> +		return;
>> +
>> +	__engines_queried = true;
>> +
>> +	if (!has_query(fd) || !has_engine_query(fd)) {
> 
> One question, still. What is the real use of this check and
> 'has_query' that is used only here.
> 
> I mean... here you want to check whether the "ioctl is not
> implemented" or "ioctl is not implemented and length  is 0".
> 
> Wouldn't in this case just '!has_engine_query()' be enough? or
> have I missed any case?

You haven't missed anything. I have been pointlessly verbose and a bit 
lazy by copy-pasting a lot.

has_engine_query is a superset of has_query for the purpose of ioctl 
detection.
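
To illustrate why the extra check is redundant, here is a minimal standalone
model. It does not talk to the kernel: struct fake_kernel and
fake_engine_query() are hypothetical stand-ins for the driver and for
__i915_query_items(), and the length value 64 is arbitrary. It only shows that
whenever has_engine_query() succeeds, has_query() must have succeeded too.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel: which features it implements. */
struct fake_kernel {
	bool has_query_ioctl;	/* the query ioctl exists at all */
	bool has_engine_info;	/* the engine info query item is supported */
};

/*
 * Model of a one-item engine info query: the return value is the ioctl
 * result (0 or -errno), the per-item result is reported through *length.
 */
static int fake_engine_query(const struct fake_kernel *k, int32_t *length)
{
	if (!k->has_query_ioctl)
		return -EINVAL;	/* ioctl not implemented */

	/* Assumption: an unsupported item reports a negative length. */
	*length = k->has_engine_info ? 64 : -EINVAL;
	return 0;
}

/* Model of has_query(): an empty query only probes for the ioctl. */
static bool has_query(const struct fake_kernel *k)
{
	return k->has_query_ioctl;
}

/* Model of has_engine_query(): the ioctl must work and return data. */
static bool has_engine_query(const struct fake_kernel *k)
{
	int32_t length = 0;

	return fake_engine_query(k, &length) == 0 && length > 0;
}
```

In this model '!has_query() || !has_engine_query()' and '!has_engine_query()'
give the same answer for every kernel variant.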

> 
>> +		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
>> +		unsigned int i = 0;
>> +
>> +		igt_assert(num);
>> +
>> +		num = 1 + num_bsd;
> 
> did you mean the above two lines swapped?

No, I want to avoid running on platforms with no vcs engines since no 
one ever tested gem_wsim there.

>> +
>> +		if (gem_has_blt(fd))
>> +			num++;
>> +
>> +		if (gem_has_vebox(fd))
>> +			num++;
>> +
>> +		engines = calloc(num,
>> +				 sizeof(struct i915_engine_class_instance));
>> +		igt_assert(engines);
>> +
>> +		engines[i].engine_class = I915_ENGINE_CLASS_RENDER;
>> +		engines[i].engine_instance = 0;
>> +		i++;
>> +
>> +		if (gem_has_blt(fd)) {
>> +			engines[i].engine_class = I915_ENGINE_CLASS_COPY;
>> +			engines[i].engine_instance = 0;
>> +			i++;
>> +		}
>> +
>> +		if (gem_has_bsd(fd)) {
>> +			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
>> +			engines[i].engine_instance = 0;
>> +			i++;
>> +		}
>> +
>> +		if (gem_has_bsd2(fd)) {
>> +			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
>> +			engines[i].engine_instance = 1;
>> +			i++;
>> +		}
>> +
>> +		if (gem_has_vebox(fd)) {
>> +			engines[i].engine_class =
>> +				I915_ENGINE_CLASS_VIDEO_ENHANCE;
>> +			engines[i].engine_instance = 0;
>> +			i++;
>> +		}
> 
> mmhhh... isn't this the intel_execution_engine2[]? Yet another
> way for having engine list... in the long run, updating here (as
> well) won't be easy to remember.

Not here. gem_wsim uses some of the IGT libraries, but it should keep that
to a minimum, so I think we don't want to pull in the engine array etc.

Regards,

Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 12:19       ` Tvrtko Ursulin
@ 2019-05-17 13:02         ` Andi Shyti
  -1 siblings, 0 replies; 109+ messages in thread
From: Andi Shyti @ 2019-05-17 13:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev, Intel-gfx

> > > +static void query_engines(void)
> > > +{
> > > +	struct i915_engine_class_instance *engines;
> > > +	unsigned int num;
> > > +
> > > +	if (__engines_queried)
> > > +		return;
> > > +
> > > +	__engines_queried = true;
> > > +
> > > +	if (!has_query(fd) || !has_engine_query(fd)) {

[...]

> > > +		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
> > > +		unsigned int i = 0;
> > > +
> > > +		igt_assert(num);
> > > +
> > > +		num = 1 + num_bsd;
> > 
> > did you mean the above two lines swapped?
> 
> No, I want to avoid running on platforms with no vcs engines since no one
> ever tested gem_wsim there.

But you are asserting on 'num' while num is not yet initialized.

So I guess it should first be

	num = 1 + num_bsd;

and then

	igt_assert(num);

right?

Andi
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 13:02         ` Andi Shyti
@ 2019-05-17 13:05           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-17 13:05 UTC (permalink / raw)
  To: Andi Shyti; +Cc: igt-dev, Intel-gfx


On 17/05/2019 14:02, Andi Shyti wrote:
>>>> +static void query_engines(void)
>>>> +{
>>>> +	struct i915_engine_class_instance *engines;
>>>> +	unsigned int num;
>>>> +
>>>> +	if (__engines_queried)
>>>> +		return;
>>>> +
>>>> +	__engines_queried = true;
>>>> +
>>>> +	if (!has_query(fd) || !has_engine_query(fd)) {
> 
> [...]
> 
>>>> +		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
>>>> +		unsigned int i = 0;
>>>> +
>>>> +		igt_assert(num);
>>>> +
>>>> +		num = 1 + num_bsd;
>>>
>>> did you mean the above two lines swapped?
>>
>> No, I want to avoid running on platforms with no vcs engines since no one
>> ever tested gem_wsim there.
> 
> but you are asserting on 'num' while num is not initialized.

True, my bad. Where are compiler warnings when you need them?

> so that I guess it should first be
> 
> 	num = 1 + num_bsd;
> 
> and then
> 
> 	igt_assert(num);
> 
> right?

I just want igt_assert(num_bsd).
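
A rough sketch of the intended ordering in the fallback path, written as a
standalone function so it can be compiled on its own. The helper name
fallback_num_engines() is hypothetical, the boolean parameters stand in for
gem_has_bsd(fd), gem_has_bsd2(fd), gem_has_blt(fd) and gem_has_vebox(fd), and
plain assert() stands in for igt_assert():

```c
#include <assert.h>
#include <stdbool.h>

/* Count the engines for the legacy (no engine query) fallback path. */
static unsigned int
fallback_num_engines(bool has_bsd, bool has_bsd2, bool has_blt, bool has_vebox)
{
	unsigned int num_bsd = has_bsd + has_bsd2;
	unsigned int num;

	/* Refuse to run on platforms without any VCS engine. */
	assert(num_bsd);

	/* RCS is always present, plus however many VCS engines exist. */
	num = 1 + num_bsd;

	if (has_blt)
		num++;

	if (has_vebox)
		num++;

	return num;
}
```

The assert now checks an initialized value, and num is only computed once the
VCS requirement has been met.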

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command
  2019-05-17 11:52       ` Tvrtko Ursulin
@ 2019-05-17 13:19         ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 13:19 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:52:36)
> 
> On 17/05/2019 12:38, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-17 12:25:16)
> >> @@ -184,3 +186,19 @@ Example:
> >>   M.1.VCS
> >>   
> >>   This sets up the engine map to all available VCS class engines.
> >> +
> >> +Context load balancing
> >> +----------------------
> >> +
> >> +Context load balancing (aka Virtual Engine) is an i915 feature where the driver
> >> +will pick the best engine (most idle) to submit to given previously configured
> > 
> > "most idle"? Currently we use first idle, aka greedy balancing.
> 
> What about "most idle" - is it bad English? :)

No, I fear it implies an optimality that we can't deliver. :)
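
To make the distinction concrete, a toy comparison of the two policies. This
is only an illustration with made-up names (pick_first_idle(),
pick_least_loaded(), a per-engine queued counter); it is not how the driver
implements balancing.

```c
#include <assert.h>
#include <stddef.h>

/* Greedy: return the first engine with nothing queued, or -1 if all are busy. */
static int pick_first_idle(const unsigned int *queued, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (queued[i] == 0)
			return (int)i;
	}

	return -1;
}

/* "Most idle": scan every engine and return the one with the least queued. */
static int pick_least_loaded(const unsigned int *queued, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (best < 0 || queued[i] < queued[(size_t)best])
			best = (int)i;
	}

	return best;
}
```

The greedy version stops at the first match and makes no claim about the
overall load, which is why "most idle" would promise more than it delivers.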
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread

* [igt-dev] ✓ Fi.CI.IGT: success for Media scalability tooling (rev3)
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
                   ` (26 preceding siblings ...)
  (?)
@ 2019-05-17 17:33 ` Patchwork
  -1 siblings, 0 replies; 109+ messages in thread
From: Patchwork @ 2019-05-17 17:33 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Media scalability tooling (rev3)
URL   : https://patchwork.freedesktop.org/series/51193/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6093_full -> IGTPW_2998_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/51193/revisions/3/mbox/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_2998_full:

### IGT changes ###

#### Possible regressions ####

  * {igt@i915_query@engine-info} (NEW):
    - shard-iclb:         NOTRUN -> [SKIP][1] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb1/igt@i915_query@engine-info.html

  
New tests
---------

  New tests have been introduced between CI_DRM_6093_full and IGTPW_2998_full:

### New IGT tests (2) ###

  * igt@i915_query@engine-info:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  * igt@i915_query@engine-info-invalid:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in IGTPW_2998_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@vecs0-s3:
    - shard-iclb:         [PASS][2] -> [INCOMPLETE][3] ([fdo#107713] / [fdo#109100])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb8/igt@gem_ctx_isolation@vecs0-s3.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb1/igt@gem_ctx_isolation@vecs0-s3.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [PASS][4] -> [DMESG-WARN][5] ([fdo#108566]) +4 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-apl2/igt@i915_suspend@fence-restore-tiled2untiled.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-apl1/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy:
    - shard-glk:          [PASS][6] -> [FAIL][7] ([fdo#104873])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-glk3/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-glk8/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy.html

  * igt@kms_cursor_legacy@2x-nonblocking-modeset-vs-cursor-atomic:
    - shard-glk:          [PASS][8] -> [FAIL][9] ([fdo#107409])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-glk7/igt@kms_cursor_legacy@2x-nonblocking-modeset-vs-cursor-atomic.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-glk9/igt@kms_cursor_legacy@2x-nonblocking-modeset-vs-cursor-atomic.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-kbl:          [PASS][10] -> [DMESG-WARN][11] ([fdo#108566])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-kbl1/igt@kms_flip@flip-vs-suspend.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-kbl7/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-cpu:
    - shard-hsw:          [PASS][12] -> [SKIP][13] ([fdo#109271]) +1 similar issue
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-hsw8/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-cpu.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-hsw2/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-cur-indfb-draw-mmap-cpu.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render:
    - shard-iclb:         [PASS][14] -> [FAIL][15] ([fdo#103167]) +2 similar issues
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb3/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html

  * igt@kms_psr@no_drrs:
    - shard-iclb:         [PASS][16] -> [FAIL][17] ([fdo#108341])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb6/igt@kms_psr@no_drrs.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb1/igt@kms_psr@no_drrs.html

  * igt@kms_psr@psr2_primary_blt:
    - shard-iclb:         [PASS][18] -> [SKIP][19] ([fdo#109441])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb2/igt@kms_psr@psr2_primary_blt.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb5/igt@kms_psr@psr2_primary_blt.html

  * igt@perf_pmu@rc6:
    - shard-kbl:          [PASS][20] -> [SKIP][21] ([fdo#109271])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-kbl4/igt@perf_pmu@rc6.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-kbl2/igt@perf_pmu@rc6.html

  * igt@perf_pmu@rc6-runtime-pm:
    - shard-kbl:          [PASS][22] -> [FAIL][23] ([fdo#105010])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-kbl3/igt@perf_pmu@rc6-runtime-pm.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-kbl3/igt@perf_pmu@rc6-runtime-pm.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s3:
    - shard-kbl:          [DMESG-WARN][24] ([fdo#108566]) -> [PASS][25]
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-kbl7/igt@gem_exec_suspend@basic-s3.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-kbl3/igt@gem_exec_suspend@basic-s3.html

  * igt@gem_tiled_swapping@non-threaded:
    - shard-hsw:          [FAIL][26] ([fdo#108686]) -> [PASS][27]
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-hsw8/igt@gem_tiled_swapping@non-threaded.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-hsw2/igt@gem_tiled_swapping@non-threaded.html

  * igt@kms_cursor_legacy@2x-long-nonblocking-modeset-vs-cursor-atomic:
    - shard-glk:          [FAIL][28] ([fdo#106509] / [fdo#107409]) -> [PASS][29]
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-glk4/igt@kms_cursor_legacy@2x-long-nonblocking-modeset-vs-cursor-atomic.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-glk8/igt@kms_cursor_legacy@2x-long-nonblocking-modeset-vs-cursor-atomic.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-render:
    - shard-iclb:         [FAIL][30] ([fdo#103167]) -> [PASS][31] +3 similar issues
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb2/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-render.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb3/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-render.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-iclb:         [FAIL][32] ([fdo#103166]) -> [PASS][33]
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb6/igt@kms_plane_lowres@pipe-a-tiling-y.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb2/igt@kms_plane_lowres@pipe-a-tiling-y.html

  * igt@kms_psr2_su@page_flip:
    - shard-iclb:         [SKIP][34] ([fdo#109642]) -> [PASS][35]
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb4/igt@kms_psr2_su@page_flip.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb2/igt@kms_psr2_su@page_flip.html

  * igt@kms_psr@psr2_basic:
    - shard-iclb:         [SKIP][36] ([fdo#109441]) -> [PASS][37] +2 similar issues
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb1/igt@kms_psr@psr2_basic.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb2/igt@kms_psr@psr2_basic.html

  * igt@kms_vblank@pipe-c-ts-continuation-suspend:
    - shard-apl:          [DMESG-WARN][38] ([fdo#108566]) -> [PASS][39] +8 similar issues
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-apl7/igt@kms_vblank@pipe-c-ts-continuation-suspend.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-apl2/igt@kms_vblank@pipe-c-ts-continuation-suspend.html

  * igt@perf_pmu@rc6-runtime-pm-long:
    - shard-apl:          [FAIL][40] ([fdo#105010]) -> [PASS][41]
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-apl6/igt@perf_pmu@rc6-runtime-pm-long.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-apl3/igt@perf_pmu@rc6-runtime-pm-long.html
    - shard-kbl:          [FAIL][42] ([fdo#105010]) -> [PASS][43]
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-kbl2/igt@perf_pmu@rc6-runtime-pm-long.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-kbl7/igt@perf_pmu@rc6-runtime-pm-long.html

  
#### Warnings ####

  * igt@gem_mmap_gtt@forked-big-copy-odd:
    - shard-iclb:         [INCOMPLETE][44] ([fdo#107713] / [fdo#109100]) -> [TIMEOUT][45] ([fdo#109673]) +1 similar issue
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6093/shard-iclb3/igt@gem_mmap_gtt@forked-big-copy-odd.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/shard-iclb8/igt@gem_mmap_gtt@forked-big-copy-odd.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#104873]: https://bugs.freedesktop.org/show_bug.cgi?id=104873
  [fdo#105010]: https://bugs.freedesktop.org/show_bug.cgi?id=105010
  [fdo#106509]: https://bugs.freedesktop.org/show_bug.cgi?id=106509
  [fdo#107409]: https://bugs.freedesktop.org/show_bug.cgi?id=107409
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#108341]: https://bugs.freedesktop.org/show_bug.cgi?id=108341
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#108686]: https://bugs.freedesktop.org/show_bug.cgi?id=108686
  [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#109673]: https://bugs.freedesktop.org/show_bug.cgi?id=109673


Participating hosts (10 -> 6)
------------------------------

  Missing    (4): pig-skl-6260u shard-skl pig-hsw-4770r pig-glk-j5005 


Build changes
-------------

  * IGT: IGT_4994 -> IGTPW_2998
  * Piglit: piglit_4509 -> None

  CI_DRM_6093: 3521a84b80042a6ff62b7a29ffb291acbb601d31 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_2998: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/
  IGT_4994: 555019f862c35f1619627761d6da21385be40920 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_2998/
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH i-g-t 02/25] trace.pl: Ignore signaling on non i915 fences
  2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-17 19:20     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:20 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:03)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> gem_wsim uses the sw_fence timeline and confuses the script.

sw_sync

How does this fare with clflush fences (which are .driver="i915") and
all of the future .driver="i915" fences?

Looks like we are still prone to hitting that die. (Should die pretty
quick on !llc)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH i-g-t 04/25] trace.pl: Virtual engine support
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 19:23     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:05)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Add virtual/queue timelines to both stdout and HTML output.
> 
> A new timeline is created for each queue/virtual engine to display
> associated requests in queued and runnable states. Once requests are
> submitted to a real engine for executing they show up on the physical
> engine timeline.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

Maybe an unfair comment, but it looks just as untidy as if I had hacked it
in ;)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Though that's really just a glance and trying to follow the flow.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 05/25] trace.pl: Virtual engine preemption support
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 19:24     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:24 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:06)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Use the 'completed?' tracepoint field to detect more robustly when a
> request has been preempted and remove it from the engine database if so.
> 
> Otherwise the script can hit a scenario where the same global seqno will
> be mentioned multiple times (on an engine seqno) which aborts processing.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Ok. In the future, we will end up with requests still in the db, but
this does what you say on the tin.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 12/25] gem_wsim: Engine map support
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 19:35     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:35 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:13)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Support new i915 uAPI for configuring contexts with engine maps.
> 
> Please refer to the README file for more detailed explanation.
> 
> v2:
>  * Allow defining engine maps by class.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c  | 211 +++++++++++++++++++++++++++++++++++------
>  benchmarks/wsim/README |  25 ++++-
>  2 files changed, 204 insertions(+), 32 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index 60b7d32e22d4..e5b12e37490e 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -57,6 +57,7 @@
>  #include "ewma.h"
>  
>  enum intel_engine_id {
> +       DEFAULT,
>         RCS,
>         BCS,
>         VCS,
> @@ -81,7 +82,8 @@ enum w_type
>         SW_FENCE,
>         SW_FENCE_SIGNAL,
>         CTX_PRIORITY,
> -       PREEMPTION
> +       PREEMPTION,
> +       ENGINE_MAP
>  };
>  
>  struct deps
> @@ -115,6 +117,10 @@ struct w_step
>                 int throttle;
>                 int fence_signal;
>                 int priority;
> +               struct {
> +                       unsigned int engine_map_count;
> +                       enum intel_engine_id *engine_map;
> +               };
>         };
>  
>         /* Implementation details */
> @@ -142,6 +148,8 @@ DECLARE_EWMA(uint64_t, rt, 4, 2)
>  struct ctx {
>         uint32_t id;
>         int priority;
> +       unsigned int engine_map_count;
> +       enum intel_engine_id *engine_map;
>         bool targets_instance;
>         bool wants_balance;
>         unsigned int static_vcs;
> @@ -200,10 +208,10 @@ struct workload
>                 int fd;
>                 bool first;
>                 unsigned int num_engines;
> -               unsigned int engine_map[5];
> +               unsigned int engine_map[NUM_ENGINES];
>                 uint64_t t_prev;
> -               uint64_t prev[5];
> -               double busy[5];
> +               uint64_t prev[NUM_ENGINES];
> +               double busy[NUM_ENGINES];
>         } busy_balancer;
>  };
>  
> @@ -234,6 +242,7 @@ static int fd;
>  #define REG(x) (volatile uint32_t *)((volatile char *)igt_global_mmio + x)
>  
>  static const char *ring_str_map[NUM_ENGINES] = {
> +       [DEFAULT] = "DEFAULT",
>         [RCS] = "RCS",
>         [BCS] = "BCS",
>         [VCS] = "VCS",
> @@ -330,6 +339,43 @@ static int str_to_engine(const char *str)
>         return -1;
>  }
>  
> +static int parse_engine_map(struct w_step *step, const char *_str)
> +{
> +       char *token, *tctx = NULL, *tstart = (char *)_str;
> +
> +       while ((token = strtok_r(tstart, "|", &tctx))) {
> +               enum intel_engine_id engine;
> +               unsigned int add;
> +
> +               tstart = NULL;
> +
> +               if (!strcmp(token, "DEFAULT"))
> +                       return -1;
> +
> +               engine = str_to_engine(token);
> +               if ((int)engine < 0)
> +                       return -1;
> +
> +               if (engine != VCS && engine != VCS1 && engine != VCS2)
> +                       return -1; /* TODO */

Still a little concerned that the map is VCS only. It just doesn't fit
my expectations of what the map will be.

> +
> +               add = engine == VCS ? 2 : 1;

Will we not ever ask what happens if we had millions of engines at our
disposal? But that's a tomorrow problem, ok.

> +               step->engine_map_count += add;
> +               step->engine_map = realloc(step->engine_map,
> +                                          step->engine_map_count *
> +                                          sizeof(step->engine_map[0]));
> +
> +               if (engine != VCS) {
> +                       step->engine_map[step->engine_map_count - 1] = engine;
> +               } else {
> +                       step->engine_map[step->engine_map_count - 2] = VCS1;
> +                       step->engine_map[step->engine_map_count - 1] = VCS2;
> +               }
> +       }
> +
> +       return 0;
> +}
> +
>  static struct workload *
>  parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>  {
> @@ -448,6 +494,33 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>                         } else if (!strcmp(field, "f")) {
>                                 step.type = SW_FENCE;
>                                 goto add_step;
> +                       } else if (!strcmp(field, "M")) {
> +                               unsigned int nr = 0;
> +                               while ((field = strtok_r(fstart, ".", &fctx)) !=
> +                                   NULL) {
> +                                       tmp = atoi(field);
> +                                       check_arg(nr == 0 && tmp <= 0,
> +                                                 "Invalid context at step %u!\n",
> +                                                 nr_steps);
> +                                       check_arg(nr > 1,
> +                                                 "Invalid engine map format at step %u!\n",
> +                                                 nr_steps);
> +
> +                                       if (nr == 0) {
> +                                               step.context = tmp;
> +                                       } else {
> +                                               tmp = parse_engine_map(&step,
> +                                                                      field);
> +                                               check_arg(tmp < 0,
> +                                                         "Invalid engine map list at step %u!\n",
> +                                                         nr_steps);
> +                                       }
> +
> +                                       nr++;
> +                               }
> +
> +                               step.type = ENGINE_MAP;
> +                               goto add_step;
>                         } else if (!strcmp(field, "X")) {
>                                 unsigned int nr = 0;
>                                 while ((field = strtok_r(fstart, ".", &fctx)) !=
> @@ -774,6 +847,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>  }
>  
>  static const unsigned int eb_engine_map[NUM_ENGINES] = {
> +       [DEFAULT] = I915_EXEC_DEFAULT,
>         [RCS] = I915_EXEC_RENDER,
>         [BCS] = I915_EXEC_BLT,
>         [VCS] = I915_EXEC_BSD,
> @@ -796,11 +870,36 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
>                 eb->flags = eb_engine_map[engine];
>  }
>  
> +static unsigned int
> +find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
> +{
> +       unsigned int i;
> +
> +       for (i = 0; i < ctx->engine_map_count; i++) {
> +               if (ctx->engine_map[i] == engine)
> +                       return i + 1;
> +       }
> +
> +       igt_assert(0);
> +       return 0;

No balancer in the map at this point?

> +}
> +
> +static struct ctx *
> +__get_ctx(struct workload *wrk, struct w_step *w)
> +{
> +       return &wrk->ctx_list[w->context * 2];
> +}
> +
>  static void
> -eb_update_flags(struct w_step *w, enum intel_engine_id engine,
> -               unsigned int flags)
> +eb_update_flags(struct workload *wrk, struct w_step *w,
> +               enum intel_engine_id engine, unsigned int flags)
>  {
> -       eb_set_engine(&w->eb, engine, flags);
> +       struct ctx *ctx = __get_ctx(wrk, w);
> +
> +       if (ctx->engine_map)
> +               w->eb.flags = find_engine_in_map(ctx, engine);
> +       else
> +               eb_set_engine(&w->eb, engine, flags);
>  
>         w->eb.flags |= I915_EXEC_HANDLE_LUT;
>         w->eb.flags |= I915_EXEC_NO_RELOC;
> @@ -819,12 +918,6 @@ get_status_objects(struct workload *wrk)
>                 return wrk->status_object;
>  }
>  
> -static struct ctx *
> -__get_ctx(struct workload *wrk, struct w_step *w)
> -{
> -       return &wrk->ctx_list[w->context * 2];
> -}
> -
>  static uint32_t
>  get_ctxid(struct workload *wrk, struct w_step *w)
>  {
> @@ -894,7 +987,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
>                 engine = VCS2;
>         else if (flags & SWAPVCS && engine == VCS2)
>                 engine = VCS1;
> -       eb_update_flags(w, engine, flags);
> +       eb_update_flags(wrk, w, engine, flags);
>  #ifdef DEBUG
>         printf("%u: %u:|", w->idx, w->eb.buffer_count);
>         for (i = 0; i <= j; i++)
> @@ -936,7 +1029,7 @@ static void vm_destroy(int i915, uint32_t vm_id)
>         igt_assert_eq(__vm_destroy(i915, vm_id), 0);
>  }
>  
> -static void
> +static int
>  prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>  {
>         unsigned int ctx_vcs;
> @@ -999,30 +1092,53 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>         /*
>          * Identify if contexts target specific engine instances and if they
>          * want to be balanced.
> +        *
> +        * Transfer over engine map configuration from the workload step.
>          */
>         for (j = 0; j < wrk->nr_ctxs; j += 2) {
>                 bool targets = false;
>                 bool balance = false;
>  
>                 for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
> -                       if (w->type != BATCH)
> -                               continue;
> -
>                         if (w->context != (j / 2))
>                                 continue;
>  
> -                       if (w->engine == VCS)
> -                               balance = true;
> -                       else
> -                               targets = true;
> +                       if (w->type == BATCH) {
> +                               if (w->engine == VCS)
> +                                       balance = true;
> +                               else
> +                                       targets = true;
> +                       } else if (w->type == ENGINE_MAP) {
> +                               wrk->ctx_list[j].engine_map = w->engine_map;
> +                               wrk->ctx_list[j].engine_map_count =
> +                                       w->engine_map_count;
> +                       }
>                 }
>  
> -               if (flags & I915) {
> -                       wrk->ctx_list[j].targets_instance = targets;
> +               wrk->ctx_list[j].targets_instance = targets;
> +               if (flags & I915)
>                         wrk->ctx_list[j].wants_balance = balance;
> +       }
> +
> +       /*
> +        * Ensure VCS is not allowed with engine map contexts.
> +        */
> +       for (j = 0; j < wrk->nr_ctxs; j += 2) {
> +               for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
> +                       if (w->context != (j / 2))
> +                               continue;
> +
> +                       if (w->type != BATCH)
> +                               continue;
> +
> +                       if (wrk->ctx_list[j].engine_map && w->engine == VCS) {

But wouldn't VCS still mean "use the balancer" rather than a specific
engine?

I'm not understanding how you are using maps in the .wsim :(
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 109+ messages in thread
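
To make the map semantics questioned above concrete, here is a minimal
standalone sketch of what the quoted parse_engine_map() and
find_engine_in_map() appear to do. It is an illustration under assumptions,
not the gem_wsim code: enum engine_id and its value order, struct engine_map,
token_to_engine(), parse_map() and map_index() are hypothetical names,
tokenisation uses strcspn() instead of strtok_r() so the input string is not
modified, empty tokens are rejected, and a lookup miss returns 0 where the
quoted code asserts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gem_wsim's enum intel_engine_id (order assumed). */
enum engine_id {
	ENG_DEFAULT, ENG_RCS, ENG_BCS, ENG_VCS, ENG_VCS1, ENG_VCS2, ENG_VECS
};

/* Hypothetical container for a context's engine map. */
struct engine_map {
	unsigned int count;
	enum engine_id *engines;
};

/* Map a token of the given length to an engine id, or -1 if unknown. */
static int token_to_engine(const char *tok, size_t len)
{
	static const char *const names[] = {
		"DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS"
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strlen(names[i]) == len && !memcmp(tok, names[i], len))
			return (int)i;
	}

	return -1;
}

/*
 * Parse a '|' separated list such as "VCS1|VCS2" and append it to the map.
 * As in the quoted patch, only video engines are accepted and the class
 * name "VCS" expands to both VCS1 and VCS2.
 * Returns 0 on success, -1 on an invalid token or allocation failure.
 */
static int parse_map(struct engine_map *map, const char *str)
{
	assert(map && str);

	while (*str) {
		size_t len = strcspn(str, "|");
		int engine = token_to_engine(str, len);
		unsigned int add;
		enum engine_id *grown;

		if (engine != ENG_VCS && engine != ENG_VCS1 &&
		    engine != ENG_VCS2)
			return -1;

		add = engine == ENG_VCS ? 2 : 1;
		grown = realloc(map->engines,
				(map->count + add) * sizeof(*grown));
		if (!grown)
			return -1;
		map->engines = grown;

		if (engine == ENG_VCS) {
			map->engines[map->count] = ENG_VCS1;
			map->engines[map->count + 1] = ENG_VCS2;
		} else {
			map->engines[map->count] = (enum engine_id)engine;
		}
		map->count += add;

		str += len;
		if (*str == '|')
			str++;
	}

	return 0;
}

/* Return the 1-based map index of an engine, or 0 if it is not in the map. */
static unsigned int map_index(const struct engine_map *map,
			      enum engine_id engine)
{
	unsigned int i;

	for (i = 0; i < map->count; i++) {
		if (map->engines[i] == engine)
			return i + 1;
	}

	return 0;
}
```

With the list "VCS|VCS1" the sketch builds the map { VCS1, VCS2, VCS1 }, and
map_index() for VCS2 yields 2, mirroring the i + 1 value that the quoted
eb_update_flags() stores in eb.flags.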

* Re: [igt-dev] [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 19:36     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:36 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:16)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A new workload command for enabling a load balanced context map (aka
> Virtual Engine). Example usage:
> 
>   B.1
> 
> This turns on load balancing for context one, assuming it has already been
> configured with an engine map. Only the DEFAULT engine specifier can be used
> with load balanced engine maps.

Why?
-Chris

^ permalink raw reply	[flat|nested] 109+ messages in thread
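
The rule stated in the quoted commit message can be sketched as a small
validation pass. This is an illustration under assumptions, not the gem_wsim
implementation: enum engine_id, enum step_type, struct step, MAX_CTX and
first_invalid_step() are hypothetical names, and the sketch assumes that "B"
must follow "M" for the same context and that every batch on a balanced
context must name DEFAULT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gem_wsim engine and step types. */
enum engine_id {
	ENG_DEFAULT, ENG_RCS, ENG_BCS, ENG_VCS, ENG_VCS1, ENG_VCS2, ENG_VECS
};

enum step_type { STEP_BATCH, STEP_ENGINE_MAP, STEP_LOAD_BALANCE };

struct step {
	enum step_type type;
	unsigned int context;	/* 1-based, as in the .wsim format */
	enum engine_id engine;	/* only meaningful for STEP_BATCH */
};

#define MAX_CTX 8	/* arbitrary limit chosen for the sketch */

/*
 * Return 0 if the steps obey the rule, otherwise the 1-based position of
 * the first offending step.
 */
static size_t first_invalid_step(const struct step *steps, size_t count)
{
	bool has_map[MAX_CTX + 1] = { false };
	bool balanced[MAX_CTX + 1] = { false };
	size_t i;

	assert(steps || count == 0);

	/* Pass 1: record engine maps and reject "B" without a prior "M". */
	for (i = 0; i < count; i++) {
		const struct step *s = &steps[i];

		if (s->context == 0 || s->context > MAX_CTX)
			return i + 1;

		if (s->type == STEP_ENGINE_MAP) {
			has_map[s->context] = true;
		} else if (s->type == STEP_LOAD_BALANCE) {
			if (!has_map[s->context])
				return i + 1;
			balanced[s->context] = true;
		}
	}

	/* Pass 2: batches on balanced contexts may only name DEFAULT. */
	for (i = 0; i < count; i++) {
		const struct step *s = &steps[i];

		if (s->type == STEP_BATCH && balanced[s->context] &&
		    s->engine != ENG_DEFAULT)
			return i + 1;
	}

	return 0;
}
```

Under these assumptions a sequence equivalent to an "M" step for context 1,
then "B.1", then a DEFAULT batch on context 1 passes, while the same sequence
with a VCS1 batch is reported at its third step.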

* Re: [igt-dev] [PATCH i-g-t 16/25] gem_wsim: Engine bond command
  2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-17 19:41     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:41 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:17)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Engine bonds are an i915 uAPI applicable to load balanced contexts with an
> engine map. They allow expressing rules of engine selection between two
> contexts when submissions are also tied together with submit fences.
> 
> Please refer to the README for a more detailed description.
> 
> v2:
>  * Use list of symbolic engine names instead of the mask. (Chris)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  benchmarks/gem_wsim.c  | 159 +++++++++++++++++++++++++++++++++++++++--
>  benchmarks/wsim/README |  50 +++++++++++++
>  2 files changed, 202 insertions(+), 7 deletions(-)
> 
> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
> index f7f84d05010a..bd9201c2928b 100644
> --- a/benchmarks/gem_wsim.c
> +++ b/benchmarks/gem_wsim.c
> @@ -85,6 +85,7 @@ enum w_type
>         PREEMPTION,
>         ENGINE_MAP,
>         LOAD_BALANCE,
> +       BOND,
>  };
>  
>  struct deps
> @@ -100,6 +101,11 @@ struct w_arg {
>         int prio;
>  };
>  
> +struct bond {
> +       uint64_t mask;
> +       enum intel_engine_id master;
> +};
> +
>  struct w_step
>  {
>         /* Workload step metadata */
> @@ -123,6 +129,10 @@ struct w_step
>                         enum intel_engine_id *engine_map;
>                 };
>                 bool load_balance;
> +               struct {
> +                       uint64_t bond_mask;
> +                       enum intel_engine_id bond_master;
> +               };
>         };
>  
>         /* Implementation details */
> @@ -152,6 +162,8 @@ struct ctx {
>         int priority;
>         unsigned int engine_map_count;
>         enum intel_engine_id *engine_map;
> +       unsigned int bond_count;
> +       struct bond *bonds;
>         bool targets_instance;
>         bool wants_balance;
>         unsigned int static_vcs;
> @@ -378,6 +390,26 @@ static int parse_engine_map(struct w_step *step, const char *_str)
>         return 0;
>  }
>  
> +static uint64_t engine_list_mask(const char *_str)
> +{
> +       uint64_t mask = 0;
> +
> +       char *token, *tctx = NULL, *tstart = (char *)_str;
> +
> +       while ((token = strtok_r(tstart, "|", &tctx))) {
> +               enum intel_engine_id engine = str_to_engine(token);
> +
> +               if ((int)engine < 0 || engine == DEFAULT || engine == VCS)
> +                       return 0;
> +
> +               mask |= 1 << engine;
> +
> +               tstart = NULL;
> +       }
> +
> +       return mask;
> +}
> +
>  #define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
>         if ((field = strtok_r(fstart, ".", &fctx))) { \
>                 tmp = atoi(field); \
> @@ -528,6 +560,39 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>  
>                                 step.type = LOAD_BALANCE;
>                                 goto add_step;
> +                       } else if (!strcmp(field, "b")) {
> +                               unsigned int nr = 0;
> +                               while ((field = strtok_r(fstart, ".", &fctx))) {
> +                                       check_arg(nr > 2,
> +                                                 "Invalid bond format at step %u!\n",
> +                                                 nr_steps);
> +
> +                                       if (nr == 0) {
> +                                               tmp = atoi(field);
> +                                               step.context = tmp;
> +                                               check_arg(tmp <= 0,
> +                                                         "Invalid context at step %u!\n",
> +                                                         nr_steps);
> +                                       } else if (nr == 1) {
> +                                               step.bond_mask = engine_list_mask(field);
> +                                               check_arg(step.bond_mask == 0,
> +                                                       "Invalid siblings list at step %u!\n",
> +                                                       nr_steps);
> +                                       } else if (nr == 2) {
> +                                               tmp = str_to_engine(field);
> +                                               check_arg(tmp <= 0 ||
> +                                                         tmp == VCS ||
> +                                                         tmp == DEFAULT,
> +                                                         "Invalid master engine at step %u!\n",
> +                                                         nr_steps);
> +                                               step.bond_master = tmp;
> +                                       }
> +
> +                                       nr++;
> +                               }
> +
> +                               step.type = BOND;
> +                               goto add_step;
>                         }
>  
>                         if (!field) {
> @@ -1011,6 +1076,31 @@ static void vm_destroy(int i915, uint32_t vm_id)
>         igt_assert_eq(__vm_destroy(i915, vm_id), 0);
>  }
>  
> +static unsigned int
> +find_engine(struct i915_engine_class_instance *ci, unsigned int count,
> +           enum intel_engine_id engine)
> +{
> +       static struct i915_engine_class_instance map[] = {
> +               [RCS] = { I915_ENGINE_CLASS_RENDER, 0 },
> +               [BCS] = { I915_ENGINE_CLASS_COPY, 0 },
> +               [VCS1] = { I915_ENGINE_CLASS_VIDEO, 0 },
> +               [VCS2] = { I915_ENGINE_CLASS_VIDEO, 1 },
> +               [VECS] = { I915_ENGINE_CLASS_VIDEO_ENHANCE, 0 },
> +       };
> +       unsigned int i;
> +
> +       igt_assert(engine < ARRAY_SIZE(map));
> +       igt_assert(engine == RCS || map[engine].engine_class);
> +
> +       for (i = 0; i < count; i++, ci++) {
> +               if (!memcmp(&map[engine], ci, sizeof(*ci)))
> +                       return i;
> +       }
> +
> +       igt_assert(0);
> +       return 0;
> +}
> +
>  static int
>  prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>  {
> @@ -1078,6 +1168,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>          * Transfer over engine map configuration from the workload step.
>          */
>         for (j = 0; j < wrk->nr_ctxs; j += 2) {
> +               struct ctx *ctx = &wrk->ctx_list[j];
> +
>                 bool targets = false;
>                 bool balance = false;
>  
> @@ -1091,16 +1183,28 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>                                 else
>                                         targets = true;
>                         } else if (w->type == ENGINE_MAP) {
> -                               wrk->ctx_list[j].engine_map = w->engine_map;
> -                               wrk->ctx_list[j].engine_map_count =
> -                                       w->engine_map_count;
> +                               ctx->engine_map = w->engine_map;
> +                               ctx->engine_map_count = w->engine_map_count;
>                         } else if (w->type == LOAD_BALANCE) {
> -                               if (!wrk->ctx_list[j].engine_map) {
> +                               if (!ctx->engine_map) {
>                                         wsim_err("Load balancing needs an engine map!\n");
>                                         return 1;
>                                 }
> -                               wrk->ctx_list[j].wants_balance =
> -                                       w->load_balance;
> +                               ctx->wants_balance = w->load_balance;
> +                       } else if (w->type == BOND) {
> +                               if (!ctx->wants_balance) {
> +                                       wsim_err("Engine bonds need load balancing engine map!\n");
> +                                       return 1;
> +                               }
> +                               ctx->bond_count++;
> +                               ctx->bonds = realloc(ctx->bonds,
> +                                                    ctx->bond_count *
> +                                                    sizeof(struct bond));
> +                               igt_assert(ctx->bonds);
> +                               ctx->bonds[ctx->bond_count - 1].mask =
> +                                       w->bond_mask;
> +                               ctx->bonds[ctx->bond_count - 1].master =
> +                                       w->bond_master;
>                         }
>                 }
>  
> @@ -1281,6 +1385,46 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>                                         ctx->engine_map[j - 1] - VCS1; /* FIXME */
>                         }
>  
> +                       for (j = 0; j < ctx->bond_count; j++) {
> +                               unsigned long mask = ctx->bonds[j].mask;
> +                               I915_DEFINE_CONTEXT_ENGINES_BOND(bond,
> +                                                                __builtin_popcount(mask));
> +                               struct i915_context_engines_bond *p = NULL, *prev;
> +                               unsigned int b, e;
> +
> +                               prev = p;
> +                               p = alloca(sizeof(bond));
> +                               assert(p);
> +                               memset(p, 0, sizeof(bond));
> +
> +                               if (j == 0)
> +                                       load_balance.base.next_extension =
> +                                               to_user_pointer(p);
> +                               else if (j < (ctx->bond_count - 1))
> +                                       prev->base.next_extension =
> +                                               to_user_pointer(p);
> +
> +                               p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
> +                               p->virtual_index = 0;
> +                               p->master.engine_class =
> +                                       I915_ENGINE_CLASS_VIDEO;
> +                               p->master.engine_instance =
> +                                       ctx->bonds[j].master - VCS1;
> +
> +                               for (b = 0, e = 0; mask; e++, mask >>= 1) {
> +                                       unsigned int idx;
> +
> +                                       if (!(mask & 1))
> +                                               continue;
> +
> +                                       idx = find_engine(&set_engines.engines[1],
> +                                                         ctx->engine_map_count,
> +                                                         e);
> +                                       p->engines[b++] =
> +                                               set_engines.engines[1 + idx];
> +                               }
> +                       }

Ok, I was a little nervous of the transport through mask, but it checks
out.
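
For readers following the mask round trip, here is a minimal sketch, under
assumptions, of packing a '|' separated engine list into a 64-bit mask and
unpacking it again. The enum, the name table and the helper names are
hypothetical stand-ins, not the gem_wsim ones. The sketch shifts a 64-bit one
and uses __builtin_popcountll() (a GCC/Clang builtin) so that the widths match
the uint64_t mask; that is a choice of this example, whereas the quoted hunk
calls __builtin_popcount() on an unsigned long.

```c
#define _POSIX_C_SOURCE 200809L	/* for strtok_r() */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical engine ids for illustration; not the gem_wsim enum. */
enum engine_id {
	ENG_DEFAULT, ENG_RCS, ENG_BCS, ENG_VCS, ENG_VCS1, ENG_VCS2, ENG_VECS,
	ENG_COUNT
};

static const char *engine_names[ENG_COUNT] = {
	"DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS"
};

static int name_to_engine(const char *name)
{
	for (int i = 0; i < ENG_COUNT; i++)
		if (!strcmp(name, engine_names[i]))
			return i;
	return -1;
}

/* Parse e.g. "VCS1|VCS2" into a bit mask; returns 0 on any invalid token. */
static uint64_t engine_list_to_mask(const char *list)
{
	char buf[64], *token, *save = NULL, *start = buf;
	uint64_t mask = 0;

	if (strlen(list) >= sizeof(buf))
		return 0;
	strcpy(buf, list);	/* strtok_r() modifies its input */

	while ((token = strtok_r(start, "|", &save))) {
		int id = name_to_engine(token);

		/* Only physical engines may be listed. */
		if (id < 0 || id == ENG_DEFAULT || id == ENG_VCS)
			return 0;

		mask |= UINT64_C(1) << id;	/* 64-bit shift, not int */
		start = NULL;
	}

	return mask;
}

/* Unpack the mask in ascending id order; returns the number of ids written. */
static unsigned int mask_to_engines(uint64_t mask, int *out, unsigned int max)
{
	unsigned int n = 0;

	assert(max >= (unsigned int)__builtin_popcountll(mask));

	for (int id = 0; mask; id++, mask >>= 1)
		if (mask & 1)
			out[n++] = id;

	return n;
}
```

One property of this transport is that the mask does not preserve list order:
"VCS2|VCS1" and "VCS1|VCS2" produce the same mask and unpack in ascending id
order.
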
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
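
A note on the extension chaining in the last quoted hunk: as posted, p and
prev are declared inside the loop body, so p is NULL again at the top of every
iteration and prev never points at the previous bond. Below is a minimal
sketch of one way to chain a variable number of extensions by keeping the tail
pointer outside the loop. The struct layouts and names are hypothetical
simplifications (fixed-size records in one calloc'ed array instead of the
alloca'ed, variably sized i915 bond structs), so treat it as an illustration
of the linking only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a uAPI extension header. */
struct ext_base {
	uint64_t next_extension;	/* user pointer to next entry, 0 ends chain */
	uint32_t name;
};

/* Hypothetical, simplified bond record. */
struct bond_ext {
	struct ext_base base;
	uint32_t master_instance;
};

#define EXT_BOND 1u

static uint64_t to_u64_ptr(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/*
 * Append count bond extensions after head. The tail pointer (prev) lives
 * outside the loop, so each new entry is linked from the one before it.
 * Returns the array (caller frees), or NULL if count is 0 or on allocation
 * failure.
 */
static struct bond_ext *
chain_bonds(struct ext_base *head, const uint32_t *masters, unsigned int count)
{
	struct ext_base *prev = head;
	struct bond_ext *bonds;

	assert(head);

	if (!count)
		return NULL;

	bonds = calloc(count, sizeof(*bonds));
	if (!bonds)
		return NULL;

	for (unsigned int j = 0; j < count; j++) {
		bonds[j].base.name = EXT_BOND;
		bonds[j].master_instance = masters[j];

		prev->next_extension = to_u64_ptr(&bonds[j]);
		prev = &bonds[j].base;
	}

	return bonds;
}

/* Walk the chain and count the entries hanging off head. */
static unsigned int chain_length(const struct ext_base *head)
{
	unsigned int n = 0;

	for (uint64_t p = head->next_extension; p;
	     p = ((const struct ext_base *)(uintptr_t)p)->next_extension)
		n++;

	return n;
}
```

The last entry keeps next_extension at zero from calloc(), which terminates
the chain without a special case for the first or final bond.
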
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [igt-dev] [PATCH i-g-t 19/25] gem_wsim: Command line switch for specifying low slice count workloads
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 19:43     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:43 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:20)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A new command line switch ('-s') is added which toggles the low slice
> count mode for workloads following on the command line.
> 
> This enables easy benchmarking of the effect of running the existing media
> workloads in parallel against another client. For example:
> 
>   ./gem_wsim -n ... -v -r 600 -W master.wsim -s -w media_nn480.wsim
> 
> Adding or removing the '-s' switch before the second workload enables
> analyzing the cost that dynamic SSEU switching imposes on the first
> (master) workload.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Looks simple enough
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
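
To illustrate the "toggles the mode for workloads following on the command
line" semantics, here is a minimal sketch using getopt(). It handles only
'-s', '-w' and '-W' from the example invocation; the struct and function names
are hypothetical, and the other options shown in the example are left out.

```c
#define _POSIX_C_SOURCE 200809L	/* for getopt() */
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical record of one workload named on the command line. */
struct workload_arg {
	const char *file;
	bool master;		/* given with -W */
	bool low_slices;	/* -s was active when this workload was named */
};

/*
 * Collect workloads in command line order. Each '-s' flips the low slice
 * count mode, which then applies to every workload named after it.
 * Returns the number of workloads, or -1 on error.
 */
static int parse_workloads(int argc, char **argv,
			   struct workload_arg *out, int max)
{
	bool low_slices = false;
	int c, n = 0;

	optind = 1;	/* restart option scanning */

	while ((c = getopt(argc, argv, "sw:W:")) != -1) {
		switch (c) {
		case 's':
			low_slices = !low_slices;
			break;
		case 'w':
		case 'W':
			if (n == max)
				return -1;
			out[n].file = optarg;
			out[n].master = (c == 'W');
			out[n].low_slices = low_slices;
			n++;
			break;
		default:
			return -1;
		}
	}

	return n;
}
```

With the arguments "-W master.wsim -s -w media_nn480.wsim", the master
workload is recorded with low_slices unset and the second workload with it
set. Dropping the '-s' leaves both unset, which is the comparison the commit
message describes.
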

* Re: [igt-dev] [PATCH i-g-t 20/25] gem_wsim: Per context SSEU control
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 19:44     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:44 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:21)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A new workload command ('S') is added which allows per context slice
> (re-)configuration.
> 
> v2:
>  * Only query device SSEU on first use. (Chris)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Fair enough,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
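
The v2 note ("only query device SSEU on first use") amounts to lazy
initialisation. A minimal sketch of that pattern follows; the struct, the
query function and its return value are hypothetical placeholders for the real
device query, and the sketch is not thread safe.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the device SSEU configuration. */
struct sseu_info {
	uint64_t slice_mask;
};

static unsigned int query_calls;	/* counts how often we hit the device */

/* Hypothetical stand-in for the real (and comparatively costly) query. */
static struct sseu_info query_device_sseu(void)
{
	query_calls++;
	return (struct sseu_info){ .slice_mask = 0x7 };
}

/* Return the cached device SSEU, querying it only on the first call. */
static const struct sseu_info *device_sseu(void)
{
	static struct sseu_info cached;
	static bool valid;

	if (!valid) {
		cached = query_device_sseu();
		valid = true;
	}

	return &cached;
}
```

Workloads that never use the 'S' command then never trigger the query at all.
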

* Re: [igt-dev] [PATCH i-g-t 21/25] gem_wsim: Allow RCS virtual engine with SSEU control
  2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-17 19:45     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:45 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:22)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> To allow exercising the SSEU configuration in combination with Virtual
> Engine, allow RCS to be specified in the engine map and use appropriate
> index based addressing when applying SSEU configuration to it.

Heh, I wouldn't have even bothered filtering and would have let the kernel
complain if it was invalid :)
 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [PATCH i-g-t 24/25] gem_wsim: Discover engines
  2019-05-17 11:51       ` Tvrtko Ursulin
@ 2019-05-17 19:50         ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:50 UTC (permalink / raw)
  To: Andi Shyti, Tvrtko Ursulin; +Cc: igt-dev, Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:51:06)
> 
> On 17/05/2019 12:39, Andi Shyti wrote:
> > Hi Tvrtko,
> > 
> >> +static int
> >> +__i915_query(int i915, struct drm_i915_query *q)
> >> +{
> >> +    if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
> >> +            return -errno;
> >> +    return 0;
> >> +}
> >> +
> >> +static int
> >> +__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> >> +{
> >> +    struct drm_i915_query q = {
> >> +            .num_items = n_items,
> >> +            .items_ptr = to_user_pointer(items),
> >> +    };
> >> +    return __i915_query(i915, &q);
> >> +}
> >> +
> >> +static void
> >> +i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
> >> +{
> >> +    igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
> >> +}
> >> +
> >> +static bool has_query(int i915)
> >> +{
> >> +    struct drm_i915_query query = {};
> >> +
> >> +    return __i915_query(i915, &query) == 0;
> >> +}
> >> +
> >> +static bool has_engine_query(int i915)
> >> +{
> >> +    struct drm_i915_query_item item = {
> >> +            .query_id = DRM_I915_QUERY_ENGINE_INFO,
> >> +    };
> >> +
> >> +    return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
> >> +}
> >> +
> >> +static void query_engines(void)
> >> +{
> > 
> > [...]
> > 
> >> +            struct drm_i915_query_engine_info *engine_info;
> >> +            struct drm_i915_query_item item = {
> >> +                    .query_id = DRM_I915_QUERY_ENGINE_INFO,
> >> +            };
> >> +            const unsigned int sz = 4096;
> >> +            unsigned int i;
> >> +
> >> +            engine_info = malloc(sz);
> >> +            igt_assert(engine_info);
> >> +            memset(engine_info, 0, sz);
> >> +
> >> +            item.data_ptr = to_user_pointer(engine_info);
> >> +            item.length = sz;
> >> +
> >> +            i915_query_items(fd, &item, 1);
> >> +            igt_assert(item.length > 0);
> >> +            igt_assert(item.length <= sz);
> >> +
> >> +            num = engine_info->num_engines;
> >> +
> >> +            engines = calloc(num,
> >> +                             sizeof(struct i915_engine_class_instance));
> >> +            igt_assert(engines);
> >> +
> >> +            for (i = 0; i < num; i++) {
> >> +                    struct drm_i915_engine_info *engine =
> >> +                            (struct drm_i915_engine_info *)&engine_info->engines[i];
> >> +
> >> +                    engines[i] = engine->engine;
> >> +            }
> >> +    }
> >> +
> >> +    __engines = engines;
> >> +    __num_engines = num;
> >> +}
> > 
> > Would it make sense to make a library out of all the above? E.g.
> > gem_engine_topology does a similar thing (all static functions like
> > here, though).
> 
> Definitely yes, but coordinating all the series seems tricky. I think it 
> would be best to consolidate once everything gets merged?

The challenge is carving out the core into a separate library that
doesn't pull libigt.la in. (Tvrtko has already committed the cardinal sin
of using libigt outside of tests/.) At which point, you have just a
bunch of ioctl wrappers, and fwiw some of us may wish gem_wsim itself
was a scripting engine...
-Chris
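
On the quoted query_engines(): it passes a fixed 4096 byte buffer. An
alternative is to probe for the size first and then allocate exactly that
much. The sketch below models this with a fake query function instead of the
real ioctl, so it stays self-contained; fake_query(), the structs and the
engine table are hypothetical, and the length convention (zero in means
"report the needed size", negative out means error) is an assumption of this
model.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct engine_ci {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Hypothetical, reduced query item: in/out length plus a data pointer. */
struct query_item {
	int32_t length;
	void *data;
};

/* Fake hardware: class numbers here are illustrative only. */
static const struct engine_ci fake_hw[] = {
	{ 0, 0 }, { 1, 0 }, { 2, 0 }, { 2, 1 }, { 3, 0 },
};

/* Model of the query: length 0 probes the size, otherwise copy or fail. */
static void fake_query(struct query_item *item)
{
	const int32_t need = (int32_t)sizeof(fake_hw);

	if (item->length == 0) {
		item->length = need;
	} else if (item->length < need) {
		item->length = -1;	/* buffer too small */
	} else {
		memcpy(item->data, fake_hw, sizeof(fake_hw));
		item->length = need;
	}
}

/* Two-pass discovery: probe the size, allocate, then fetch the data. */
static struct engine_ci *discover_engines(unsigned int *count)
{
	struct query_item item = { 0, NULL };
	struct engine_ci *engines;

	fake_query(&item);		/* pass 1: how much space is needed? */
	if (item.length <= 0)
		return NULL;

	engines = calloc(1, (size_t)item.length);
	if (!engines)
		return NULL;

	item.data = engines;
	fake_query(&item);		/* pass 2: fill the buffer */
	if (item.length <= 0) {
		free(engines);
		return NULL;
	}

	*count = (unsigned int)item.length / sizeof(*engines);
	return engines;			/* caller frees */
}
```

Whether the extra round trip is worth it for a benchmark tool is a judgment
call; the fixed buffer in the patch is simpler and asserts if it proves too
small.
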

* Re: [PATCH i-g-t 25/25] gem_wsim: Support Icelake parts
  2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-17 19:51     ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-17 19:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-17 12:25:26)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> On Icelake the second vcs engine is vcs2 instead of vcs1, so add some logical
> to physical instance remapping based on engine discovery to support it.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

It does the trick for now, but still feels very limited.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
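
As an illustration of the logical to physical remapping described in the
commit message, here is a minimal sketch that picks the n-th discovered engine
of a class and returns its physical instance. The struct, the class constant
and the function name are hypothetical, and the sketch assumes the discovered
list is ordered by instance within each class.

```c
#include <stdint.h>

struct engine_ci {
	uint16_t engine_class;
	uint16_t engine_instance;
};

#define CLASS_VIDEO 2	/* illustrative class number */

/*
 * Map a logical index (0 = first engine of the class, 1 = second, ...)
 * to the physical instance found during discovery. Returns -1 if the
 * class has fewer engines than requested.
 */
static int logical_to_physical(const struct engine_ci *engines,
			       unsigned int count,
			       uint16_t engine_class,
			       unsigned int logical)
{
	for (unsigned int i = 0; i < count; i++) {
		if (engines[i].engine_class != engine_class)
			continue;

		if (logical-- == 0)
			return engines[i].engine_instance;
	}

	return -1;
}
```

With a discovered list whose video engines are instances 0 and 2, logical
index 1 maps to physical instance 2, which matches the Icelake case the patch
describes. On a part with instances 0 and 1 the same lookup returns 1.
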

* Re: [igt-dev] [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command
  2019-05-17 19:36     ` Chris Wilson
@ 2019-05-20 10:27       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-20 10:27 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 17/05/2019 20:36, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-17 12:25:16)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A new workload command for enabling a load balanced context map (aka
>> Virtual Engine). Example usage:
>>
>>    B.1
>>
>> This turns on load balancing for context one, assuming it has already been
>> configured with an engine map. Only the DEFAULT engine specifier can be
>> used with load balanced engine maps.
> 
> Why?

Hm... don't think there is a real reason and I definitely don't remember 
what I was thinking back when I wrote this sentence. :) I'll try lifting 
the restriction and see what happens.

Regards,

Tvrtko



* Re: [PATCH i-g-t 02/25] trace.pl: Ignore signaling on non i915 fences
  2019-05-17 19:20     ` [igt-dev] [Intel-gfx] " Chris Wilson
@ 2019-05-20 10:30       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-20 10:30 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 17/05/2019 20:20, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-17 12:25:03)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> gem_wsim uses the sw_fence timeline and confuses the script.
> 
> sw_sync
> 
> How does this fare with clflush fences (which are .driver="i915") and
> all of the future .driver="i915" fences?
> 
> Looks like we are still prone to hitting that die. (Should die pretty
> quick on !llc)

Okay I have to use the timeline name as well, thanks!

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 12/25] gem_wsim: Engine map support
  2019-05-17 19:35     ` Chris Wilson
@ 2019-05-20 10:49       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-20 10:49 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 17/05/2019 20:35, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-17 12:25:13)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Support new i915 uAPI for configuring contexts with engine maps.
>>
>> Please refer to the README file for more detailed explanation.
>>
>> v2:
>>   * Allow defining engine maps by class.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   benchmarks/gem_wsim.c  | 211 +++++++++++++++++++++++++++++++++++------
>>   benchmarks/wsim/README |  25 ++++-
>>   2 files changed, 204 insertions(+), 32 deletions(-)
>>
>> diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
>> index 60b7d32e22d4..e5b12e37490e 100644
>> --- a/benchmarks/gem_wsim.c
>> +++ b/benchmarks/gem_wsim.c
>> @@ -57,6 +57,7 @@
>>   #include "ewma.h"
>>   
>>   enum intel_engine_id {
>> +       DEFAULT,
>>          RCS,
>>          BCS,
>>          VCS,
>> @@ -81,7 +82,8 @@ enum w_type
>>          SW_FENCE,
>>          SW_FENCE_SIGNAL,
>>          CTX_PRIORITY,
>> -       PREEMPTION
>> +       PREEMPTION,
>> +       ENGINE_MAP
>>   };
>>   
>>   struct deps
>> @@ -115,6 +117,10 @@ struct w_step
>>                  int throttle;
>>                  int fence_signal;
>>                  int priority;
>> +               struct {
>> +                       unsigned int engine_map_count;
>> +                       enum intel_engine_id *engine_map;
>> +               };
>>          };
>>   
>>          /* Implementation details */
>> @@ -142,6 +148,8 @@ DECLARE_EWMA(uint64_t, rt, 4, 2)
>>   struct ctx {
>>          uint32_t id;
>>          int priority;
>> +       unsigned int engine_map_count;
>> +       enum intel_engine_id *engine_map;
>>          bool targets_instance;
>>          bool wants_balance;
>>          unsigned int static_vcs;
>> @@ -200,10 +208,10 @@ struct workload
>>                  int fd;
>>                  bool first;
>>                  unsigned int num_engines;
>> -               unsigned int engine_map[5];
>> +               unsigned int engine_map[NUM_ENGINES];
>>                  uint64_t t_prev;
>> -               uint64_t prev[5];
>> -               double busy[5];
>> +               uint64_t prev[NUM_ENGINES];
>> +               double busy[NUM_ENGINES];
>>          } busy_balancer;
>>   };
>>   
>> @@ -234,6 +242,7 @@ static int fd;
>>   #define REG(x) (volatile uint32_t *)((volatile char *)igt_global_mmio + x)
>>   
>>   static const char *ring_str_map[NUM_ENGINES] = {
>> +       [DEFAULT] = "DEFAULT",
>>          [RCS] = "RCS",
>>          [BCS] = "BCS",
>>          [VCS] = "VCS",
>> @@ -330,6 +339,43 @@ static int str_to_engine(const char *str)
>>          return -1;
>>   }
>>   
>> +static int parse_engine_map(struct w_step *step, const char *_str)
>> +{
>> +       char *token, *tctx = NULL, *tstart = (char *)_str;
>> +
>> +       while ((token = strtok_r(tstart, "|", &tctx))) {
>> +               enum intel_engine_id engine;
>> +               unsigned int add;
>> +
>> +               tstart = NULL;
>> +
>> +               if (!strcmp(token, "DEFAULT"))
>> +                       return -1;
>> +
>> +               engine = str_to_engine(token);
>> +               if ((int)engine < 0)
>> +                       return -1;
>> +
>> +               if (engine != VCS && engine != VCS1 && engine != VCS2)
>> +                       return -1; /* TODO */
> 
> Still a little concerned that the map is VCS only. It just doesn't fit
> my expectations of what the map will be.

I think I could update this now that load_balance takes a list.

>> +
>> +               add = engine == VCS ? 2 : 1;
> 
> Will we not ever ask what happens if we had millions of engines at our
> disposal? But that's a tomorrow problem, ok.

This is improved in a later patch. It felt easier to generalize at the 
end in this instance instead of trying to rebase the whole series.

> 
>> +               step->engine_map_count += add;
>> +               step->engine_map = realloc(step->engine_map,
>> +                                          step->engine_map_count *
>> +                                          sizeof(step->engine_map[0]));
>> +
>> +               if (engine != VCS) {
>> +                       step->engine_map[step->engine_map_count - 1] = engine;
>> +               } else {
>> +                       step->engine_map[step->engine_map_count - 2] = VCS1;
>> +                       step->engine_map[step->engine_map_count - 1] = VCS2;
>> +               }
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>>   static struct workload *
>>   parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>   {
>> @@ -448,6 +494,33 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
>>                          } else if (!strcmp(field, "f")) {
>>                                  step.type = SW_FENCE;
>>                                  goto add_step;
>> +                       } else if (!strcmp(field, "M")) {
>> +                               unsigned int nr = 0;
>> +                               while ((field = strtok_r(fstart, ".", &fctx)) !=
>> +                                   NULL) {
>> +                                       tmp = atoi(field);
>> +                                       check_arg(nr == 0 && tmp <= 0,
>> +                                                 "Invalid context at step %u!\n",
>> +                                                 nr_steps);
>> +                                       check_arg(nr > 1,
>> +                                                 "Invalid engine map format at step %u!\n",
>> +                                                 nr_steps);
>> +
>> +                                       if (nr == 0) {
>> +                                               step.context = tmp;
>> +                                       } else {
>> +                                               tmp = parse_engine_map(&step,
>> +                                                                      field);
>> +                                               check_arg(tmp < 0,
>> +                                                         "Invalid engine map list at step %u!\n",
>> +                                                         nr_steps);
>> +                                       }
>> +
>> +                                       nr++;
>> +                               }
>> +
>> +                               step.type = ENGINE_MAP;
>> +                               goto add_step;
>>                          } else if (!strcmp(field, "X")) {
>>                                  unsigned int nr = 0;
>>                                  while ((field = strtok_r(fstart, ".", &fctx)) !=
>> @@ -774,6 +847,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>   }
>>   
>>   static const unsigned int eb_engine_map[NUM_ENGINES] = {
>> +       [DEFAULT] = I915_EXEC_DEFAULT,
>>          [RCS] = I915_EXEC_RENDER,
>>          [BCS] = I915_EXEC_BLT,
>>          [VCS] = I915_EXEC_BSD,
>> @@ -796,11 +870,36 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
>>                  eb->flags = eb_engine_map[engine];
>>   }
>>   
>> +static unsigned int
>> +find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
>> +{
>> +       unsigned int i;
>> +
>> +       for (i = 0; i < ctx->engine_map_count; i++) {
>> +               if (ctx->engine_map[i] == engine)
>> +                       return i + 1;
>> +       }
>> +
>> +       igt_assert(0);
>> +       return 0;
> 
> No balancer in the map at this point?

Correct, only in one of the later patches.

> 
>> +}
>> +
>> +static struct ctx *
>> +__get_ctx(struct workload *wrk, struct w_step *w)
>> +{
>> +       return &wrk->ctx_list[w->context * 2];
>> +}
>> +
>>   static void
>> -eb_update_flags(struct w_step *w, enum intel_engine_id engine,
>> -               unsigned int flags)
>> +eb_update_flags(struct workload *wrk, struct w_step *w,
>> +               enum intel_engine_id engine, unsigned int flags)
>>   {
>> -       eb_set_engine(&w->eb, engine, flags);
>> +       struct ctx *ctx = __get_ctx(wrk, w);
>> +
>> +       if (ctx->engine_map)
>> +               w->eb.flags = find_engine_in_map(ctx, engine);
>> +       else
>> +               eb_set_engine(&w->eb, engine, flags);
>>   
>>          w->eb.flags |= I915_EXEC_HANDLE_LUT;
>>          w->eb.flags |= I915_EXEC_NO_RELOC;
>> @@ -819,12 +918,6 @@ get_status_objects(struct workload *wrk)
>>                  return wrk->status_object;
>>   }
>>   
>> -static struct ctx *
>> -__get_ctx(struct workload *wrk, struct w_step *w)
>> -{
>> -       return &wrk->ctx_list[w->context * 2];
>> -}
>> -
>>   static uint32_t
>>   get_ctxid(struct workload *wrk, struct w_step *w)
>>   {
>> @@ -894,7 +987,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
>>                  engine = VCS2;
>>          else if (flags & SWAPVCS && engine == VCS2)
>>                  engine = VCS1;
>> -       eb_update_flags(w, engine, flags);
>> +       eb_update_flags(wrk, w, engine, flags);
>>   #ifdef DEBUG
>>          printf("%u: %u:|", w->idx, w->eb.buffer_count);
>>          for (i = 0; i <= j; i++)
>> @@ -936,7 +1029,7 @@ static void vm_destroy(int i915, uint32_t vm_id)
>>          igt_assert_eq(__vm_destroy(i915, vm_id), 0);
>>   }
>>   
>> -static void
>> +static int
>>   prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>   {
>>          unsigned int ctx_vcs;
>> @@ -999,30 +1092,53 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
>>          /*
>>           * Identify if contexts target specific engine instances and if they
>>           * want to be balanced.
>> +        *
>> +        * Transfer over engine map configuration from the workload step.
>>           */
>>          for (j = 0; j < wrk->nr_ctxs; j += 2) {
>>                  bool targets = false;
>>                  bool balance = false;
>>   
>>                  for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
>> -                       if (w->type != BATCH)
>> -                               continue;
>> -
>>                          if (w->context != (j / 2))
>>                                  continue;
>>   
>> -                       if (w->engine == VCS)
>> -                               balance = true;
>> -                       else
>> -                               targets = true;
>> +                       if (w->type == BATCH) {
>> +                               if (w->engine == VCS)
>> +                                       balance = true;
>> +                               else
>> +                                       targets = true;
>> +                       } else if (w->type == ENGINE_MAP) {
>> +                               wrk->ctx_list[j].engine_map = w->engine_map;
>> +                               wrk->ctx_list[j].engine_map_count =
>> +                                       w->engine_map_count;
>> +                       }
>>                  }
>>   
>> -               if (flags & I915) {
>> -                       wrk->ctx_list[j].targets_instance = targets;
>> +               wrk->ctx_list[j].targets_instance = targets;
>> +               if (flags & I915)
>>                          wrk->ctx_list[j].wants_balance = balance;
>> +       }
>> +
>> +       /*
>> +        * Ensure VCS is not allowed with engine map contexts.
>> +        */
>> +       for (j = 0; j < wrk->nr_ctxs; j += 2) {
>> +               for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
>> +                       if (w->context != (j / 2))
>> +                               continue;
>> +
>> +                       if (w->type != BATCH)
>> +                               continue;
>> +
>> +                       if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
> 
> But wouldn't VCS still be meaning use the balancer and not a specific
> engine???
> 
> I'm not understanding how you are using maps in the .wsim :(

A batch sent to VCS means any VCS if the context has no map, but VCS
mentioned in a map now auto-expands to all present VCS instances.
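
For illustration, the map handling described above can be sketched as a
standalone C fragment. This is a minimal sketch modelled on
parse_engine_map() and find_engine_in_map() from the quoted patch, not the
gem_wsim code itself. struct engine_map and map_append() are hypothetical
names, the enum order past VCS is assumed, the VCS-only restriction is
dropped, and a missing engine yields 0 instead of tripping an assert.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() and strtok_r() */

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Engine ids; the order after VCS is an assumption, not taken from the patch. */
enum engine_id { DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS, NUM_ENGINES };

static const char *engine_names[NUM_ENGINES] = {
	[DEFAULT] = "DEFAULT", [RCS] = "RCS", [BCS] = "BCS",
	[VCS] = "VCS", [VCS1] = "VCS1", [VCS2] = "VCS2", [VECS] = "VECS",
};

/* Hypothetical container for one context's engine map. */
struct engine_map {
	unsigned int count;
	enum engine_id *engines;
};

static int str_to_engine(const char *str)
{
	for (int i = 0; i < NUM_ENGINES; i++)
		if (!strcmp(engine_names[i], str))
			return i;
	return -1;
}

/* Append one engine; returns 0 on success, -1 if allocation fails. */
static int map_append(struct engine_map *map, enum engine_id engine)
{
	enum engine_id *tmp;

	tmp = realloc(map->engines, (map->count + 1) * sizeof(*tmp));
	if (!tmp)
		return -1;
	map->engines = tmp;
	map->engines[map->count++] = engine;
	return 0;
}

/*
 * Parse a "VCS1|VCS2" style list. The class name VCS expands to both
 * instances, DEFAULT and unknown names are rejected.
 */
static int parse_engine_map(struct engine_map *map, const char *str)
{
	char *copy, *token, *save = NULL;
	int ret = 0;

	assert(map && str);

	copy = strdup(str); /* strtok_r() modifies its input */
	if (!copy)
		return -1;

	for (token = strtok_r(copy, "|", &save); token;
	     token = strtok_r(NULL, "|", &save)) {
		int engine = str_to_engine(token);

		if (engine < 0 || engine == DEFAULT) {
			ret = -1;
			break;
		}

		if (engine == VCS)
			ret = map_append(map, VCS1) || map_append(map, VCS2);
		else
			ret = map_append(map, (enum engine_id)engine);
		if (ret) {
			ret = -1;
			break;
		}
	}

	free(copy);
	return ret;
}

/* Returns the 1-based slot of the engine in the map, or 0 if absent. */
static unsigned int
find_engine_in_map(const struct engine_map *map, enum engine_id engine)
{
	for (unsigned int i = 0; i < map->count; i++)
		if (map->engines[i] == engine)
			return i + 1;
	return 0;
}
```

With this sketch, parsing "VCS" yields a two entry map (VCS1, VCS2), and a
batch aimed at VCS2 resolves to slot 2, which mirrors the i + 1 return value
in the patch.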

VCS as an engine specifier at execbuf time could be allowed if the code
then checked whether there is a load balancer built of VCS engines in this
context.

But what use case do you think is not covered?

We have legacy wsim files which implicitly create a map by doing:

1.VCS.1000.0.0 -> submit a batch to any vcs

And then after this series you can also do:

M.1.VCS
B.1
1.DEFAULT.1000.0.0

Which would have the same effect.

You would seem to want:

M.1.VCS
B.1
1.VCS.1000.0.0

?

But I don't see what it gains?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 12/25] gem_wsim: Engine map support
  2019-05-20 10:49       ` Tvrtko Ursulin
@ 2019-05-20 10:59         ` Chris Wilson
  -1 siblings, 0 replies; 109+ messages in thread
From: Chris Wilson @ 2019-05-20 10:59 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-20 11:49:13)
> 
> On 17/05/2019 20:35, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-05-17 12:25:13)
> >> +       /*
> >> +        * Ensure VCS is not allowed with engine map contexts.
> >> +        */
> >> +       for (j = 0; j < wrk->nr_ctxs; j += 2) {
> >> +               for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
> >> +                       if (w->context != (j / 2))
> >> +                               continue;
> >> +
> >> +                       if (w->type != BATCH)
> >> +                               continue;
> >> +
> >> +                       if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
> > 
> > But wouldn't VCS still be meaning use the balancer and not a specific
> > engine???
> > 
> > I'm not understanding how you are using maps in the .wsim :(
> 
> A batch sent to VCS means any VCS if the context has no map, but VCS
> mentioned in a map now auto-expands to all present VCS instances.
> 
> VCS as an engine specifier at execbuf time could be allowed if the code
> then checked whether there is a load balancer built of VCS engines in this
> context.
> 
> But what use case do you think is not covered?
> 
> We have legacy wsim files which implicitly create a map by doing:
> 
> 1.VCS.1000.0.0 -> submit a batch to any vcs
> 
> And then after this series you can also do:
> 
> M.1.VCS
> B.1
> 1.DEFAULT.1000.0.0
> 
> Which would have the same effect.
> 
> You would seem to want:
> 
> M.1.VCS
> B.1
> 1.VCS.1000.0.0
> 
> ?
> 
> But I don't see what it gains?

I just have a picture of a map consisting of

	[RCS] = rcs0,
	[BCS] = 0,
	[VCS] = (vcs0, vcs2),

Then the workload would be a single context, feeding batches to RCS and
VCS, which are then mapped to hardware and balanced as suitable. One
could go even further with RCS0, RCS1 for different logical state within
the same client context (different pipelines, same vm). That is how I
think I would decompose the media workloads given a fresh start on top
of the new api -- and then probably cursing the limits of that api.
-Chris
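
The class-keyed map pictured above could look roughly like the following C
fragment. It is only a sketch under stated assumptions: every name in it
(engine_class, phys_engine, class_slot, resolve_engine, the queued[] depth
array) is hypothetical, and the least-queued pick is a stand-in for
whatever balancing the kernel actually does, not a description of the i915
uAPI.

```c
#include <assert.h>
#include <stddef.h>

/* Logical classes a workload submits to (hypothetical names). */
enum engine_class { CLASS_RCS, CLASS_BCS, CLASS_VCS, NUM_CLASSES };

/* Physical engines present on the imagined part (hypothetical names). */
enum phys_engine { PHYS_RCS0, PHYS_BCS0, PHYS_VCS0, PHYS_VCS1, PHYS_VCS2, NUM_PHYS };

#define MAX_SIBLINGS 4

/* One map slot: the physical engines backing a logical class. */
struct class_slot {
	unsigned int count;
	enum phys_engine siblings[MAX_SIBLINGS];
};

/* [RCS] = rcs0, [BCS] = nothing, [VCS] = (vcs0, vcs2) */
static const struct class_slot ctx_map[NUM_CLASSES] = {
	[CLASS_RCS] = { .count = 1, .siblings = { PHYS_RCS0 } },
	[CLASS_BCS] = { .count = 0 },
	[CLASS_VCS] = { .count = 2, .siblings = { PHYS_VCS0, PHYS_VCS2 } },
};

/*
 * Resolve a logical class to one physical engine by picking the sibling
 * with the fewest queued batches; ties go to the earlier sibling.
 * Returns -1 when the class has no engines in the map.
 */
static int resolve_engine(const struct class_slot *map,
			  enum engine_class cls,
			  const unsigned int *queued)
{
	const struct class_slot *slot;
	int best = -1;

	assert(map && queued && cls < NUM_CLASSES);

	slot = &map[cls];
	for (unsigned int i = 0; i < slot->count; i++) {
		int engine = slot->siblings[i];

		if (best < 0 || queued[engine] < queued[best])
			best = engine;
	}

	return best;
}
```

A single context would then submit to CLASS_RCS and CLASS_VCS only, and the
map decides which hardware instance each batch lands on.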

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 12/25] gem_wsim: Engine map support
  2019-05-20 10:59         ` Chris Wilson
@ 2019-05-20 11:10           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-20 11:10 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 20/05/2019 11:59, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-20 11:49:13)
>>
>> On 17/05/2019 20:35, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-05-17 12:25:13)
>>>> +       /*
>>>> +        * Ensure VCS is not allowed with engine map contexts.
>>>> +        */
>>>> +       for (j = 0; j < wrk->nr_ctxs; j += 2) {
>>>> +               for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
>>>> +                       if (w->context != (j / 2))
>>>> +                               continue;
>>>> +
>>>> +                       if (w->type != BATCH)
>>>> +                               continue;
>>>> +
>>>> +                       if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
>>>
>>> But wouldn't VCS still be meaning use the balancer and not a specific
>>> engine???
>>>
>>> I'm not understanding how you are using maps in the .wsim :(
>>
>> Batch sent to VCS means any VCS if not a context with a map, but VCS
>> mentioned in the map now auto-expands to all present VCS instances.
>>
>> VCS as engine specifier at execbuf time could be allowed if code would
>> then check if there is a load balancer built of vcs engines in this context.
>>
>> But what use case you think is not covered?
>>
>> We got legacy wsim files which implicitly create a map by doing:
>>
>> 1.VCS.1000.0.0 -> submit a batch to any vcs
>>
>> And then after this series you can also do:
>>
>> M.1.VCS
>> B.1
>> 1.DEFAULT.1000.0.0
>>
>> Which would have the same effect.
>>
>> You would seem want:
>>
>> M.1.VCS
>> B.1
>> 1.VCS.1000.0.0
>>
>> ?
>>
>> But I don't see what it gains?
> 
> I just have a picture of a map consisting of
> 
> 	[RCS] = rcs0,
> 	[BCS] = 0,
> 	[VCS] = (vcs0, vcs2),
> 
> Then the workload would be a single context, feeding batches to RCS and
> VCS, which are then mapped to hardware and balanced as suitable. One
> could go even further with RCS0, RCS1 for different logical state within
> the same client context (different pipelines, same vm). That is how I
> think I would decompose the media workloads given a fresh start on top
> of the new api -- and then probably cursing the limits of that api.

Hm, this is quite an appealing idea. I'll give it some thought to see
how difficult it would be to implement. I would, however, ask for
dispensation to treat this as follow-up work, since turning some
implementation details on their head could be a bit time-consuming.

Regards,

Tvrtko
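
For readers following the quoted check, the rule under discussion (a batch may not name plain VCS on a context that has an engine map) can be restated as a standalone sketch. This is a simplified illustration under assumptions: the types and the helper `find_vcs_on_map_ctx()` are hypothetical, contexts are indexed from zero, and the real gem_wsim structures (`wrk->ctx_list`, `wrk->steps`) are laid out differently.

```c
#include <stdbool.h>
#include <stddef.h>

enum step_type { BATCH, OTHER_STEP };
enum engine_id { DEFAULT, RCS, BCS, VCS, VCS1, VCS2 };

/* Simplified stand-in for one workload step. */
struct step {
	enum step_type type;
	unsigned int context;	/* zero-based index into ctx_has_map */
	enum engine_id engine;
};

/*
 * Return the index of the first BATCH step that names VCS on a context
 * with an engine map, or -1 if the workload passes the check.
 */
static int find_vcs_on_map_ctx(const struct step *steps, size_t nr_steps,
			       const bool *ctx_has_map, size_t nr_ctxs)
{
	for (size_t i = 0; i < nr_steps; i++) {
		const struct step *w = &steps[i];

		if (w->type != BATCH)
			continue;

		/* Contexts without a map keep the legacy "any VCS" meaning. */
		if (w->context >= nr_ctxs || !ctx_has_map[w->context])
			continue;

		if (w->engine == VCS)
			return (int)i;
	}

	return -1;
}
```

In these terms, the `M.1.VCS` / `B.1` / `1.DEFAULT.1000.0.0` workload passes, because the batch names DEFAULT, while the `1.VCS.1000.0.0` variant on the same mapped context is the case the check rejects.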

* [PATCH v2 02/25] trace.pl: Ignore signaling on non i915 fences
  2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-20 12:04     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 109+ messages in thread
From: Tvrtko Ursulin @ 2019-05-20 12:04 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

gem_wsim uses the sw_fence timeline and confuses the script.

v2:
 * Check the correct timeline as well. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 scripts/trace.pl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/trace.pl b/scripts/trace.pl
index b7bbabc79f68..5f70cd23979b 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -439,6 +439,8 @@ while (<>) {
 	} elsif ($tp_name eq 'dma_fence:dma_fence_signaled:') {
 		my $gkey;
 
+		next unless $tp{'driver'} eq 'i915' and
+			    $tp{'timeline'} eq 'signaled';
 		die unless exists $ctxengines{$tp{'context'}};
 
 		$gkey = db_key($ctxengines{$tp{'context'}}, $tp{'context'}, $tp{'seqno'});
-- 
2.20.1


* [igt-dev] ✓ Fi.CI.BAT: success for Media scalability tooling (rev4)
  2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
                   ` (27 preceding siblings ...)
  (?)
@ 2019-05-20 13:30 ` Patchwork
  -1 siblings, 0 replies; 109+ messages in thread
From: Patchwork @ 2019-05-20 13:30 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Media scalability tooling (rev4)
URL   : https://patchwork.freedesktop.org/series/51193/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6100 -> IGTPW_3008
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/51193/revisions/4/mbox/

Known issues
------------

  Here are the changes found in IGTPW_3008 that come from known issues:

### IGT changes ###

#### Warnings ####

  * igt@i915_selftest@live_hangcheck:
    - fi-apl-guc:         [FAIL][1] ([fdo#110623]) -> [DMESG-FAIL][2] ([fdo#110620])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6100/fi-apl-guc/igt@i915_selftest@live_hangcheck.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3008/fi-apl-guc/igt@i915_selftest@live_hangcheck.html

  
  [fdo#110620]: https://bugs.freedesktop.org/show_bug.cgi?id=110620
  [fdo#110623]: https://bugs.freedesktop.org/show_bug.cgi?id=110623


Participating hosts (50 -> 45)
------------------------------

  Additional (1): fi-icl-y 
  Missing    (6): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper 


Build changes
-------------

  * IGT: IGT_4997 -> IGTPW_3008

  CI_DRM_6100: 6cd6c39683ceb3e48b98871fd25aea47f291f2b0 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_3008: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3008/
  IGT_4997: eff5d0db3248734845b78fcc2e2772dd4012e5af @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+igt@i915_query@engine-info
+igt@i915_query@engine-info-invalid

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3008/

Thread overview: 109+ messages
2019-05-17 11:25 [PATCH i-g-t 00/25] Media scalability tooling Tvrtko Ursulin
2019-05-17 11:25 ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 01/25] scripts/trace.pl: Fix after intel_engine_notify removal Tvrtko Ursulin
2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 02/25] trace.pl: Ignore signaling on non i915 fences Tvrtko Ursulin
2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-17 19:20   ` Chris Wilson
2019-05-17 19:20     ` [igt-dev] [Intel-gfx] " Chris Wilson
2019-05-20 10:30     ` Tvrtko Ursulin
2019-05-20 10:30       ` [igt-dev] [Intel-gfx] " Tvrtko Ursulin
2019-05-20 12:04   ` [PATCH v2 " Tvrtko Ursulin
2019-05-20 12:04     ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 03/25] headers: bump Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 04/25] trace.pl: Virtual engine support Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 19:23   ` Chris Wilson
2019-05-17 19:23     ` [igt-dev] " Chris Wilson
2019-05-17 11:25 ` [PATCH i-g-t 05/25] trace.pl: Virtual engine preemption support Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 19:24   ` Chris Wilson
2019-05-17 19:24     ` Chris Wilson
2019-05-17 11:25 ` [PATCH i-g-t 06/25] wsim/media-bench: i915 balancing Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 07/25] gem_wsim: Use IGT uapi headers Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 08/25] gem_wsim: Factor out common error handling Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 09/25] gem_wsim: More wsim_err Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 10/25] gem_wsim: Submit fence support Tvrtko Ursulin
2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 11/25] gem_wsim: Extract str to engine lookup Tvrtko Ursulin
2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 12/25] gem_wsim: Engine map support Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 19:35   ` Chris Wilson
2019-05-17 19:35     ` Chris Wilson
2019-05-20 10:49     ` Tvrtko Ursulin
2019-05-20 10:49       ` Tvrtko Ursulin
2019-05-20 10:59       ` Chris Wilson
2019-05-20 10:59         ` Chris Wilson
2019-05-20 11:10         ` Tvrtko Ursulin
2019-05-20 11:10           ` Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 13/25] gem_wsim: Save some lines by changing to implicit NULL checking Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 14/25] gem_wsim: Compact int command parsing with a macro Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 15/25] gem_wsim: Engine map load balance command Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:38   ` Chris Wilson
2019-05-17 11:38     ` Chris Wilson
2019-05-17 11:52     ` Tvrtko Ursulin
2019-05-17 11:52       ` Tvrtko Ursulin
2019-05-17 13:19       ` Chris Wilson
2019-05-17 13:19         ` Chris Wilson
2019-05-17 19:36   ` Chris Wilson
2019-05-17 19:36     ` Chris Wilson
2019-05-20 10:27     ` Tvrtko Ursulin
2019-05-20 10:27       ` Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 16/25] gem_wsim: Engine bond command Tvrtko Ursulin
2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-17 19:41   ` [igt-dev] " Chris Wilson
2019-05-17 19:41     ` Chris Wilson
2019-05-17 11:25 ` [PATCH i-g-t 17/25] gem_wsim: Some more example workloads Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 18/25] gem_wsim: Infinite batch support Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 19/25] gem_wsim: Command line switch for specifying low slice count workloads Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 19:43   ` Chris Wilson
2019-05-17 19:43     ` Chris Wilson
2019-05-17 11:25 ` [PATCH i-g-t 20/25] gem_wsim: Per context SSEU control Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 19:44   ` Chris Wilson
2019-05-17 19:44     ` Chris Wilson
2019-05-17 11:25 ` [PATCH i-g-t 21/25] gem_wsim: Allow RCS virtual engine with " Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 19:45   ` Chris Wilson
2019-05-17 19:45     ` Chris Wilson
2019-05-17 11:25 ` [PATCH i-g-t 22/25] tests/i915_query: Engine discovery tests Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 23/25] gem_wsim: Consolidate engine assignments into helpers Tvrtko Ursulin
2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 24/25] gem_wsim: Discover engines Tvrtko Ursulin
2019-05-17 11:25   ` [igt-dev] " Tvrtko Ursulin
2019-05-17 11:39   ` Andi Shyti
2019-05-17 11:39     ` Andi Shyti
2019-05-17 11:51     ` Tvrtko Ursulin
2019-05-17 11:51       ` Tvrtko Ursulin
2019-05-17 11:55       ` Andi Shyti
2019-05-17 11:55         ` Andi Shyti
2019-05-17 19:50       ` Chris Wilson
2019-05-17 19:50         ` Chris Wilson
2019-05-17 12:10   ` Andi Shyti
2019-05-17 12:10     ` Andi Shyti
2019-05-17 12:19     ` Tvrtko Ursulin
2019-05-17 12:19       ` Tvrtko Ursulin
2019-05-17 13:02       ` Andi Shyti
2019-05-17 13:02         ` Andi Shyti
2019-05-17 13:05         ` Tvrtko Ursulin
2019-05-17 13:05           ` Tvrtko Ursulin
2019-05-17 11:25 ` [PATCH i-g-t 25/25] gem_wsim: Support Icelake parts Tvrtko Ursulin
2019-05-17 11:25   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-17 19:51   ` Chris Wilson
2019-05-17 19:51     ` [igt-dev] [Intel-gfx] " Chris Wilson
2019-05-17 12:18 ` [igt-dev] ✓ Fi.CI.BAT: success for Media scalability tooling (rev3) Patchwork
2019-05-17 17:33 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
2019-05-20 13:30 ` [igt-dev] ✓ Fi.CI.BAT: success for Media scalability tooling (rev4) Patchwork
