* [PATCH i-g-t 00/15] Remaining bits of Virtual Engine tooling
@ 2019-05-22 15:57 ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Not yet merged work; five of the patches still need review.

Tvrtko Ursulin (15):
  gem_wsim: Engine map support
  gem_wsim: Save some lines by changing to implicit NULL checking
  gem_wsim: Compact int command parsing with a macro
  gem_wsim: Engine map load balance command
  gem_wsim: Engine bond command
  gem_wsim: Some more example workloads
  gem_wsim: Infinite batch support
  gem_wsim: Command line switch for specifying low slice count workloads
  gem_wsim: Per context SSEU control
  gem_wsim: Allow RCS virtual engine with SSEU control
  gem_wsim: Consolidate engine assignments into helpers
  gem_wsim: Discover engines
  gem_wsim: Support Icelake parts
  gem_wsim: Fix prng usage
  gem_wsim: Allow random seed control

 benchmarks/gem_wsim.c                       | 1020 ++++++++++++++++---
 benchmarks/wsim/README                      |  122 ++-
 benchmarks/wsim/frame-split-60fps.wsim      |   18 +
 benchmarks/wsim/high-composited-game.wsim   |   11 +
 benchmarks/wsim/media-1080p-player.wsim     |    5 +
 benchmarks/wsim/medium-composited-game.wsim |    9 +
 6 files changed, 1022 insertions(+), 163 deletions(-)
 create mode 100644 benchmarks/wsim/frame-split-60fps.wsim
 create mode 100644 benchmarks/wsim/high-composited-game.wsim
 create mode 100644 benchmarks/wsim/media-1080p-player.wsim
 create mode 100644 benchmarks/wsim/medium-composited-game.wsim

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH i-g-t 01/15] gem_wsim: Engine map support
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Support new i915 uAPI for configuring contexts with engine maps.

Please refer to the README file for a more detailed explanation.

v2:
 * Allow defining engine maps by class.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
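
The engine list handling described above can be sketched outside gem_wsim
roughly as follows. This is only an illustration under stated assumptions:
the sk_ prefixed names, the fixed capacity and the cut-down enum are made up
for the sketch, and it walks the string with strcspn() rather than the
strtok_r() used in the patch, so that the input can stay const. As in the
patch, the class name VCS expands to VCS1 and VCS2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down engine ids for this sketch; gem_wsim's own enum has more entries. */
enum sk_engine { SK_INVALID = -1, SK_VCS, SK_VCS1, SK_VCS2 };

#define SK_MAX_MAP 8 /* arbitrary capacity chosen for the sketch */

struct sk_engine_map {
	unsigned int count;
	enum sk_engine engines[SK_MAX_MAP];
};

/* Match a token of length len against the known engine names. */
static enum sk_engine sk_lookup(const char *tok, size_t len)
{
	static const struct { const char *name; enum sk_engine id; } names[] = {
		{ "VCS", SK_VCS }, { "VCS1", SK_VCS1 }, { "VCS2", SK_VCS2 },
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strlen(names[i].name) == len &&
		    !strncmp(names[i].name, tok, len))
			return names[i].id;
	}

	return SK_INVALID;
}

/* Parse a '|' separated engine list; returns 0 on success, -1 on error. */
static int sk_parse_map(struct sk_engine_map *map, const char *str)
{
	assert(map && str);
	map->count = 0;

	while (*str) {
		size_t len = strcspn(str, "|");
		enum sk_engine id;

		if (!len) { /* skip separators and empty tokens */
			str++;
			continue;
		}

		id = sk_lookup(str, len);
		if (id == SK_INVALID)
			return -1;
		if (map->count + (id == SK_VCS ? 2 : 1) > SK_MAX_MAP)
			return -1;

		if (id == SK_VCS) {
			/* A class name expands to both instances. */
			map->engines[map->count++] = SK_VCS1;
			map->engines[map->count++] = SK_VCS2;
		} else {
			map->engines[map->count++] = id;
		}

		str += len;
	}

	return 0;
}
```

With this sketch, "VCS1|VCS2" and "VCS" both yield a two entry map, while any
name outside the VCS class is rejected, which mirrors the TODO restriction in
parse_engine_map().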
 benchmarks/gem_wsim.c  | 211 +++++++++++++++++++++++++++++++++++------
 benchmarks/wsim/README |  25 ++++-
 2 files changed, 204 insertions(+), 32 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 60b7d32e22d4..e5b12e37490e 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -57,6 +57,7 @@
 #include "ewma.h"
 
 enum intel_engine_id {
+	DEFAULT,
 	RCS,
 	BCS,
 	VCS,
@@ -81,7 +82,8 @@ enum w_type
 	SW_FENCE,
 	SW_FENCE_SIGNAL,
 	CTX_PRIORITY,
-	PREEMPTION
+	PREEMPTION,
+	ENGINE_MAP
 };
 
 struct deps
@@ -115,6 +117,10 @@ struct w_step
 		int throttle;
 		int fence_signal;
 		int priority;
+		struct {
+			unsigned int engine_map_count;
+			enum intel_engine_id *engine_map;
+		};
 	};
 
 	/* Implementation details */
@@ -142,6 +148,8 @@ DECLARE_EWMA(uint64_t, rt, 4, 2)
 struct ctx {
 	uint32_t id;
 	int priority;
+	unsigned int engine_map_count;
+	enum intel_engine_id *engine_map;
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
@@ -200,10 +208,10 @@ struct workload
 		int fd;
 		bool first;
 		unsigned int num_engines;
-		unsigned int engine_map[5];
+		unsigned int engine_map[NUM_ENGINES];
 		uint64_t t_prev;
-		uint64_t prev[5];
-		double busy[5];
+		uint64_t prev[NUM_ENGINES];
+		double busy[NUM_ENGINES];
 	} busy_balancer;
 };
 
@@ -234,6 +242,7 @@ static int fd;
 #define REG(x) (volatile uint32_t *)((volatile char *)igt_global_mmio + x)
 
 static const char *ring_str_map[NUM_ENGINES] = {
+	[DEFAULT] = "DEFAULT",
 	[RCS] = "RCS",
 	[BCS] = "BCS",
 	[VCS] = "VCS",
@@ -330,6 +339,43 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static int parse_engine_map(struct w_step *step, const char *_str)
+{
+	char *token, *tctx = NULL, *tstart = (char *)_str;
+
+	while ((token = strtok_r(tstart, "|", &tctx))) {
+		enum intel_engine_id engine;
+		unsigned int add;
+
+		tstart = NULL;
+
+		if (!strcmp(token, "DEFAULT"))
+			return -1;
+
+		engine = str_to_engine(token);
+		if ((int)engine < 0)
+			return -1;
+
+		if (engine != VCS && engine != VCS1 && engine != VCS2)
+			return -1; /* TODO */
+
+		add = engine == VCS ? 2 : 1;
+		step->engine_map_count += add;
+		step->engine_map = realloc(step->engine_map,
+					   step->engine_map_count *
+					   sizeof(step->engine_map[0]));
+
+		if (engine != VCS) {
+			step->engine_map[step->engine_map_count - 1] = engine;
+		} else {
+			step->engine_map[step->engine_map_count - 2] = VCS1;
+			step->engine_map[step->engine_map_count - 1] = VCS2;
+		}
+	}
+
+	return 0;
+}
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -448,6 +494,33 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			} else if (!strcmp(field, "f")) {
 				step.type = SW_FENCE;
 				goto add_step;
+			} else if (!strcmp(field, "M")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx)) !=
+				    NULL) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid engine map format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0) {
+						step.context = tmp;
+					} else {
+						tmp = parse_engine_map(&step,
+								       field);
+						check_arg(tmp < 0,
+							  "Invalid engine map list at step %u!\n",
+							  nr_steps);
+					}
+
+					nr++;
+				}
+
+				step.type = ENGINE_MAP;
+				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx)) !=
@@ -774,6 +847,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 }
 
 static const unsigned int eb_engine_map[NUM_ENGINES] = {
+	[DEFAULT] = I915_EXEC_DEFAULT,
 	[RCS] = I915_EXEC_RENDER,
 	[BCS] = I915_EXEC_BLT,
 	[VCS] = I915_EXEC_BSD,
@@ -796,11 +870,36 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb,
 		eb->flags = eb_engine_map[engine];
 }
 
+static unsigned int
+find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
+{
+	unsigned int i;
+
+	for (i = 0; i < ctx->engine_map_count; i++) {
+		if (ctx->engine_map[i] == engine)
+			return i + 1;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
+static struct ctx *
+__get_ctx(struct workload *wrk, struct w_step *w)
+{
+	return &wrk->ctx_list[w->context * 2];
+}
+
 static void
-eb_update_flags(struct w_step *w, enum intel_engine_id engine,
-		unsigned int flags)
+eb_update_flags(struct workload *wrk, struct w_step *w,
+		enum intel_engine_id engine, unsigned int flags)
 {
-	eb_set_engine(&w->eb, engine, flags);
+	struct ctx *ctx = __get_ctx(wrk, w);
+
+	if (ctx->engine_map)
+		w->eb.flags = find_engine_in_map(ctx, engine);
+	else
+		eb_set_engine(&w->eb, engine, flags);
 
 	w->eb.flags |= I915_EXEC_HANDLE_LUT;
 	w->eb.flags |= I915_EXEC_NO_RELOC;
@@ -819,12 +918,6 @@ get_status_objects(struct workload *wrk)
 		return wrk->status_object;
 }
 
-static struct ctx *
-__get_ctx(struct workload *wrk, struct w_step *w)
-{
-	return &wrk->ctx_list[w->context * 2];
-}
-
 static uint32_t
 get_ctxid(struct workload *wrk, struct w_step *w)
 {
@@ -894,7 +987,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		engine = VCS2;
 	else if (flags & SWAPVCS && engine == VCS2)
 		engine = VCS1;
-	eb_update_flags(w, engine, flags);
+	eb_update_flags(wrk, w, engine, flags);
 #ifdef DEBUG
 	printf("%u: %u:|", w->idx, w->eb.buffer_count);
 	for (i = 0; i <= j; i++)
@@ -936,7 +1029,7 @@ static void vm_destroy(int i915, uint32_t vm_id)
 	igt_assert_eq(__vm_destroy(i915, vm_id), 0);
 }
 
-static void
+static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
 	unsigned int ctx_vcs;
@@ -999,30 +1092,53 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	/*
 	 * Identify if contexts target specific engine instances and if they
 	 * want to be balanced.
+	 *
+	 * Transfer over engine map configuration from the workload step.
 	 */
 	for (j = 0; j < wrk->nr_ctxs; j += 2) {
 		bool targets = false;
 		bool balance = false;
 
 		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
-			if (w->type != BATCH)
-				continue;
-
 			if (w->context != (j / 2))
 				continue;
 
-			if (w->engine == VCS)
-				balance = true;
-			else
-				targets = true;
+			if (w->type == BATCH) {
+				if (w->engine == VCS)
+					balance = true;
+				else
+					targets = true;
+			} else if (w->type == ENGINE_MAP) {
+				wrk->ctx_list[j].engine_map = w->engine_map;
+				wrk->ctx_list[j].engine_map_count =
+					w->engine_map_count;
+			}
 		}
 
-		if (flags & I915) {
-			wrk->ctx_list[j].targets_instance = targets;
+		wrk->ctx_list[j].targets_instance = targets;
+		if (flags & I915)
 			wrk->ctx_list[j].wants_balance = balance;
+	}
+
+	/*
+	 * Ensure VCS is not allowed with engine map contexts.
+	 */
+	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+			if (w->context != (j / 2))
+				continue;
+
+			if (w->type != BATCH)
+				continue;
+
+			if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
+				wsim_err("Batches targeting engine maps must use explicit engines!\n");
+				return -1;
+			}
 		}
 	}
 
+
 	/*
 	 * Create and configure contexts.
 	 */
@@ -1033,7 +1149,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		if (ctx->id)
 			continue;
 
-		if (flags & I915) {
+		if ((flags & I915) || ctx->engine_map) {
 			struct drm_i915_gem_context_create_ext_setparam ext = {
 				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
 				.param.param = I915_CONTEXT_PARAM_VM,
@@ -1063,7 +1179,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				break;
 			}
 
-			if (!ctx->targets_instance)
+			if (!ctx->engine_map && !ctx->targets_instance)
 				args.flags |=
 				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
 
@@ -1096,7 +1212,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		 * both want to target specific engines and be balanced by i915?
 		 */
 		if ((flags & I915) && ctx->wants_balance &&
-		    ctx->targets_instance) {
+		    ctx->targets_instance && !ctx->engine_map) {
 			struct drm_i915_gem_context_create_ext_setparam ext = {
 				.base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
 				.param.param = I915_CONTEXT_PARAM_VM,
@@ -1121,7 +1237,33 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			__ctx_set_prio(ctx_id, wrk->prio);
 		}
 
-		if (ctx->wants_balance) {
+		if (ctx->engine_map) {
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
+							  ctx->engine_map_count + 1);
+			struct drm_i915_gem_context_param param = {
+				.ctx_id = ctx_id,
+				.param = I915_CONTEXT_PARAM_ENGINES,
+				.size = sizeof(set_engines),
+				.value = to_user_pointer(&set_engines),
+			};
+
+			set_engines.extensions = 0;
+
+			/* Reserve slot for virtual engine. */
+			set_engines.engines[0].engine_class =
+				I915_ENGINE_CLASS_INVALID;
+			set_engines.engines[0].engine_instance =
+				I915_ENGINE_CLASS_INVALID_NONE;
+
+			for (j = 1; j <= ctx->engine_map_count; j++) {
+				set_engines.engines[j].engine_class =
+					I915_ENGINE_CLASS_VIDEO; /* FIXME */
+				set_engines.engines[j].engine_instance =
+					ctx->engine_map[j - 1] - VCS1; /* FIXME */
+			}
+
+			gem_context_set_param(fd, &param);
+		} else if (ctx->wants_balance) {
 			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
 				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
 				.num_siblings = 2,
@@ -1204,6 +1346,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		alloc_step_batch(wrk, w, _flags);
 	}
+
+	return 0;
 }
 
 static double elapsed(const struct timespec *start, const struct timespec *end)
@@ -1941,7 +2085,7 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	uint32_t seqno = new_seqno(wrk, engine);
 	unsigned int i;
 
-	eb_update_flags(w, engine, flags);
+	eb_update_flags(wrk, w, engine, flags);
 
 	if (flags & SEQNO)
 		update_bb_seqno(w, engine, seqno);
@@ -2090,7 +2234,8 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
-			} else if (w->type == PREEMPTION) {
+			} else if (w->type == PREEMPTION ||
+				   w->type == ENGINE_MAP) {
 				continue;
 			}
 
@@ -2648,7 +2793,11 @@ int main(int argc, char **argv)
 		w[i]->print_stats = verbose > 1 ||
 				    (verbose > 0 && master_workload == i);
 
-		prepare_workload(i, w[i], flags_);
+		if (prepare_workload(i, w[i], flags_)) {
+			wsim_err("Failed to prepare workload %u!\n", i);
+			return 1;
+		}
+
 
 		if (balancer && balancer->init) {
 			int ret = balancer->init(balancer, w[i]);
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 4786f116b4ac..53f814a73c73 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -3,6 +3,7 @@ Workload descriptor format
 
 ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
 f
@@ -23,10 +24,11 @@ Additional workload steps are also supported:
  'q' - Throttle to n max queue depth.
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
+ 'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
 
-Engine ids: RCS, BCS, VCS, VCS1, VCS2, VECS
+Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
 
 Example (leading spaces must not be present in the actual file):
 ----------------------------------------------------------------
@@ -161,3 +163,24 @@ The same context is then marked to have batches which can be preempted every
 
 Same as with context priority, context preemption commands are valid until
 optionally overriden by another preemption control change on the same context.
+
+Engine maps
+-----------
+
+Engine maps are a per-context feature which changes the way engine selection is
+done in the driver.
+
+Example:
+
+  M.1.VCS1|VCS2
+
+This sets up context 1 with an engine map containing the VCS1 and VCS2 engines.
+Submission to this context can now only reference these two engines.
+
+Engine maps can also be defined by engine class, for example VCS.
+
+Example:
+
+  M.1.VCS
+
+This sets up the engine map to all available VCS class engines.
-- 
2.20.1
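
In the patch above, slot 0 of the engines array is filled with the invalid
class and instance as a placeholder for a virtual engine, the mapped engines
follow from slot 1, and find_engine_in_map() therefore returns i + 1. A
minimal standalone sketch of that index arithmetic is below. The ix_ prefixed
names are hypothetical, the enum order is assumed from the order of the engine
ids in the README, and the class and instance constants are placeholders
rather than values taken from the uAPI header. Unlike the patch, the lookup
returns -1 instead of asserting when the engine is not in the map.

```c
#include <assert.h>

/* Engine ids; the order is an assumption based on the README list. */
enum ix_engine { IX_DEFAULT, IX_RCS, IX_BCS, IX_VCS, IX_VCS1, IX_VCS2, IX_VECS };

/* Placeholder values for this sketch, not copied from the uAPI header. */
#define IX_CLASS_INVALID (-1)
#define IX_CLASS_VIDEO   2
#define IX_INSTANCE_NONE (-1)

struct ix_slot {
	int engine_class;
	int engine_instance;
};

/* Fill count + 1 slots: slot 0 is the reserved one, the map follows it. */
static void ix_fill_slots(struct ix_slot *slots, const enum ix_engine *map,
			  unsigned int count)
{
	unsigned int j;

	slots[0].engine_class = IX_CLASS_INVALID;
	slots[0].engine_instance = IX_INSTANCE_NONE;

	for (j = 1; j <= count; j++) {
		slots[j].engine_class = IX_CLASS_VIDEO;
		/* VCS1 becomes instance 0, VCS2 becomes instance 1. */
		slots[j].engine_instance = (int)map[j - 1] - (int)IX_VCS1;
	}
}

/* Return the submission index for engine, or -1 if it is not in the map. */
static int ix_find_slot(const enum ix_engine *map, unsigned int count,
			enum ix_engine engine)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		if (map[i] == engine)
			return (int)i + 1; /* skip the reserved slot 0 */
	}

	return -1;
}
```

The index returned by the lookup is what the patch stores in the execbuf flags
in place of the legacy engine selector, so a map of VCS2|VCS1 would submit
VCS2 batches to index 1 and VCS1 batches to index 2.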

* [PATCH i-g-t 02/15] gem_wsim: Save some lines by changing to implicit NULL checking
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can improve the parsing loop readability a bit more by avoiding some
line breaks caused by explicit NULL checks.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
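
For readers unfamiliar with the idiom, here is a minimal standalone sketch of
a tokenizing loop written with the implicit NULL check. count_fields() is a
hypothetical helper, not part of gem_wsim, and it uses ISO C strtok() instead
of strtok_r() to keep the example portable.

```c
#include <assert.h>
#include <string.h>

/* Count the '.' separated fields in buf; strtok() modifies buf. */
static unsigned int count_fields(char *buf)
{
	char *field, *start = buf;
	unsigned int nr = 0;

	/* The extra parentheses mark the assignment as intentional. */
	while ((field = strtok(start, "."))) {
		start = NULL; /* later calls continue in the same string */
		assert(*field); /* strtok() never returns empty tokens */
		nr++;
	}

	return nr;
}
```

The loop condition fits on one line, which is the saving the patch is after;
the behaviour is the same as with an explicit != NULL comparison.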
 benchmarks/gem_wsim.c | 39 +++++++++++++++------------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index e5b12e37490e..baa389c3f0e7 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -391,7 +391,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 	igt_assert(desc);
 
-	while ((_token = strtok_r(tstart, ",", &tctx)) != NULL) {
+	while ((_token = strtok_r(tstart, ",", &tctx))) {
 		tstart = NULL;
 		token = strdup(_token);
 		igt_assert(token);
@@ -399,12 +399,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 		valid = 0;
 		memset(&step, 0, sizeof(step));
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid delay at step %u!\n",
@@ -414,8 +413,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp <= 0,
 						  "Invalid period at step %u!\n",
@@ -426,8 +424,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				}
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -447,8 +444,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0 ||
 						  ((int)nr_steps + tmp) < 0,
@@ -459,8 +455,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid throttle at step %u!\n",
@@ -470,8 +465,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp < 0,
 						  "Invalid qd throttle at step %u!\n",
@@ -481,8 +475,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 					goto add_step;
 				}
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				if ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(tmp >= 0,
 						  "Invalid sw fence signal at step %u!\n",
@@ -496,8 +489,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "M")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -523,8 +515,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				goto add_step;
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
-				while ((field = strtok_r(fstart, ".", &fctx)) !=
-				    NULL) {
+				while ((field = strtok_r(fstart, ".", &fctx))) {
 					tmp = atoi(field);
 					check_arg(nr == 0 && tmp <= 0,
 						  "Invalid context at step %u!\n",
@@ -564,7 +555,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			i = str_to_engine(field);
@@ -579,7 +570,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				bcs_used = true;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			char *sep = NULL;
 			long int tmpl;
 
@@ -607,7 +598,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			tmp = parse_dependencies(nr_steps, &step, field);
@@ -617,7 +608,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			valid++;
 		}
 
-		if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
+		if ((field = strtok_r(fstart, ".", &fctx))) {
 			fstart = NULL;
 
 			check_arg(strlen(field) != 1 ||
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH i-g-t 03/15] gem_wsim: Compact int command parsing with a macro
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Parsing an integer workload descriptor field is a common pattern, which we
can extract into a helper macro to further improve the readability of the
main parsing loop.
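
A standalone sketch of the same idea follows. It is an illustration under
assumptions, not the macro from this patch: INT_FIELD(), struct step and
parse_int_step() are hypothetical names, and the sketch returns an error
code where gem_wsim uses check_arg() and a goto.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum step_type { STEP_NONE, STEP_DELAY, STEP_PERIOD };

struct step {
	enum step_type type;
	int delay;
	int period;
};

/*
 * Hypothetical helper: fetch the next field, convert it, reject it when
 * _COND_ holds, otherwise store it in the named member. It relies on the
 * caller's local variables field, fctx, tmp and step.
 */
#define INT_FIELD(_STEP_, _FIELD_, _COND_) \
	do { \
		field = strtok_r(NULL, ".", &fctx); \
		if (!field) \
			return -1; \
		tmp = atoi(field); \
		if (_COND_) \
			return -1; \
		step->type = _STEP_; \
		step->_FIELD_ = tmp; \
		return 0; \
	} while (0)

/* Parse "d.<int>" or "p.<int>"; returns 0 on success, -1 on error. */
static int parse_int_step(const char *desc, struct step *step)
{
	char buf[32];
	char *fctx = NULL, *field;
	int tmp;

	assert(desc && step);
	snprintf(buf, sizeof(buf), "%s", desc);

	field = strtok_r(buf, ".", &fctx);
	if (!field)
		return -1;

	if (!strcmp(field, "d"))
		INT_FIELD(STEP_DELAY, delay, tmp <= 0);
	else if (!strcmp(field, "p"))
		INT_FIELD(STEP_PERIOD, period, tmp <= 0);

	return -1; /* unknown command */
}
```

Wrapping the body in do { } while (0) is a choice made for this sketch so
that the macro behaves as a single statement inside an if/else chain.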

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 80 ++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 55 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index baa389c3f0e7..66832f74e34a 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -376,6 +376,15 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 	return 0;
 }
 
+#define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
+	if ((field = strtok_r(fstart, ".", &fctx))) { \
+		tmp = atoi(field); \
+		check_arg(_COND_, _ERR_, nr_steps); \
+		step.type = _STEP_; \
+		step._FIELD_ = tmp; \
+		goto add_step; \
+	} \
+
 static struct workload *
 parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 {
@@ -403,25 +412,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 			fstart = NULL;
 
 			if (!strcmp(field, "d")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp <= 0,
-						  "Invalid delay at step %u!\n",
-						  nr_steps);
-					step.type = DELAY;
-					step.delay = tmp;
-					goto add_step;
-				}
+				int_field(DELAY, delay, tmp <= 0,
+					  "Invalid delay at step %u!\n");
 			} else if (!strcmp(field, "p")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp <= 0,
-						  "Invalid period at step %u!\n",
-						  nr_steps);
-					step.type = PERIOD;
-					step.period = tmp;
-					goto add_step;
-				}
+				int_field(PERIOD, period, tmp <= 0,
+					  "Invalid period at step %u!\n");
 			} else if (!strcmp(field, "P")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -444,46 +439,21 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				step.type = CTX_PRIORITY;
 				goto add_step;
 			} else if (!strcmp(field, "s")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp >= 0 ||
-						  ((int)nr_steps + tmp) < 0,
-						  "Invalid sync target at step %u!\n",
-						  nr_steps);
-					step.type = SYNC;
-					step.target = tmp;
-					goto add_step;
-				}
+				int_field(SYNC, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid sync target at step %u!\n");
 			} else if (!strcmp(field, "t")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp < 0,
-						  "Invalid throttle at step %u!\n",
-						  nr_steps);
-					step.type = THROTTLE;
-					step.throttle = tmp;
-					goto add_step;
-				}
+				int_field(THROTTLE, throttle,
+					  tmp < 0,
+					  "Invalid throttle at step %u!\n");
 			} else if (!strcmp(field, "q")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp < 0,
-						  "Invalid qd throttle at step %u!\n",
-						  nr_steps);
-					step.type = QD_THROTTLE;
-					step.throttle = tmp;
-					goto add_step;
-				}
+				int_field(QD_THROTTLE, throttle,
+					  tmp < 0,
+					  "Invalid qd throttle at step %u!\n");
 			} else if (!strcmp(field, "a")) {
-				if ((field = strtok_r(fstart, ".", &fctx))) {
-					tmp = atoi(field);
-					check_arg(tmp >= 0,
-						  "Invalid sw fence signal at step %u!\n",
-						  nr_steps);
-					step.type = SW_FENCE_SIGNAL;
-					step.target = tmp;
-					goto add_step;
-				}
+				int_field(SW_FENCE_SIGNAL, target,
+					  tmp >= 0,
+					  "Invalid sw fence signal at step %u!\n");
 			} else if (!strcmp(field, "f")) {
 				step.type = SW_FENCE;
 				goto add_step;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH i-g-t 04/15] gem_wsim: Engine map load balance command
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command for enabling a load balanced context map (aka
Virtual Engine). Example usage:

  B.1

This turns on load balancing for context one, assuming it has already been
configured with an engine map. Only the DEFAULT engine specifier can be used
with load-balanced engine maps.
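
The accepted shape of the command can be sketched in isolation as below.
This is a hedged illustration only: parse_load_balance() is a hypothetical
helper, and unlike the real parser it reports errors through its return
value and also rejects a bare "B" with no context field.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the context id (> 0) named by a
 * well-formed "B.<uint>" step, or -1 if the step is malformed.
 */
static int parse_load_balance(const char *desc)
{
	char buf[32];
	char *ctx = NULL, *field;
	unsigned int nr = 0;
	int context = -1;

	assert(desc);
	snprintf(buf, sizeof(buf), "%s", desc);

	field = strtok_r(buf, ".", &ctx);
	if (!field || strcmp(field, "B"))
		return -1;

	while ((field = strtok_r(NULL, ".", &ctx))) {
		int tmp = atoi(field);

		if (nr == 0 && tmp <= 0)
			return -1; /* context ids start at one */
		if (nr > 0)
			return -1; /* only one field is allowed */

		context = tmp;
		nr++;
	}

	return context;
}
```

With this sketch "B.1" yields context 1, while "B.0" and "B.1.2" are
rejected.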

v2:
 * Lift restriction to only use load balancer when enabled in context map.
   (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c  | 66 +++++++++++++++++++++++++++++++++++++-----
 benchmarks/wsim/README | 15 ++++++++++
 2 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 66832f74e34a..d645611cf8c2 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -83,7 +83,8 @@ enum w_type
 	SW_FENCE_SIGNAL,
 	CTX_PRIORITY,
 	PREEMPTION,
-	ENGINE_MAP
+	ENGINE_MAP,
+	LOAD_BALANCE,
 };
 
 struct deps
@@ -121,6 +122,7 @@ struct w_step
 			unsigned int engine_map_count;
 			enum intel_engine_id *engine_map;
 		};
+		bool load_balance;
 	};
 
 	/* Implementation details */
@@ -507,6 +509,25 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = PREEMPTION;
 				goto add_step;
+			} else if (!strcmp(field, "B")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(nr == 0 && tmp <= 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 0,
+						  "Invalid load balance format at step %u!\n",
+						  nr_steps);
+
+					step.context = tmp;
+					step.load_balance = true;
+
+					nr++;
+				}
+
+				step.type = LOAD_BALANCE;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -841,7 +862,7 @@ find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine)
 			return i + 1;
 	}
 
-	igt_assert(0);
+	igt_assert(ctx->wants_balance);
 	return 0;
 }
 
@@ -1073,12 +1094,19 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				wrk->ctx_list[j].engine_map = w->engine_map;
 				wrk->ctx_list[j].engine_map_count =
 					w->engine_map_count;
+			} else if (w->type == LOAD_BALANCE) {
+				if (!wrk->ctx_list[j].engine_map) {
+					wsim_err("Load balancing needs an engine map!\n");
+					return 1;
+				}
+				wrk->ctx_list[j].wants_balance =
+					w->load_balance;
 			}
 		}
 
 		wrk->ctx_list[j].targets_instance = targets;
 		if (flags & I915)
-			wrk->ctx_list[j].wants_balance = balance;
+			wrk->ctx_list[j].wants_balance |= balance;
 	}
 
 	/*
@@ -1092,7 +1120,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			if (w->type != BATCH)
 				continue;
 
-			if (wrk->ctx_list[j].engine_map && w->engine == VCS) {
+			if (wrk->ctx_list[j].engine_map &&
+			    !wrk->ctx_list[j].wants_balance &&
+			    (w->engine == VCS || w->engine == DEFAULT)) {
 				wsim_err("Batches targetting engine maps must use explicit engines!\n");
 				return -1;
 			}
@@ -1140,7 +1170,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				break;
 			}
 
-			if (!ctx->engine_map && !ctx->targets_instance)
+			if ((!ctx->engine_map && !ctx->targets_instance) ||
+			    (ctx->engine_map && ctx->wants_balance))
 				args.flags |=
 				     I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE;
 
@@ -1201,6 +1232,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		if (ctx->engine_map) {
 			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
 							  ctx->engine_map_count + 1);
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
+								 ctx->engine_map_count);
 			struct drm_i915_gem_context_param param = {
 				.ctx_id = ctx_id,
 				.param = I915_CONTEXT_PARAM_ENGINES,
@@ -1208,7 +1241,25 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				.value = to_user_pointer(&set_engines),
 			};
 
-			set_engines.extensions = 0;
+			if (ctx->wants_balance) {
+				set_engines.extensions =
+					to_user_pointer(&load_balance);
+
+				memset(&load_balance, 0, sizeof(load_balance));
+				load_balance.base.name =
+					I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+				load_balance.num_siblings =
+					ctx->engine_map_count;
+
+				for (j = 0; j < ctx->engine_map_count; j++) {
+					load_balance.engines[j].engine_class =
+						I915_ENGINE_CLASS_VIDEO; /* FIXME */
+					load_balance.engines[j].engine_instance =
+						ctx->engine_map[j] - VCS1; /* FIXME */
+				}
+			} else {
+				set_engines.extensions = 0;
+			}
 
 			/* Reserve slot for virtual engine. */
 			set_engines.engines[0].engine_class =
@@ -2196,7 +2247,8 @@ static void *run_workload(void *data)
 				}
 				continue;
 			} else if (w->type == PREEMPTION ||
-				   w->type == ENGINE_MAP) {
+				   w->type == ENGINE_MAP ||
+				   w->type == LOAD_BALANCE) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 53f814a73c73..2c085921e97b 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -3,6 +3,7 @@ Workload descriptor format
 
 ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
@@ -24,6 +25,7 @@ Additional workload steps are also supported:
  'q' - Throttle to n max queue depth.
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
+ 'B' - Turn on context load balancing.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -184,3 +186,16 @@ Example:
 M.1.VCS
 
 This sets up the engine map to all available VCS class engines.
+
+Context load balancing
+----------------------
+
+Context load balancing (aka Virtual Engine) is an i915 feature where the driver
+will pick the best (most idle) engine to submit to, given a previously
+configured engine map.
+
+Example:
+
+  B.1
+
+This enables load balancing for context number one.
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH i-g-t 05/15] gem_wsim: Engine bond command
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Engine bonds are an i915 uAPI applicable to load balanced contexts with an
engine map. They allow expressing rules of engine selection between two
contexts when submissions are also tied with submit fences.

Please refer to the README for a more detailed description.

v2:
 * Use list of symbolic engine names instead of the mask. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c  | 159 +++++++++++++++++++++++++++++++++++++++--
 benchmarks/wsim/README |  50 +++++++++++++
 2 files changed, 202 insertions(+), 7 deletions(-)
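
The engine list field of the new 'b' step ("VCS1|VCS2") is folded into a
bitmask by engine_list_mask() in the diff below. As a rough standalone sketch
of that idea: the enum, the name table and every sketch_* identifier here are
hypothetical stand-ins for illustration, not gem_wsim's own definitions, and
the input is copied first because strtok_r() modifies the string it walks.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical engine ids for this sketch only (not gem_wsim's enum). */
enum sketch_engine { SK_RCS, SK_BCS, SK_VCS1, SK_VCS2, SK_VECS, SK_NUM };

static const char *const sketch_names[SK_NUM] = {
	"RCS", "BCS", "VCS1", "VCS2", "VECS",
};

/* Map a symbolic engine name to its id, or return -1 if it is unknown. */
static int sketch_str_to_engine(const char *name)
{
	for (int i = 0; i < SK_NUM; i++) {
		if (!strcmp(name, sketch_names[i]))
			return i;
	}

	return -1;
}

/*
 * Convert a pipe separated engine list into a bitmask with one bit per
 * engine id. Returns 0 if any token is not a known engine name, or if the
 * list contains no engine at all.
 */
static uint64_t sketch_engine_list_mask(const char *list)
{
	char *copy, *token, *save = NULL;
	uint64_t mask = 0;

	assert(list);

	copy = strdup(list); /* strtok_r() writes into its input */
	if (!copy)
		return 0;

	for (token = strtok_r(copy, "|", &save); token;
	     token = strtok_r(NULL, "|", &save)) {
		int engine = sketch_str_to_engine(token);

		if (engine < 0) {
			mask = 0;
			break;
		}

		/* 64-bit constant so the shift is done at the mask's width. */
		mask |= UINT64_C(1) << engine;
	}

	free(copy);

	return mask;
}
```

With these stand-in ids, "VCS1|VCS2" yields a mask with bits SK_VCS1 and
SK_VCS2 set, while a list containing an unknown name yields 0, which the
caller can treat as a parse error.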

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index d645611cf8c2..6fb6c8cef2c7 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -85,6 +85,7 @@ enum w_type
 	PREEMPTION,
 	ENGINE_MAP,
 	LOAD_BALANCE,
+	BOND,
 };
 
 struct deps
@@ -100,6 +101,11 @@ struct w_arg {
 	int prio;
 };
 
+struct bond {
+	uint64_t mask;
+	enum intel_engine_id master;
+};
+
 struct w_step
 {
 	/* Workload step metadata */
@@ -123,6 +129,10 @@ struct w_step
 			enum intel_engine_id *engine_map;
 		};
 		bool load_balance;
+		struct {
+			uint64_t bond_mask;
+			enum intel_engine_id bond_master;
+		};
 	};
 
 	/* Implementation details */
@@ -152,6 +162,8 @@ struct ctx {
 	int priority;
 	unsigned int engine_map_count;
 	enum intel_engine_id *engine_map;
+	unsigned int bond_count;
+	struct bond *bonds;
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
@@ -378,6 +390,26 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 	return 0;
 }
 
+static uint64_t engine_list_mask(const char *_str)
+{
+	uint64_t mask = 0;
+
+	char *token, *tctx = NULL, *tstart = (char *)_str;
+
+	while ((token = strtok_r(tstart, "|", &tctx))) {
+		enum intel_engine_id engine = str_to_engine(token);
+
+		if ((int)engine < 0 || engine == DEFAULT || engine == VCS)
+			return 0;
+
+		mask |= 1ULL << engine;
+
+		tstart = NULL;
+	}
+
+	return mask;
+}
+
 #define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \
 	if ((field = strtok_r(fstart, ".", &fctx))) { \
 		tmp = atoi(field); \
@@ -528,6 +560,39 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = LOAD_BALANCE;
 				goto add_step;
+			} else if (!strcmp(field, "b")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					check_arg(nr > 2,
+						  "Invalid bond format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0) {
+						tmp = atoi(field);
+						step.context = tmp;
+						check_arg(tmp <= 0,
+							  "Invalid context at step %u!\n",
+							  nr_steps);
+					} else if (nr == 1) {
+						step.bond_mask = engine_list_mask(field);
+						check_arg(step.bond_mask == 0,
+							"Invalid siblings list at step %u!\n",
+							nr_steps);
+					} else if (nr == 2) {
+						tmp = str_to_engine(field);
+						check_arg(tmp <= 0 ||
+							  tmp == VCS ||
+							  tmp == DEFAULT,
+							  "Invalid master engine at step %u!\n",
+							  nr_steps);
+						step.bond_master = tmp;
+					}
+
+					nr++;
+				}
+
+				step.type = BOND;
+				goto add_step;
 			}
 
 			if (!field) {
@@ -1011,6 +1076,31 @@ static void vm_destroy(int i915, uint32_t vm_id)
 	igt_assert_eq(__vm_destroy(i915, vm_id), 0);
 }
 
+static unsigned int
+find_engine(struct i915_engine_class_instance *ci, unsigned int count,
+	    enum intel_engine_id engine)
+{
+	static struct i915_engine_class_instance map[] = {
+		[RCS] = { I915_ENGINE_CLASS_RENDER, 0 },
+		[BCS] = { I915_ENGINE_CLASS_COPY, 0 },
+		[VCS1] = { I915_ENGINE_CLASS_VIDEO, 0 },
+		[VCS2] = { I915_ENGINE_CLASS_VIDEO, 1 },
+		[VECS] = { I915_ENGINE_CLASS_VIDEO_ENHANCE, 0 },
+	};
+	unsigned int i;
+
+	igt_assert(engine < ARRAY_SIZE(map));
+	igt_assert(engine == RCS || map[engine].engine_class);
+
+	for (i = 0; i < count; i++, ci++) {
+		if (!memcmp(&map[engine], ci, sizeof(*ci)))
+			return i;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
 static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
@@ -1078,6 +1168,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 	 * Transfer over engine map configuration from the workload step.
 	 */
 	for (j = 0; j < wrk->nr_ctxs; j += 2) {
+		struct ctx *ctx = &wrk->ctx_list[j];
+
 		bool targets = false;
 		bool balance = false;
 
@@ -1091,16 +1183,28 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				else
 					targets = true;
 			} else if (w->type == ENGINE_MAP) {
-				wrk->ctx_list[j].engine_map = w->engine_map;
-				wrk->ctx_list[j].engine_map_count =
-					w->engine_map_count;
+				ctx->engine_map = w->engine_map;
+				ctx->engine_map_count = w->engine_map_count;
 			} else if (w->type == LOAD_BALANCE) {
-				if (!wrk->ctx_list[j].engine_map) {
+				if (!ctx->engine_map) {
 					wsim_err("Load balancing needs an engine map!\n");
 					return 1;
 				}
-				wrk->ctx_list[j].wants_balance =
-					w->load_balance;
+				ctx->wants_balance = w->load_balance;
+			} else if (w->type == BOND) {
+				if (!ctx->wants_balance) {
+					wsim_err("Engine bonds need load balancing engine map!\n");
+					return 1;
+				}
+				ctx->bond_count++;
+				ctx->bonds = realloc(ctx->bonds,
+						     ctx->bond_count *
+						     sizeof(struct bond));
+				igt_assert(ctx->bonds);
+				ctx->bonds[ctx->bond_count - 1].mask =
+					w->bond_mask;
+				ctx->bonds[ctx->bond_count - 1].master =
+					w->bond_master;
 			}
 		}
 
@@ -1274,6 +1378,46 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 					ctx->engine_map[j - 1] - VCS1; /* FIXME */
 			}
 
+			for (j = 0; j < ctx->bond_count; j++) {
+				unsigned long mask = ctx->bonds[j].mask;
+				I915_DEFINE_CONTEXT_ENGINES_BOND(bond,
+								 __builtin_popcount(mask));
+				struct i915_context_engines_bond *p = NULL, *prev;
+				unsigned int b, e;
+
+				prev = p;
+				p = alloca(sizeof(bond));
+				assert(p);
+				memset(p, 0, sizeof(bond));
+
+				if (j == 0)
+					load_balance.base.next_extension =
+						to_user_pointer(p);
+				else if (j < (ctx->bond_count - 1))
+					prev->base.next_extension =
+						to_user_pointer(p);
+
+				p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
+				p->virtual_index = 0;
+				p->master.engine_class =
+					I915_ENGINE_CLASS_VIDEO;
+				p->master.engine_instance =
+					ctx->bonds[j].master - VCS1;
+
+				for (b = 0, e = 0; mask; e++, mask >>= 1) {
+					unsigned int idx;
+
+					if (!(mask & 1))
+						continue;
+
+					idx = find_engine(&set_engines.engines[1],
+							  ctx->engine_map_count,
+							  e);
+					p->engines[b++] =
+						set_engines.engines[1 + idx];
+				}
+			}
+
 			gem_context_set_param(fd, &param);
 		} else if (ctx->wants_balance) {
 			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
@@ -2248,7 +2392,8 @@ static void *run_workload(void *data)
 				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
-				   w->type == LOAD_BALANCE) {
+				   w->type == LOAD_BALANCE ||
+				   w->type == BOND) {
 				continue;
 			}
 
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 2c085921e97b..c5107326a681 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -7,6 +7,7 @@ B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
 d|p|s|t|q|a.<int>,...
+b.<uint>.<str>[|<str>].<str>
 f
 
 For duration a range can be given from which a random value will be picked
@@ -26,6 +27,7 @@ Additional workload steps are also supported:
  'f' - Create a sync fence.
  'a' - Advance the previously created sync fence.
  'B' - Turn on context load balancing.
+ 'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
  'X' - Context preemption control.
@@ -199,3 +201,51 @@ Example:
   B.1
 
 This enables load balancing for context number one.
+
+Engine bonds
+------------
+
+Engine bonds are extensions on load balanced contexts. They allow expressing
+rules of engine selection between two co-operating contexts tied with submit
+fences. In other words, the rule expression is telling the driver: "If you pick
+this engine for context one, then you have to pick that engine for context two".
+
+Syntax is:
+  b.<context>.<engine_list>.<master_engine>
+
+The engine list is a list of one or more sibling engines separated by a pipe
+character (e.g. "VCS1|VCS2").
+
+There can be multiple bonds tied to the same context.
+
+Example:
+
+  M.1.RCS|VECS
+  B.1
+  M.2.VCS1|VCS2
+  B.2
+  b.2.VCS1.RCS
+  b.2.VCS2.VECS
+
+This tells the driver that if it picked RCS for context one, it has to pick VCS1
+for context two. And if it picked VECS for context one, it has to pick VCS2 for
+context two.
+
+If we extend the above example with more workload directives:
+
+  1.DEFAULT.1000.0.0
+  2.DEFAULT.1000.s-1.0
+
+We get to a fully functional example where two batch buffers are submitted in a
+load balanced fashion, telling the driver they should run simultaneously and
+that valid engine pairs are either RCS + VCS1 (for two contexts respectively),
+or VECS + VCS2.
+
+This can also be extended with sync fences to improve the chances of the first
+submission not reaching the hardware after the second one. The second block
+would then look like:
+
+  f
+  1.DEFAULT.1000.f-1.0
+  2.DEFAULT.1000.s-1.0
+  a.-3
-- 
2.20.1


* [PATCH i-g-t 06/15] gem_wsim: Some more example workloads
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A few additional workloads useful for experimenting with scheduling.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/wsim/frame-split-60fps.wsim      | 16 ++++++++++++++++
 benchmarks/wsim/high-composited-game.wsim   | 11 +++++++++++
 benchmarks/wsim/media-1080p-player.wsim     |  5 +++++
 benchmarks/wsim/medium-composited-game.wsim |  9 +++++++++
 4 files changed, 41 insertions(+)
 create mode 100644 benchmarks/wsim/frame-split-60fps.wsim
 create mode 100644 benchmarks/wsim/high-composited-game.wsim
 create mode 100644 benchmarks/wsim/media-1080p-player.wsim
 create mode 100644 benchmarks/wsim/medium-composited-game.wsim

diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
new file mode 100644
index 000000000000..20fdcf8c8b4a
--- /dev/null
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -0,0 +1,16 @@
+X.1.0
+M.1.VCS1
+B.1
+X.2.0
+M.2.VCS2
+B.2
+b.2.VCS2.VCS1
+f
+1.DEFAULT.4000-6000.f-1.0
+2.DEFAULT.4000-6000.s-1.0
+a.-3
+3.RCS.2000-4000.-3/-2.0
+3.VECS.2000.-1.0
+4.BCS.1000.-1.0
+s.-2
+p.16667
diff --git a/benchmarks/wsim/high-composited-game.wsim b/benchmarks/wsim/high-composited-game.wsim
new file mode 100644
index 000000000000..a90a2b2be95b
--- /dev/null
+++ b/benchmarks/wsim/high-composited-game.wsim
@@ -0,0 +1,11 @@
+1.RCS.500.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+1.RCS.2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
diff --git a/benchmarks/wsim/media-1080p-player.wsim b/benchmarks/wsim/media-1080p-player.wsim
new file mode 100644
index 000000000000..bcbb0cfd2ad3
--- /dev/null
+++ b/benchmarks/wsim/media-1080p-player.wsim
@@ -0,0 +1,5 @@
+1.VCS.5000-10000.0.0
+2.RCS.1000-2000.-1.0
+P.3.1
+3.BCS.1000.-2.0
+p.16667
diff --git a/benchmarks/wsim/medium-composited-game.wsim b/benchmarks/wsim/medium-composited-game.wsim
new file mode 100644
index 000000000000..580883516168
--- /dev/null
+++ b/benchmarks/wsim/medium-composited-game.wsim
@@ -0,0 +1,9 @@
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+1.RCS.1000-2000.0.0
+P.2.1
+2.BCS.1000.-2.0
+2.RCS.2000.-1.1
+p.16667
-- 
2.20.1


* [PATCH i-g-t 07/15] gem_wsim: Infinite batch support
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

For simulating frame split workloads it is useful to express a batch which
ends at the same time as the parallel submission on the respective bonded
engine. For this we add support for infinite batch durations and the batch
terminate command ('T'). Syntax looks like this:

  1.RCS.*.0.0
  T.-1

The first step starts an infinite batch, and the second step terminates it
using the usual relative workload step addressing.

v2: (Chris)
 * Relax the recursive batch with 4096 nops between BB_START.
 * Check for at least gen8.
 * Simplify relocation entry building.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
---
 benchmarks/gem_wsim.c                  | 124 ++++++++++++++++++-------
 benchmarks/wsim/README                 |   9 +-
 benchmarks/wsim/frame-split-60fps.wsim |   6 +-
 3 files changed, 104 insertions(+), 35 deletions(-)
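
The duration field handling changed by this patch can be summarised with a
small standalone sketch: "*" marks an unbound (infinite) batch, a single
number is a fixed duration, and "min-max" is a range whose upper bound must
be greater than the lower one. The struct and function names below are
hypothetical and only mirror the checks visible in the diff; the Gen8+ device
check done by the real parser is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch only. */
struct sketch_duration {
	bool unbound;	/* true when the field was "*" */
	long min_us;
	long max_us;
};

/* Parse a duration field; returns 0 on success and -1 if it is invalid. */
static int sketch_parse_duration(const char *field, struct sketch_duration *d)
{
	char *sep = NULL;
	long val;

	assert(field && d);

	d->unbound = false;
	d->min_us = 0;
	d->max_us = 0;

	/* Infinite batch: it runs until a later 'T' step terminates it. */
	if (field[0] == '*') {
		d->unbound = true;
		return 0;
	}

	val = strtol(field, &sep, 10);
	if (val <= 0 || val == LONG_MAX)
		return -1;

	d->min_us = val;
	d->max_us = val;

	/* Optional "-<max>" suffix turns the fixed value into a range. */
	if (*sep == '-') {
		val = strtol(sep + 1, NULL, 10);
		if (val <= d->min_us || val == LONG_MAX)
			return -1;

		d->max_us = val;
	}

	return 0;
}
```

Parsing "4000-6000" with this sketch gives min_us = 4000 and max_us = 6000,
"1000" gives both fields equal to 1000, and "*" only sets the unbound flag.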

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 6fb6c8cef2c7..47a1ddaf8792 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -86,6 +86,7 @@ enum w_type
 	ENGINE_MAP,
 	LOAD_BALANCE,
 	BOND,
+	TERMINATE,
 };
 
 struct deps
@@ -113,6 +114,7 @@ struct w_step
 	unsigned int context;
 	unsigned int engine;
 	struct duration duration;
+	bool unbound_duration;
 	struct deps data_deps;
 	struct deps fence_deps;
 	int emit_fence;
@@ -143,7 +145,7 @@ struct w_step
 
 	struct drm_i915_gem_execbuffer2 eb;
 	struct drm_i915_gem_exec_object2 *obj;
-	struct drm_i915_gem_relocation_entry reloc[4];
+	struct drm_i915_gem_relocation_entry reloc[5];
 	unsigned long bb_sz;
 	uint32_t bb_handle;
 	uint32_t *seqno_value;
@@ -153,6 +155,7 @@ struct w_step
 	uint32_t *rt1_address;
 	uint32_t *latch_value;
 	uint32_t *latch_address;
+	uint32_t *recursive_bb_start;
 };
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
@@ -517,6 +520,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = ENGINE_MAP;
 				goto add_step;
+			} else if (!strcmp(field, "T")) {
+				int_field(TERMINATE, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid terminate target at step %u!\n");
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -632,23 +639,31 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 			fstart = NULL;
 
-			tmpl = strtol(field, &sep, 10);
-			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
-				  tmpl == LONG_MAX,
-				  "Invalid duration at step %u!\n", nr_steps);
-			step.duration.min = tmpl;
-
-			if (sep && *sep == '-') {
-				tmpl = strtol(sep + 1, NULL, 10);
-				check_arg(tmpl <= 0 ||
-					  tmpl <= step.duration.min ||
-					  tmpl == LONG_MIN ||
-					  tmpl == LONG_MAX,
-					  "Invalid duration range at step %u!\n",
+			if (field[0] == '*') {
+				check_arg(intel_gen(intel_get_drm_devid(fd)) < 8,
+					  "Infinite batch at step %u needs Gen8+!\n",
 					  nr_steps);
-				step.duration.max = tmpl;
+				step.unbound_duration = true;
 			} else {
-				step.duration.max = step.duration.min;
+				tmpl = strtol(field, &sep, 10);
+				check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
+					  tmpl == LONG_MAX,
+					  "Invalid duration at step %u!\n",
+					  nr_steps);
+				step.duration.min = tmpl;
+
+				if (sep && *sep == '-') {
+					tmpl = strtol(sep + 1, NULL, 10);
+					check_arg(tmpl <= 0 ||
+						tmpl <= step.duration.min ||
+						tmpl == LONG_MIN ||
+						tmpl == LONG_MAX,
+						"Invalid duration range at step %u!\n",
+						nr_steps);
+					step.duration.max = tmpl;
+				} else {
+					step.duration.max = step.duration.min;
+				}
 			}
 
 			valid++;
@@ -808,7 +823,7 @@ init_bb(struct w_step *w, unsigned int flags)
 	unsigned int i;
 	uint32_t *ptr;
 
-	if (!arb_period)
+	if (w->unbound_duration || !arb_period)
 		return;
 
 	gem_set_domain(fd, w->bb_handle,
@@ -822,12 +837,13 @@ init_bb(struct w_step *w, unsigned int flags)
 	munmap(ptr, mmap_len);
 }
 
-static void
+static unsigned int
 terminate_bb(struct w_step *w, unsigned int flags)
 {
 	const uint32_t bbe = 0xa << 23;
 	unsigned long mmap_start, mmap_len;
 	unsigned long batch_start = w->bb_sz;
+	unsigned int r = 0;
 	uint32_t *ptr, *cs;
 
 	igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
@@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	if (flags & RT)
 		batch_start -= 12 * sizeof(uint32_t);
 
+	if (w->unbound_duration)
+		batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
+
 	mmap_start = rounddown(batch_start, PAGE_SIZE);
 	mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
 
@@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
 	cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
 
+	if (w->unbound_duration) {
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
+		batch_start += 4 * sizeof(uint32_t);
+
+		*cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
+		w->recursive_bb_start = cs;
+		*cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
+		*cs++ = 0;
+		*cs++ = 0;
+	}
+
 	if (flags & SEQNO) {
-		w->reloc[0].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -860,7 +890,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	if (flags & RT) {
-		w->reloc[1].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -870,7 +900,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		w->rt0_value = cs;
 		*cs++ = 0;
 
-		w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
@@ -879,7 +909,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		*cs++ = 0;
 		*cs++ = 0;
 
-		w->reloc[3].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -891,6 +921,8 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	*cs = bbe;
+
+	return r;
 }
 
 static const unsigned int eb_engine_map[NUM_ENGINES] = {
@@ -1011,19 +1043,22 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		}
 	}
 
-	w->bb_sz = get_bb_sz(w->duration.max);
-	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
+	if (w->unbound_duration)
+		/* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
+		w->bb_sz = max(PAGE_SIZE, get_bb_sz(w->preempt_us)) +
+			   (1 + 3) * sizeof(uint32_t);
+	else
+		w->bb_sz = get_bb_sz(w->duration.max);
+	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
 	init_bb(w, flags);
-	terminate_bb(w, flags);
+	w->obj[j].relocation_count = terminate_bb(w, flags);
 
-	if (flags & SEQNO) {
+	if (w->obj[j].relocation_count) {
 		w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
-		if (flags & RT)
-			w->obj[j].relocation_count = 4;
-		else
-			w->obj[j].relocation_count = 1;
 		for (i = 0; i < w->obj[j].relocation_count; i++)
 			w->reloc[i].target_handle = 1;
+		if (w->unbound_duration)
+			w->reloc[0].target_handle = j;
 	}
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
@@ -2113,6 +2148,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
 	}
 }
 
+static void
+update_bb_start(struct w_step *w)
+{
+	if (!w->unbound_duration)
+		return;
+
+	gem_set_domain(fd, w->bb_handle,
+		       I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
+
+	*w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
+}
+
 static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
 {
 	if (target < 0)
@@ -2248,9 +2295,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	if (flags & RT)
 		update_bb_rt(w, engine, seqno);
 
+	update_bb_start(w);
+
 	w->eb.batch_start_offset =
+		w->unbound_duration ?
+		0 :
 		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
-			2 * sizeof(uint32_t));
+		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
 		int tgt = w->idx + w->fence_deps.list[i];
@@ -2390,6 +2441,17 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
+			} else if (w->type == TERMINATE) {
+				unsigned int t_idx = i + w->target;
+
+				igt_assert(t_idx >= 0 && t_idx < i);
+				igt_assert(wrk->steps[t_idx].type == BATCH);
+				igt_assert(wrk->steps[t_idx].unbound_duration);
+
+				*wrk->steps[t_idx].recursive_bb_start =
+					MI_BATCH_BUFFER_END;
+				__sync_synchronize();
+				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
 				   w->type == LOAD_BALANCE ||
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index c5107326a681..497d5cad2142 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -2,11 +2,11 @@ Workload descriptor format
 ==========================
 
 ctx.engine.duration_us.dependency.wait,...
-<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
-d|p|s|t|q|a.<int>,...
+d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
 
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
 Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
@@ -77,6 +78,10 @@ Example:
 
 I this case the last step has a data dependency on both first and second steps.
 
+Batch durations can also be specified as infinite by using the '*' in the
+duration field. Such batches must be ended by the terminate command ('T')
+otherwise they will cause a GPU hang to be reported.
+
 Sync (fd) fences
 ----------------
 
diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
index 20fdcf8c8b4a..17490ddfaddd 100644
--- a/benchmarks/wsim/frame-split-60fps.wsim
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -6,10 +6,12 @@ M.2.VCS2
 B.2
 b.2.VCS2.VCS1
 f
-1.DEFAULT.4000-6000.f-1.0
+1.DEFAULT.*.f-1.0
 2.DEFAULT.4000-6000.s-1.0
 a.-3
-3.RCS.2000-4000.-3/-2.0
+s.-2
+T.-4
+3.RCS.2000-4000.-5/-4.0
 3.VECS.2000.-1.0
 4.BCS.1000.-1.0
 s.-2
-- 
2.20.1


* [igt-dev] [PATCH i-g-t 07/15] gem_wsim: Infinite batch support
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

For simulating frame split workloads it is useful to express a batch which
ends at the same time as the parallel submission on the respective bonded
engine. For this we add support for infinite batch durations and the batch
terminate command ('T'). Syntax looks like this:

  1.RCS.*.0.0
  T.-1

The first step starts an infinite batch, and the second step terminates it
using the usual relative workload step addressing.
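
As an illustration only, the idea can be modelled in plain C without touching the GPU: a batch that jumps back to its own start keeps running until the CPU overwrites the jump with an end marker, which is what the 'T' step does. The opcodes and the names `run_batch` and `resolve_target` below are hypothetical and exist only for this sketch; they are not the real command encodings or gem_wsim functions.

```c
#include <assert.h>
#include <stdint.h>

/* Toy opcodes for this model only; not real GPU command encodings. */
enum { OP_NOOP, OP_BB_START, OP_BB_END };

/*
 * Toy command streamer: executes batch[] from index 0.
 * OP_BB_START jumps back to the start, OP_BB_END stops execution.
 * After terminate_after completed loops the "CPU" overwrites the jump
 * at jump_idx with OP_BB_END, modelling the 'T' step.
 * Returns the number of completed loops.
 */
static unsigned int run_batch(uint32_t *batch, unsigned int len,
			      unsigned int jump_idx,
			      unsigned int terminate_after)
{
	unsigned int loops = 0, pc = 0;

	while (pc < len) {
		if (loops == terminate_after)
			batch[jump_idx] = OP_BB_END;

		switch (batch[pc]) {
		case OP_BB_START:
			loops++;
			pc = 0;
			break;
		case OP_BB_END:
			return loops;
		default:
			pc++;
		}
	}

	return loops;
}

/*
 * Relative step addressing: 'T.-1' at step index 1 targets step 0.
 * Returns the absolute step index, or -1 if the target is invalid
 * (not strictly negative, or pointing before the first step).
 */
static int resolve_target(unsigned int step, int rel)
{
	if (rel >= 0 || (int)step + rel < 0)
		return -1;

	return (int)step + rel;
}
```

In this model the batch keeps looping for as long as the jump is left in place, which matches the README warning that an unterminated infinite batch ends in a reported GPU hang.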

v2: (Chris)
 * Relax the recursive batch with 4096 nops between BB_START.
 * Check for at least gen8.
 * Simplify relocation entry building.
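
The simplified relocation building can be sketched as a running counter that each optional batch section advances, so the caller no longer hard-codes 1 or 4 entries. This is a rough standalone model, not the patch itself: the `SK_*` flag values and `build_relocs` are hypothetical names, and the one dword reserved for the batch end is an assumption about code outside the visible hunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; gem_wsim defines its own. */
#define SK_SEQNO   (1u << 0)
#define SK_RT      (1u << 1)
#define SK_UNBOUND (1u << 2)

#define SK_DW sizeof(uint32_t)

/*
 * Record the byte offset of every dword that needs a relocation and
 * return how many entries were written. At most five are produced:
 * one for the recursive batch start, one for the seqno write and
 * three for the RT section.
 */
static unsigned int build_relocs(size_t bb_sz, unsigned int flags,
				 size_t offsets[5])
{
	size_t pos = bb_sz - SK_DW; /* assumed: one dword for batch end */
	unsigned int r = 0;

	/* Reserve tail space for each enabled section. */
	if (flags & SK_SEQNO)
		pos -= 4 * SK_DW;
	if (flags & SK_RT)
		pos -= 12 * SK_DW;
	if (flags & SK_UNBOUND)
		pos -= 4 * SK_DW;

	if (flags & SK_UNBOUND) {
		/* Arbitration check/noop, batch start, then the address. */
		offsets[r++] = pos + 2 * SK_DW;
		pos += 4 * SK_DW;
	}

	if (flags & SK_SEQNO) {
		offsets[r++] = pos + SK_DW;
		pos += 4 * SK_DW;
	}

	if (flags & SK_RT) {
		offsets[r++] = pos + SK_DW;
		pos += 4 * SK_DW;
		offsets[r++] = pos + 2 * SK_DW;
		pos += 4 * SK_DW;
		offsets[r++] = pos + SK_DW;
		pos += 4 * SK_DW;
	}

	return r;
}
```

With all three sections enabled the counter reaches five, which is consistent with the `reloc[]` array growing from four to five entries in this patch.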

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
---
 benchmarks/gem_wsim.c                  | 124 ++++++++++++++++++-------
 benchmarks/wsim/README                 |   9 +-
 benchmarks/wsim/frame-split-60fps.wsim |   6 +-
 3 files changed, 104 insertions(+), 35 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 6fb6c8cef2c7..47a1ddaf8792 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -86,6 +86,7 @@ enum w_type
 	ENGINE_MAP,
 	LOAD_BALANCE,
 	BOND,
+	TERMINATE,
 };
 
 struct deps
@@ -113,6 +114,7 @@ struct w_step
 	unsigned int context;
 	unsigned int engine;
 	struct duration duration;
+	bool unbound_duration;
 	struct deps data_deps;
 	struct deps fence_deps;
 	int emit_fence;
@@ -143,7 +145,7 @@ struct w_step
 
 	struct drm_i915_gem_execbuffer2 eb;
 	struct drm_i915_gem_exec_object2 *obj;
-	struct drm_i915_gem_relocation_entry reloc[4];
+	struct drm_i915_gem_relocation_entry reloc[5];
 	unsigned long bb_sz;
 	uint32_t bb_handle;
 	uint32_t *seqno_value;
@@ -153,6 +155,7 @@ struct w_step
 	uint32_t *rt1_address;
 	uint32_t *latch_value;
 	uint32_t *latch_address;
+	uint32_t *recursive_bb_start;
 };
 
 DECLARE_EWMA(uint64_t, rt, 4, 2)
@@ -517,6 +520,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 				step.type = ENGINE_MAP;
 				goto add_step;
+			} else if (!strcmp(field, "T")) {
+				int_field(TERMINATE, target,
+					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
+					  "Invalid terminate target at step %u!\n");
 			} else if (!strcmp(field, "X")) {
 				unsigned int nr = 0;
 				while ((field = strtok_r(fstart, ".", &fctx))) {
@@ -632,23 +639,31 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 
 			fstart = NULL;
 
-			tmpl = strtol(field, &sep, 10);
-			check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
-				  tmpl == LONG_MAX,
-				  "Invalid duration at step %u!\n", nr_steps);
-			step.duration.min = tmpl;
-
-			if (sep && *sep == '-') {
-				tmpl = strtol(sep + 1, NULL, 10);
-				check_arg(tmpl <= 0 ||
-					  tmpl <= step.duration.min ||
-					  tmpl == LONG_MIN ||
-					  tmpl == LONG_MAX,
-					  "Invalid duration range at step %u!\n",
+			if (field[0] == '*') {
+				check_arg(intel_gen(intel_get_drm_devid(fd)) < 8,
+					  "Infinite batch at step %u needs Gen8+!\n",
 					  nr_steps);
-				step.duration.max = tmpl;
+				step.unbound_duration = true;
 			} else {
-				step.duration.max = step.duration.min;
+				tmpl = strtol(field, &sep, 10);
+				check_arg(tmpl <= 0 || tmpl == LONG_MIN ||
+					  tmpl == LONG_MAX,
+					  "Invalid duration at step %u!\n",
+					  nr_steps);
+				step.duration.min = tmpl;
+
+				if (sep && *sep == '-') {
+					tmpl = strtol(sep + 1, NULL, 10);
+					check_arg(tmpl <= 0 ||
+						tmpl <= step.duration.min ||
+						tmpl == LONG_MIN ||
+						tmpl == LONG_MAX,
+						"Invalid duration range at step %u!\n",
+						nr_steps);
+					step.duration.max = tmpl;
+				} else {
+					step.duration.max = step.duration.min;
+				}
 			}
 
 			valid++;
@@ -808,7 +823,7 @@ init_bb(struct w_step *w, unsigned int flags)
 	unsigned int i;
 	uint32_t *ptr;
 
-	if (!arb_period)
+	if (w->unbound_duration || !arb_period)
 		return;
 
 	gem_set_domain(fd, w->bb_handle,
@@ -822,12 +837,13 @@ init_bb(struct w_step *w, unsigned int flags)
 	munmap(ptr, mmap_len);
 }
 
-static void
+static unsigned int
 terminate_bb(struct w_step *w, unsigned int flags)
 {
 	const uint32_t bbe = 0xa << 23;
 	unsigned long mmap_start, mmap_len;
 	unsigned long batch_start = w->bb_sz;
+	unsigned int r = 0;
 	uint32_t *ptr, *cs;
 
 	igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
@@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	if (flags & RT)
 		batch_start -= 12 * sizeof(uint32_t);
 
+	if (w->unbound_duration)
+		batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
+
 	mmap_start = rounddown(batch_start, PAGE_SIZE);
 	mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
 
@@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
 	cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
 
+	if (w->unbound_duration) {
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
+		batch_start += 4 * sizeof(uint32_t);
+
+		*cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
+		w->recursive_bb_start = cs;
+		*cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
+		*cs++ = 0;
+		*cs++ = 0;
+	}
+
 	if (flags & SEQNO) {
-		w->reloc[0].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -860,7 +890,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	if (flags & RT) {
-		w->reloc[1].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -870,7 +900,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		w->rt0_value = cs;
 		*cs++ = 0;
 
-		w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
@@ -879,7 +909,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
 		*cs++ = 0;
 		*cs++ = 0;
 
-		w->reloc[3].offset = batch_start + sizeof(uint32_t);
+		w->reloc[r++].offset = batch_start + sizeof(uint32_t);
 		batch_start += 4 * sizeof(uint32_t);
 
 		*cs++ = MI_STORE_DWORD_IMM;
@@ -891,6 +921,8 @@ terminate_bb(struct w_step *w, unsigned int flags)
 	}
 
 	*cs = bbe;
+
+	return r;
 }
 
 static const unsigned int eb_engine_map[NUM_ENGINES] = {
@@ -1011,19 +1043,22 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
 		}
 	}
 
-	w->bb_sz = get_bb_sz(w->duration.max);
-	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
+	if (w->unbound_duration)
+		/* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
+		w->bb_sz = max(PAGE_SIZE, get_bb_sz(w->preempt_us)) +
+			   (1 + 3) * sizeof(uint32_t);
+	else
+		w->bb_sz = get_bb_sz(w->duration.max);
+	w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
 	init_bb(w, flags);
-	terminate_bb(w, flags);
+	w->obj[j].relocation_count = terminate_bb(w, flags);
 
-	if (flags & SEQNO) {
+	if (w->obj[j].relocation_count) {
 		w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
-		if (flags & RT)
-			w->obj[j].relocation_count = 4;
-		else
-			w->obj[j].relocation_count = 1;
 		for (i = 0; i < w->obj[j].relocation_count; i++)
 			w->reloc[i].target_handle = 1;
+		if (w->unbound_duration)
+			w->reloc[0].target_handle = j;
 	}
 
 	w->eb.buffers_ptr = to_user_pointer(w->obj);
@@ -2113,6 +2148,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno)
 	}
 }
 
+static void
+update_bb_start(struct w_step *w)
+{
+	if (!w->unbound_duration)
+		return;
+
+	gem_set_domain(fd, w->bb_handle,
+		       I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
+
+	*w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1;
+}
+
 static void w_sync_to(struct workload *wrk, struct w_step *w, int target)
 {
 	if (target < 0)
@@ -2248,9 +2295,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	if (flags & RT)
 		update_bb_rt(w, engine, seqno);
 
+	update_bb_start(w);
+
 	w->eb.batch_start_offset =
+		w->unbound_duration ?
+		0 :
 		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
-			2 * sizeof(uint32_t));
+		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
 		int tgt = w->idx + w->fence_deps.list[i];
@@ -2390,6 +2441,17 @@ static void *run_workload(void *data)
 								    w->priority;
 				}
 				continue;
+			} else if (w->type == TERMINATE) {
+				unsigned int t_idx = i + w->target;
+
+				igt_assert(t_idx >= 0 && t_idx < i);
+				igt_assert(wrk->steps[t_idx].type == BATCH);
+				igt_assert(wrk->steps[t_idx].unbound_duration);
+
+				*wrk->steps[t_idx].recursive_bb_start =
+					MI_BATCH_BUFFER_END;
+				__sync_synchronize();
+				continue;
 			} else if (w->type == PREEMPTION ||
 				   w->type == ENGINE_MAP ||
 				   w->type == LOAD_BALANCE ||
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index c5107326a681..497d5cad2142 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -2,11 +2,11 @@ Workload descriptor format
 ==========================
 
 ctx.engine.duration_us.dependency.wait,...
-<uint>.<str>.<uint>[-<uint>].<int <= 0>[/<int <= 0>][...].<0|1>,...
+<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
 P|X.<uint>.<int>
-d|p|s|t|q|a.<int>,...
+d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
 
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
 Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS
@@ -77,6 +78,10 @@ Example:
 
 I this case the last step has a data dependency on both first and second steps.
 
+Batch durations can also be specified as infinite by using the '*' in the
+duration field. Such batches must be ended by the terminate command ('T')
+otherwise they will cause a GPU hang to be reported.
+
 Sync (fd) fences
 ----------------
 
diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim
index 20fdcf8c8b4a..17490ddfaddd 100644
--- a/benchmarks/wsim/frame-split-60fps.wsim
+++ b/benchmarks/wsim/frame-split-60fps.wsim
@@ -6,10 +6,12 @@ M.2.VCS2
 B.2
 b.2.VCS2.VCS1
 f
-1.DEFAULT.4000-6000.f-1.0
+1.DEFAULT.*.f-1.0
 2.DEFAULT.4000-6000.s-1.0
 a.-3
-3.RCS.2000-4000.-3/-2.0
+s.-2
+T.-4
+3.RCS.2000-4000.-5/-4.0
 3.VECS.2000.-1.0
 4.BCS.1000.-1.0
 s.-2
-- 
2.20.1


* [PATCH i-g-t 08/15] gem_wsim: Command line switch for specifying low slice count workloads
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new command line switch ('-s') is added which toggles the low slice
count mode for workloads following on the command line.

This enables easy benchmarking of the effect of running the existing media
workloads in parallel against another client. For example:

  ./gem_wsim -n ... -v -r 600 -W master.wsim -s -w media_nn480.wsim

Adding or removing the '-s' switch before the second workload makes it
possible to analyze the cost that dynamic SSEU switching imposes on the
first (master) workload.
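
The toggle semantics can be sketched in isolation: each '-s' flips a bit, and every workload argument captures the state of that bit at the point where it appears on the command line. The names `sk_parse`, `struct sk_workload` and `SK_SSEU` are hypothetical and only a subset of the real option string is modelled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

#define SK_SSEU (1u << 11) /* same bit position as the patch, name is illustrative */
#define SK_MAX_W 8

struct sk_workload {
	const char *file;
	bool sseu;
};

/*
 * Parse only the options relevant to this sketch. Returns the number
 * of workloads recorded in w[].
 */
static unsigned int sk_parse(int argc, char **argv, struct sk_workload *w)
{
	unsigned int flags = 0, nr = 0;
	int c;

	optind = 1; /* restart scanning so the function can be called again */

	while ((c = getopt(argc, argv, "sw:W:")) != -1) {
		switch (c) {
		case 's':
			flags ^= SK_SSEU; /* toggle, do not just set */
			break;
		case 'W':
		case 'w':
			if (nr < SK_MAX_W) {
				w[nr].file = optarg;
				w[nr].sseu = flags & SK_SSEU;
				nr++;
			}
			break;
		default:
			break;
		}
	}

	return nr;
}
```

For the command line shown above, the master workload would be recorded with the flag off and the media workload with it on; a second '-s' before a later workload would switch it off again.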

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 44 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 47a1ddaf8792..8dd887a5afd8 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -100,6 +100,7 @@ struct w_arg {
 	char *filename;
 	char *desc;
 	int prio;
+	bool sseu;
 };
 
 struct bond {
@@ -179,6 +180,7 @@ struct workload
 	unsigned int nr_steps;
 	struct w_step *steps;
 	int prio;
+	bool sseu;
 
 	pthread_t thread;
 	bool run;
@@ -251,6 +253,7 @@ static int fd;
 #define GLOBAL_BALANCE	(1<<8)
 #define DEPSYNC		(1<<9)
 #define I915		(1<<10)
+#define SSEU		(1<<11)
 
 #define SEQNO_IDX(engine) ((engine) * 16)
 #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t))
@@ -726,6 +729,7 @@ add_step:
 	wrk->nr_steps = nr_steps;
 	wrk->steps = steps;
 	wrk->prio = arg->prio;
+	wrk->sseu = arg->sseu;
 
 	free(desc);
 
@@ -771,6 +775,7 @@ clone_workload(struct workload *_wrk)
 	memset(wrk, 0, sizeof(*wrk));
 
 	wrk->prio = _wrk->prio;
+	wrk->sseu = _wrk->sseu;
 	wrk->nr_steps = _wrk->nr_steps;
 	wrk->steps = calloc(wrk->nr_steps, sizeof(struct w_step));
 	igt_assert(wrk->steps);
@@ -1136,6 +1141,26 @@ find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	return 0;
 }
 
+static void
+set_ctx_sseu(uint32_t ctx)
+{
+	struct drm_i915_gem_context_param_sseu sseu = { };
+	struct drm_i915_gem_context_param param = { };
+
+	sseu.class = I915_ENGINE_CLASS_RENDER;
+	sseu.instance = 0;
+
+	param.ctx_id = ctx;
+	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.value = (uintptr_t)&sseu;
+
+	gem_context_get_param(fd, &param);
+
+	sseu.slice_mask = 1;
+
+	gem_context_set_param(fd, &param);
+}
+
 static int
 prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 {
@@ -1487,6 +1512,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
+		if (wrk->sseu)
+			set_ctx_sseu(arg.ctx_id);
+
 		if (share_vm)
 			vm_destroy(fd, share_vm);
 	}
@@ -2661,6 +2689,8 @@ static void print_help(void)
 "  -R              Round-robin initial VCS assignment per client.\n"
 "  -H              Send heartbeat on synchronisation points with seqno based\n"
 "                  balancers. Gives better engine busyness view in some cases.\n"
+"  -s              Turn on small SSEU config for the next workload on the\n"
+"                  command line. Subsequent -s switches it off.\n"
 "  -S              Synchronize the sequence of random batch durations between\n"
 "                  clients.\n"
 "  -G              Global load balancing - a single load balancer will be shared\n"
@@ -2703,11 +2733,12 @@ static char *load_workload_descriptor(char *filename)
 }
 
 static struct w_arg *
-add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg, int prio)
+add_workload_arg(struct w_arg *w_args, unsigned int nr_args, char *w_arg,
+		 int prio, bool sseu)
 {
 	w_args = realloc(w_args, sizeof(*w_args) * nr_args);
 	igt_assert(w_args);
-	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio };
+	w_args[nr_args - 1] = (struct w_arg) { w_arg, NULL, prio, sseu };
 
 	return w_args;
 }
@@ -2800,7 +2831,8 @@ int main(int argc, char **argv)
 
 	init_clocks();
 
-	while ((c = getopt(argc, argv, "hqv2RSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
+	while ((c = getopt(argc, argv,
+			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
@@ -2810,7 +2842,8 @@ int main(int argc, char **argv)
 			master_workload = nr_w_args;
 			/* Fall through */
 		case 'w':
-			w_args = add_workload_arg(w_args, ++nr_w_args, optarg, prio);
+			w_args = add_workload_arg(w_args, ++nr_w_args, optarg,
+						  prio, flags & SSEU);
 			break;
 		case 'p':
 			prio = atoi(optarg);
@@ -2852,6 +2885,9 @@ int main(int argc, char **argv)
 		case 'S':
 			flags |= SYNCEDCLIENTS;
 			break;
+		case 's':
+			flags ^= SSEU;
+			break;
 		case 'H':
 			flags |= HEARTBEAT;
 			break;
-- 
2.20.1


* [PATCH i-g-t 09/15] gem_wsim: Per context SSEU control
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command ('S') is added which allows per context slice
(re-)configuration.
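
A minimal sketch of how such a step could be parsed and how the special mask value -1 resolves to "all slices" is given below. It is a simplified standalone model: `sk_parse_sseu`, `sk_resolve_mask` and `struct sk_sseu_step` are hypothetical names, and the device slice mask is passed in as a plain argument rather than queried from the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sk_sseu_step {
	unsigned int ctx;
	int mask;
};

/* Parse "S.<ctx>.<mask>"; returns 0 on success, -1 on malformed input. */
static int sk_parse_sseu(const char *desc, struct sk_sseu_step *out)
{
	char buf[64], *save = NULL, *tok;
	unsigned int nr = 0;

	if (strlen(desc) >= sizeof(buf))
		return -1;
	strcpy(buf, desc);

	tok = strtok_r(buf, ".", &save);
	if (!tok || strcmp(tok, "S"))
		return -1;

	while ((tok = strtok_r(NULL, ".", &save))) {
		int v = atoi(tok);

		if (nr == 0) {
			if (v <= 0) /* context ids start at 1 */
				return -1;
			out->ctx = v;
		} else if (nr == 1) {
			out->mask = v;
		} else {
			return -1; /* too many fields */
		}
		nr++;
	}

	return nr == 2 ? 0 : -1;
}

/* A requested mask of -1 means "all slices the device reports". */
static uint64_t sk_resolve_mask(int requested, uint64_t device_mask)
{
	return requested == -1 ? device_mask : (uint64_t)requested;
}
```

Any mask other than 1 and -1 depends on the slice layout of a particular GPU, so workloads using other values may not be portable between devices.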

v2:
 * Only query device SSEU on first use. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c  | 83 ++++++++++++++++++++++++++++++++++++------
 benchmarks/wsim/README | 23 +++++++++++-
 2 files changed, 94 insertions(+), 12 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 8dd887a5afd8..ede505e537fd 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 	LOAD_BALANCE,
 	BOND,
 	TERMINATE,
+	SSEU
 };
 
 struct deps
@@ -136,6 +137,7 @@ struct w_step
 			uint64_t bond_mask;
 			enum intel_engine_id bond_master;
 		};
+		int sseu;
 	};
 
 	/* Implementation details */
@@ -171,6 +173,7 @@ struct ctx {
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
+	uint64_t sseu;
 };
 
 struct workload
@@ -241,6 +244,9 @@ static unsigned int context_vcs_rr;
 
 static int verbose = 1;
 static int fd;
+static struct drm_i915_gem_context_param_sseu device_sseu = {
+	.slice_mask = -1 /* Force read on first use. */
+};
 
 #define SWAPVCS		(1<<0)
 #define SEQNO		(1<<1)
@@ -482,6 +488,27 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				int_field(SYNC, target,
 					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
 					  "Invalid sync target at step %u!\n");
+			} else if (!strcmp(field, "S")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(tmp <= 0 && nr == 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid SSEU format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
+						step.context = tmp;
+					else if (nr == 1)
+						step.sseu = tmp;
+
+					nr++;
+				}
+
+				step.type = SSEU;
+				goto add_step;
 			} else if (!strcmp(field, "t")) {
 				int_field(THROTTLE, throttle,
 					  tmp < 0,
@@ -1141,24 +1168,38 @@ find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	return 0;
 }
 
-static void
-set_ctx_sseu(uint32_t ctx)
+static struct drm_i915_gem_context_param_sseu get_device_sseu(void)
 {
-	struct drm_i915_gem_context_param_sseu sseu = { };
 	struct drm_i915_gem_context_param param = { };
 
-	sseu.class = I915_ENGINE_CLASS_RENDER;
-	sseu.instance = 0;
+	if (device_sseu.slice_mask == -1) {
+		param.param = I915_CONTEXT_PARAM_SSEU;
+		param.value = (uintptr_t)&device_sseu;
+
+		gem_context_get_param(fd, &param);
+	}
+
+	return device_sseu;
+}
+
+static uint64_t
+set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
+{
+	struct drm_i915_gem_context_param_sseu sseu = get_device_sseu();
+	struct drm_i915_gem_context_param param = { };
+
+	if (slice_mask == -1)
+		slice_mask = device_sseu.slice_mask;
+
+	sseu.slice_mask = slice_mask;
 
 	param.ctx_id = ctx;
 	param.param = I915_CONTEXT_PARAM_SSEU;
 	param.value = (uintptr_t)&sseu;
 
-	gem_context_get_param(fd, &param);
-
-	sseu.slice_mask = 1;
-
 	gem_context_set_param(fd, &param);
+
+	return slice_mask;
 }
 
 static int
@@ -1352,6 +1393,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		igt_assert(ctx_id);
 		ctx->id = ctx_id;
+		ctx->sseu = device_sseu.slice_mask;
 
 		if (flags & GLOBAL_BALANCE) {
 			ctx->static_vcs = context_vcs_rr;
@@ -1512,8 +1554,10 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
-		if (wrk->sseu)
-			set_ctx_sseu(arg.ctx_id);
+		if (wrk->sseu) {
+			/* Set to slice 0 only, one slice. */
+			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+		}
 
 		if (share_vm)
 			vm_destroy(fd, share_vm);
@@ -1550,6 +1594,16 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Scan for SSEU control steps.
+	 */
+	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+		if (w->type == SSEU) {
+			get_device_sseu();
+			break;
+		}
+	}
+
 	/*
 	 * Allocate batch buffers.
 	 */
@@ -2485,6 +2539,13 @@ static void *run_workload(void *data)
 				   w->type == LOAD_BALANCE ||
 				   w->type == BOND) {
 				continue;
+			} else if (w->type == SSEU) {
+				if (w->sseu != wrk->ctx_list[w->context].sseu) {
+					wrk->ctx_list[w->context].sseu =
+						set_ctx_sseu(wrk->ctx_list[w->context].id,
+							     w->sseu);
+				}
+				continue;
 			}
 
 			if (do_sleep || w->type == PERIOD) {
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 497d5cad2142..9f770217f075 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -5,7 +5,7 @@ ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
-P|X.<uint>.<int>
+P|S|X.<uint>.<int>
 d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'S' - Context SSEU configuration.
  'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
@@ -254,3 +255,23 @@ then look like:
   1.DEFAULT.1000.f-1.0
   2.DEFAULT.1000.s-1.0
   a.-3
+
+Context SSEU configuration
+--------------------------
+
+  S.1.1
+  1.RCS.1000.0.0
+  S.2.-1
+  2.RCS.1000.0.0
+
+Context 1 is configured to run with one enabled slice (slice mask 1) and a batch
+is submitted against it. Context 2 is configured to run with all slices (this is
+the default so the command could also be omitted) and a batch is submitted against
+it.
+
+This shows the dynamic SSEU reconfiguration cost between two contexts competing
+for the render engine.
+
+Slice mask of -1 has a special meaning of "all slices". Otherwise any integer
+can be specified as the slice mask, but beware any apart from 1 and -1 can make
+the workload not portable between different GPUs.
-- 
2.20.1
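
The v2 note on this patch ("Only query device SSEU on first use") and the -1 "all slices" convention can be illustrated with a small standalone sketch. This is not the gem_wsim code: query_device_slice_mask() is a hypothetical stand-in for the real context parameter query, the 0x7 mask is an invented value, and unlike the patch the sketch only touches the cache when -1 is actually requested.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel query; counts how often it runs. */
static unsigned int nr_queries;

static uint64_t query_device_slice_mask(void)
{
	nr_queries++;
	return 0x7; /* Invented value: pretend the device has three slices. */
}

/* All bits set marks the cache as "not read yet", mirroring the -1 initialiser. */
static uint64_t cached_slice_mask = UINT64_MAX;

static uint64_t get_device_slice_mask(void)
{
	/* Only the first caller pays for the query. */
	if (cached_slice_mask == UINT64_MAX)
		cached_slice_mask = query_device_slice_mask();

	return cached_slice_mask;
}

/* Resolve a requested mask; -1 means "all slices the device has". */
static uint64_t resolve_slice_mask(int requested)
{
	if (requested == -1)
		return get_device_slice_mask();

	return (uint64_t)requested;
}
```

Calling resolve_slice_mask(1) never triggers the query, while repeated calls with -1 trigger it exactly once. The sentinel only works because an all-ones mask is assumed never to be a valid device answer.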

* [igt-dev] [PATCH i-g-t 09/15] gem_wsim: Per context SSEU control
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

A new workload command ('S') is added which allows per context slice
(re-)configuration.

v2:
 * Only query device SSEU on first use. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c  | 83 ++++++++++++++++++++++++++++++++++++------
 benchmarks/wsim/README | 23 +++++++++++-
 2 files changed, 94 insertions(+), 12 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 8dd887a5afd8..ede505e537fd 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -87,6 +87,7 @@ enum w_type
 	LOAD_BALANCE,
 	BOND,
 	TERMINATE,
+	SSEU
 };
 
 struct deps
@@ -136,6 +137,7 @@ struct w_step
 			uint64_t bond_mask;
 			enum intel_engine_id bond_master;
 		};
+		int sseu;
 	};
 
 	/* Implementation details */
@@ -171,6 +173,7 @@ struct ctx {
 	bool targets_instance;
 	bool wants_balance;
 	unsigned int static_vcs;
+	uint64_t sseu;
 };
 
 struct workload
@@ -241,6 +244,9 @@ static unsigned int context_vcs_rr;
 
 static int verbose = 1;
 static int fd;
+static struct drm_i915_gem_context_param_sseu device_sseu = {
+	.slice_mask = -1 /* Force read on first use. */
+};
 
 #define SWAPVCS		(1<<0)
 #define SEQNO		(1<<1)
@@ -482,6 +488,27 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w)
 				int_field(SYNC, target,
 					  tmp >= 0 || ((int)nr_steps + tmp) < 0,
 					  "Invalid sync target at step %u!\n");
+			} else if (!strcmp(field, "S")) {
+				unsigned int nr = 0;
+				while ((field = strtok_r(fstart, ".", &fctx))) {
+					tmp = atoi(field);
+					check_arg(tmp <= 0 && nr == 0,
+						  "Invalid context at step %u!\n",
+						  nr_steps);
+					check_arg(nr > 1,
+						  "Invalid SSEU format at step %u!\n",
+						  nr_steps);
+
+					if (nr == 0)
+						step.context = tmp;
+					else if (nr == 1)
+						step.sseu = tmp;
+
+					nr++;
+				}
+
+				step.type = SSEU;
+				goto add_step;
 			} else if (!strcmp(field, "t")) {
 				int_field(THROTTLE, throttle,
 					  tmp < 0,
@@ -1141,24 +1168,38 @@ find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	return 0;
 }
 
-static void
-set_ctx_sseu(uint32_t ctx)
+static struct drm_i915_gem_context_param_sseu get_device_sseu(void)
 {
-	struct drm_i915_gem_context_param_sseu sseu = { };
 	struct drm_i915_gem_context_param param = { };
 
-	sseu.class = I915_ENGINE_CLASS_RENDER;
-	sseu.instance = 0;
+	if (device_sseu.slice_mask == -1) {
+		param.param = I915_CONTEXT_PARAM_SSEU;
+		param.value = (uintptr_t)&device_sseu;
+
+		gem_context_get_param(fd, &param);
+	}
+
+	return device_sseu;
+}
+
+static uint64_t
+set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
+{
+	struct drm_i915_gem_context_param_sseu sseu = get_device_sseu();
+	struct drm_i915_gem_context_param param = { };
+
+	if (slice_mask == -1)
+		slice_mask = device_sseu.slice_mask;
+
+	sseu.slice_mask = slice_mask;
 
 	param.ctx_id = ctx;
 	param.param = I915_CONTEXT_PARAM_SSEU;
 	param.value = (uintptr_t)&sseu;
 
-	gem_context_get_param(fd, &param);
-
-	sseu.slice_mask = 1;
-
 	gem_context_set_param(fd, &param);
+
+	return slice_mask;
 }
 
 static int
@@ -1352,6 +1393,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		igt_assert(ctx_id);
 		ctx->id = ctx_id;
+		ctx->sseu = device_sseu.slice_mask;
 
 		if (flags & GLOBAL_BALANCE) {
 			ctx->static_vcs = context_vcs_rr;
@@ -1512,8 +1554,10 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			gem_context_set_param(fd, &param);
 		}
 
-		if (wrk->sseu)
-			set_ctx_sseu(arg.ctx_id);
+		if (wrk->sseu) {
+			/* Set to slice 0 only, one slice. */
+			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+		}
 
 		if (share_vm)
 			vm_destroy(fd, share_vm);
@@ -1550,6 +1594,16 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 		}
 	}
 
+	/*
+	 * Scan for SSEU control steps.
+	 */
+	for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) {
+		if (w->type == SSEU) {
+			get_device_sseu();
+			break;
+		}
+	}
+
 	/*
 	 * Allocate batch buffers.
 	 */
@@ -2485,6 +2539,13 @@ static void *run_workload(void *data)
 				   w->type == LOAD_BALANCE ||
 				   w->type == BOND) {
 				continue;
+			} else if (w->type == SSEU) {
+				if (w->sseu != wrk->ctx_list[w->context].sseu) {
+					wrk->ctx_list[w->context].sseu =
+						set_ctx_sseu(wrk->ctx_list[w->context].id,
+							     w->sseu);
+				}
+				continue;
 			}
 
 			if (do_sleep || w->type == PERIOD) {
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README
index 497d5cad2142..9f770217f075 100644
--- a/benchmarks/wsim/README
+++ b/benchmarks/wsim/README
@@ -5,7 +5,7 @@ ctx.engine.duration_us.dependency.wait,...
 <uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
 B.<uint>
 M.<uint>.<str>[|<str>]...
-P|X.<uint>.<int>
+P|S|X.<uint>.<int>
 d|p|s|t|q|a|T.<int>,...
 b.<uint>.<str>[|<str>].<str>
 f
@@ -30,6 +30,7 @@ Additional workload steps are also supported:
  'b' - Set up engine bonds.
  'M' - Set up engine map.
  'P' - Context priority.
+ 'S' - Context SSEU configuration.
  'T' - Terminate an infinite batch.
  'X' - Context preemption control.
 
@@ -254,3 +255,23 @@ then look like:
   1.DEFAULT.1000.f-1.0
   2.DEFAULT.1000.s-1.0
   a.-3
+
+Context SSEU configuration
+--------------------------
+
+  S.1.1
+  1.RCS.1000.0.0
+  S.2.-1
+  2.RCS.1000.0.0
+
+Context 1 is configured to run with one enabled slice (slice mask 1) and a batch
+is submitted against it. Context 2 is configured to run with all slices (this is
+the default so the command could also be omitted) and a batch is submitted against
+it.
+
+This shows the dynamic SSEU reconfiguration cost between two contexts competing
+for the render engine.
+
+Slice mask of -1 has a special meaning of "all slices". Otherwise any integer
+can be specified as the slice mask, but beware any apart from 1 and -1 can make
+the workload not portable between different GPUs.
-- 
2.20.1
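
For readers who want to experiment with the 'S.<ctx>.<mask>' step format outside of gem_wsim, the parsing can be sketched roughly as below. The struct and function names are hypothetical, and the sketch uses strtol() instead of the strtok_r()/atoi() loop in the patch, so it is stricter about trailing garbage than the real parser.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed SSEU step. */
struct sseu_step {
	unsigned int context;
	int slice_mask;
};

/* Parses "S.<ctx>.<mask>" into step; returns false on malformed input. */
static bool parse_sseu_step(const char *str, struct sseu_step *step)
{
	char *end;
	long ctx, mask;

	if (strncmp(str, "S.", 2))
		return false;

	/* First field: context id, which has to be positive. */
	ctx = strtol(str + 2, &end, 10);
	if (end == str + 2 || *end != '.' || ctx <= 0)
		return false;

	/* Second field: slice mask, where -1 stands for "all slices". */
	str = end + 1;
	mask = strtol(str, &end, 10);
	if (end == str || *end != '\0')
		return false;

	step->context = (unsigned int)ctx;
	step->slice_mask = (int)mask;

	return true;
}
```

With this sketch "S.1.1" and "S.2.-1" from the README example parse successfully, while "S.0.1", "S.1" and "S.1.1.1" are rejected.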

* [PATCH i-g-t 10/15] gem_wsim: Allow RCS virtual engine with SSEU control
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

To allow exercising the SSEU configuration in combination with Virtual
Engine, allow RCS to be specified in the engine map and use appropriate
index based addressing when applying SSEU configuration to it.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 51 ++++++++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index ede505e537fd..fe70e7719d88 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -382,7 +382,8 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 		if ((int)engine < 0)
 			return -1;
 
-		if (engine != VCS && engine != VCS1 && engine != VCS2)
+		if (engine != VCS && engine != VCS1 && engine != VCS2 &&
+		    engine != RCS)
 			return -1; /* TODO */
 
 		add = engine == VCS ? 2 : 1;
@@ -1183,7 +1184,7 @@ static struct drm_i915_gem_context_param_sseu get_device_sseu(void)
 }
 
 static uint64_t
-set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
+set_ctx_sseu(struct ctx *ctx, uint64_t slice_mask)
 {
 	struct drm_i915_gem_context_param_sseu sseu = get_device_sseu();
 	struct drm_i915_gem_context_param param = { };
@@ -1191,10 +1192,17 @@ set_ctx_sseu(uint32_t ctx, uint64_t slice_mask)
 	if (slice_mask == -1)
 		slice_mask = device_sseu.slice_mask;
 
+	if (ctx->engine_map && ctx->wants_balance) {
+		sseu.flags = I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX;
+		sseu.engine.engine_class = I915_ENGINE_CLASS_INVALID;
+		sseu.engine.engine_instance = 0;
+	}
+
 	sseu.slice_mask = slice_mask;
 
-	param.ctx_id = ctx;
+	param.ctx_id = ctx->id;
 	param.param = I915_CONTEXT_PARAM_SSEU;
+	param.size = sizeof(sseu);
 	param.value = (uintptr_t)&sseu;
 
 	gem_context_set_param(fd, &param);
@@ -1458,10 +1466,17 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 					ctx->engine_map_count;
 
 				for (j = 0; j < ctx->engine_map_count; j++) {
-					load_balance.engines[j].engine_class =
-						I915_ENGINE_CLASS_VIDEO; /* FIXME */
-					load_balance.engines[j].engine_instance =
-						ctx->engine_map[j] - VCS1; /* FIXME */
+					if (ctx->engine_map[j] == RCS) {
+						load_balance.engines[j].engine_class =
+							I915_ENGINE_CLASS_RENDER;
+						load_balance.engines[j].engine_instance =
+							0; /* FIXME */
+					} else {
+						load_balance.engines[j].engine_class =
+							I915_ENGINE_CLASS_VIDEO; /* FIXME */
+						load_balance.engines[j].engine_instance =
+							ctx->engine_map[j] - VCS1; /* FIXME */
+					}
 				}
 			} else {
 				set_engines.extensions = 0;
@@ -1474,10 +1489,16 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				I915_ENGINE_CLASS_INVALID_NONE;
 
 			for (j = 1; j <= ctx->engine_map_count; j++) {
-				set_engines.engines[j].engine_class =
-					I915_ENGINE_CLASS_VIDEO; /* FIXME */
-				set_engines.engines[j].engine_instance =
-					ctx->engine_map[j - 1] - VCS1; /* FIXME */
+				if (ctx->engine_map[j - 1] == RCS) {
+					set_engines.engines[j].engine_class =
+						I915_ENGINE_CLASS_RENDER;
+					set_engines.engines[j].engine_instance = 0; /* FIXME */
+				} else {
+					set_engines.engines[j].engine_class =
+						I915_ENGINE_CLASS_VIDEO; /* FIXME */
+					set_engines.engines[j].engine_instance =
+						ctx->engine_map[j - 1] - VCS1; /* FIXME */
+				}
 			}
 
 			for (j = 0; j < ctx->bond_count; j++) {
@@ -1556,7 +1577,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 		if (wrk->sseu) {
 			/* Set to slice 0 only, one slice. */
-			ctx->sseu = set_ctx_sseu(ctx_id, 1);
+			ctx->sseu = set_ctx_sseu(ctx, 1);
 		}
 
 		if (share_vm)
@@ -2540,9 +2561,9 @@ static void *run_workload(void *data)
 				   w->type == BOND) {
 				continue;
 			} else if (w->type == SSEU) {
-				if (w->sseu != wrk->ctx_list[w->context].sseu) {
-					wrk->ctx_list[w->context].sseu =
-						set_ctx_sseu(wrk->ctx_list[w->context].id,
+				if (w->sseu != wrk->ctx_list[w->context * 2].sseu) {
+					wrk->ctx_list[w->context * 2].sseu =
+						set_ctx_sseu(&wrk->ctx_list[w->context * 2],
 							     w->sseu);
 				}
 				continue;
-- 
2.20.1
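
The addressing decision made in set_ctx_sseu() can be modelled in isolation as below. All types and constants here are local stand-ins with a MODEL_/model_ prefix, not the i915 uAPI definitions, and their numeric values are assumptions. The idea is only that a context with an engine map and load balancing is addressed by map index (slot 0), while any other context keeps the class/instance pair for the render engine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; the values are assumptions, not taken from the uAPI. */
#define MODEL_CLASS_RENDER	0
#define MODEL_CLASS_INVALID	0xffff
#define MODEL_FLAG_ENGINE_INDEX	(1u << 0)

struct model_sseu_target {
	uint16_t engine_class;
	uint16_t engine_instance;
	uint32_t flags;
};

static struct model_sseu_target
model_sseu_target(bool has_engine_map, bool wants_balance)
{
	/* Default: address the render engine by class and instance. */
	struct model_sseu_target t = {
		.engine_class = MODEL_CLASS_RENDER,
		.engine_instance = 0,
		.flags = 0,
	};

	if (has_engine_map && wants_balance) {
		/* Index based: instance is now a slot in the engine map. */
		t.flags = MODEL_FLAG_ENGINE_INDEX;
		t.engine_class = MODEL_CLASS_INVALID;
		t.engine_instance = 0;
	}

	return t;
}
```

Slot 0 is used because the patch places the virtual engine first in the map it hands to the kernel.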

* [PATCH i-g-t 11/15] gem_wsim: Consolidate engine assignments into helpers
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This will allow applying the discovered engine configuration from a single
place.

v2:
 * Consolidate enum to ci conversion.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 163 ++++++++++++++++++++++++------------------
 1 file changed, 94 insertions(+), 69 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index fe70e7719d88..02c3e9d655d8 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -365,6 +365,66 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static unsigned int num_engines_in_class(enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	return 2;
+}
+
+static void
+fill_engines_class(struct i915_engine_class_instance *ci,
+		   enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	ci[0].engine_class = I915_ENGINE_CLASS_VIDEO;
+	ci[0].engine_instance = 0;
+
+	ci[1].engine_class = I915_ENGINE_CLASS_VIDEO;
+	ci[1].engine_instance = 1;
+}
+
+static void
+fill_engines_id_class(enum intel_engine_id *list,
+		      enum intel_engine_id class)
+{
+	igt_assert(class == VCS);
+
+	list[0] = VCS1;
+	list[1] = VCS2;
+}
+
+static struct i915_engine_class_instance
+get_engine(enum intel_engine_id engine)
+{
+	struct i915_engine_class_instance ci;
+
+	switch (engine) {
+	case RCS:
+		ci.engine_class = I915_ENGINE_CLASS_RENDER;
+		ci.engine_instance = 0;
+		break;
+	case BCS:
+		ci.engine_class = I915_ENGINE_CLASS_COPY;
+		ci.engine_instance = 0;
+		break;
+	case VCS1:
+	case VCS2:
+		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
+		ci.engine_instance = engine - VCS1;
+		break;
+	case VECS:
+		ci.engine_class = I915_ENGINE_CLASS_VIDEO_ENHANCE;
+		ci.engine_instance = 0;
+		break;
+	default:
+		igt_assert(0);
+	};
+
+	return ci;
+}
+
 static int parse_engine_map(struct w_step *step, const char *_str)
 {
 	char *token, *tctx = NULL, *tstart = (char *)_str;
@@ -386,18 +446,16 @@ static int parse_engine_map(struct w_step *step, const char *_str)
 		    engine != RCS)
 			return -1; /* TODO */
 
-		add = engine == VCS ? 2 : 1;
+		add = engine == VCS ? num_engines_in_class(VCS) : 1;
 		step->engine_map_count += add;
 		step->engine_map = realloc(step->engine_map,
 					   step->engine_map_count *
 					   sizeof(step->engine_map[0]));
 
-		if (engine != VCS) {
-			step->engine_map[step->engine_map_count - 1] = engine;
-		} else {
-			step->engine_map[step->engine_map_count - 2] = VCS1;
-			step->engine_map[step->engine_map_count - 1] = VCS2;
-		}
+		if (engine != VCS)
+			step->engine_map[step->engine_map_count - add] = engine;
+		else
+			fill_engines_id_class(&step->engine_map[step->engine_map_count - add], VCS);
 	}
 
 	return 0;
@@ -1148,20 +1206,11 @@ static unsigned int
 find_engine(struct i915_engine_class_instance *ci, unsigned int count,
 	    enum intel_engine_id engine)
 {
-	static struct i915_engine_class_instance map[] = {
-		[RCS] = { I915_ENGINE_CLASS_RENDER, 0 },
-		[BCS] = { I915_ENGINE_CLASS_COPY, 0 },
-		[VCS1] = { I915_ENGINE_CLASS_VIDEO, 0 },
-		[VCS2] = { I915_ENGINE_CLASS_VIDEO, 1 },
-		[VECS] = { I915_ENGINE_CLASS_VIDEO_ENHANCE, 0 },
-	};
+	struct i915_engine_class_instance e = get_engine(engine);
 	unsigned int i;
 
-	igt_assert(engine < ARRAY_SIZE(map));
-	igt_assert(engine == RCS || map[engine].engine_class);
-
 	for (i = 0; i < count; i++, ci++) {
-		if (!memcmp(&map[engine], ci, sizeof(*ci)))
+		if (!memcmp(&e, ci, sizeof(*ci)))
 			return i;
 	}
 
@@ -1465,19 +1514,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				load_balance.num_siblings =
 					ctx->engine_map_count;
 
-				for (j = 0; j < ctx->engine_map_count; j++) {
-					if (ctx->engine_map[j] == RCS) {
-						load_balance.engines[j].engine_class =
-							I915_ENGINE_CLASS_RENDER;
-						load_balance.engines[j].engine_instance =
-							0; /* FIXME */
-					} else {
-						load_balance.engines[j].engine_class =
-							I915_ENGINE_CLASS_VIDEO; /* FIXME */
-						load_balance.engines[j].engine_instance =
-							ctx->engine_map[j] - VCS1; /* FIXME */
-					}
-				}
+				for (j = 0; j < ctx->engine_map_count; j++)
+					load_balance.engines[j] =
+						get_engine(ctx->engine_map[j]);
 			} else {
 				set_engines.extensions = 0;
 			}
@@ -1488,18 +1527,9 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 			set_engines.engines[0].engine_instance =
 				I915_ENGINE_CLASS_INVALID_NONE;
 
-			for (j = 1; j <= ctx->engine_map_count; j++) {
-				if (ctx->engine_map[j - 1] == RCS) {
-					set_engines.engines[j].engine_class =
-						I915_ENGINE_CLASS_RENDER;
-					set_engines.engines[j].engine_instance = 0; /* FIXME */
-				} else {
-					set_engines.engines[j].engine_class =
-						I915_ENGINE_CLASS_VIDEO; /* FIXME */
-					set_engines.engines[j].engine_instance =
-						ctx->engine_map[j - 1] - VCS1; /* FIXME */
-				}
-			}
+			for (j = 1; j <= ctx->engine_map_count; j++)
+				set_engines.engines[j] =
+					get_engine(ctx->engine_map[j - 1]);
 
 			for (j = 0; j < ctx->bond_count; j++) {
 				unsigned long mask = ctx->bonds[j].mask;
@@ -1522,10 +1552,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 				p->base.name = I915_CONTEXT_ENGINES_EXT_BOND;
 				p->virtual_index = 0;
-				p->master.engine_class =
-					I915_ENGINE_CLASS_VIDEO;
-				p->master.engine_instance =
-					ctx->bonds[j].master - VCS1;
+				p->master = get_engine(ctx->bonds[j].master);
 
 				for (b = 0, e = 0; mask; e++, mask >>= 1) {
 					unsigned int idx;
@@ -1543,28 +1570,11 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 			gem_context_set_param(fd, &param);
 		} else if (ctx->wants_balance) {
-			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance, 2) = {
-				.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
-				.num_siblings = 2,
-				.engines = {
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 0 },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 1 },
-				},
-			};
-			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines, 3) = {
-				.extensions = to_user_pointer(&load_balance),
-				.engines = {
-					{ .engine_class = I915_ENGINE_CLASS_INVALID,
-					  .engine_instance = I915_ENGINE_CLASS_INVALID_NONE },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 0 },
-					{ .engine_class = I915_ENGINE_CLASS_VIDEO,
-					  .engine_instance = 1 },
-				},
-			};
-
+			const unsigned int count = num_engines_in_class(VCS);
+			I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(load_balance,
+								 count);
+			I915_DEFINE_CONTEXT_PARAM_ENGINES(set_engines,
+							  count + 1);
 			struct drm_i915_gem_context_param param = {
 				.ctx_id = ctx_id,
 				.param = I915_CONTEXT_PARAM_ENGINES,
@@ -1572,6 +1582,21 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 				.value = to_user_pointer(&set_engines),
 			};
 
+			set_engines.extensions = to_user_pointer(&load_balance);
+
+			set_engines.engines[0].engine_class =
+				I915_ENGINE_CLASS_INVALID;
+			set_engines.engines[0].engine_instance =
+				I915_ENGINE_CLASS_INVALID_NONE;
+			fill_engines_class(&set_engines.engines[1], VCS);
+
+			memset(&load_balance, 0, sizeof(load_balance));
+			load_balance.base.name =
+				I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+			load_balance.num_siblings = count;
+
+			fill_engines_class(&load_balance.engines[0], VCS);
+
 			gem_context_set_param(fd, &param);
 		}
 
-- 
2.20.1
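
As a rough standalone illustration of what the consolidation buys, the sketch below keeps the id to class/instance conversion in one helper and builds the lookup on top of it. The enums and struct are local models rather than the gem_wsim or i915 definitions, the enum ordering is an assumption (VCS1 and VCS2 only need to be consecutive), and returning -1 on a miss is a choice made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local models of the engine ids and classes; values are assumptions. */
enum engine_id { RCS, BCS, VCS1, VCS2, VECS };
enum engine_class { CLASS_RENDER, CLASS_COPY, CLASS_VIDEO, CLASS_VIDEO_ENHANCE };

struct class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Single place which knows how an engine id maps to class and instance. */
static struct class_instance get_engine(enum engine_id engine)
{
	struct class_instance ci = { 0, 0 };

	switch (engine) {
	case RCS:
		ci.engine_class = CLASS_RENDER;
		break;
	case BCS:
		ci.engine_class = CLASS_COPY;
		break;
	case VCS1:
	case VCS2:
		ci.engine_class = CLASS_VIDEO;
		ci.engine_instance = (uint16_t)(engine - VCS1);
		break;
	case VECS:
		ci.engine_class = CLASS_VIDEO_ENHANCE;
		break;
	default:
		abort(); /* Unknown engine id. */
	}

	return ci;
}

/* Returns the index of engine in list, or -1 if it is not present. */
static int find_engine(const struct class_instance *list, size_t count,
		       enum engine_id engine)
{
	struct class_instance e = get_engine(engine);
	size_t i;

	for (i = 0; i < count; i++) {
		if (list[i].engine_class == e.engine_class &&
		    list[i].engine_instance == e.engine_instance)
			return (int)i;
	}

	return -1;
}

/* Example map: both video engines followed by the render engine. */
static const struct class_instance example_map[] = {
	{ CLASS_VIDEO, 0 }, { CLASS_VIDEO, 1 }, { CLASS_RENDER, 0 },
};
```

Any caller which needs a class/instance pair goes through get_engine(), so a later change to the mapping has to be made in one function only.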

* [PATCH i-g-t 12/15] gem_wsim: Discover engines
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Instead of hardcoding the VCS balancing engines, discover them, either
with the new engines query or, in the fallback case, with the legacy
get_param, so that class based addressing always works.

v2:
 * Simplify has_engine_query check. (Andi)
 * Fix assert on uninitialized variable. (Andi)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
---
 benchmarks/gem_wsim.c | 173 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 166 insertions(+), 7 deletions(-)
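
For readers skimming the thread, the class-based filtering that this patch
applies to the discovered engine list can be sketched in isolation. This is
a minimal illustration only: the sketch_* names and the enum values are
hypothetical stand-ins so it builds without the i915 uAPI headers, and it is
not the code from the patch.

```c
#include <assert.h>

/* Stand-ins for the uAPI engine classes (assumed values, illustration only). */
enum sketch_class { SK_RENDER, SK_COPY, SK_VIDEO, SK_VIDEO_ENHANCE };

/* Stand-in for struct i915_engine_class_instance. */
struct sketch_engine {
	unsigned int engine_class;
	unsigned int engine_instance;
};

/* Count the engines of one class in an already discovered list. */
static unsigned int
sketch_count_class(const struct sketch_engine *engines, unsigned int num,
		   unsigned int class)
{
	unsigned int i, count = 0;

	for (i = 0; i < num; i++) {
		if (engines[i].engine_class == class)
			count++;
	}

	return count;
}

/*
 * Copy the engines of one class into out[], preserving discovery order.
 * Returns the number of entries written, never more than out_len.
 */
static unsigned int
sketch_fill_class(const struct sketch_engine *engines, unsigned int num,
		  unsigned int class, struct sketch_engine *out,
		  unsigned int out_len)
{
	unsigned int i, j = 0;

	assert(out);

	for (i = 0; i < num && j < out_len; i++) {
		if (engines[i].engine_class != class)
			continue;

		out[j++] = engines[i];
	}

	return j;
}
```

The count is what sizes the load balance extension and the engine map, and
the fill step populates both from the same filtered view of the list.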

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 02c3e9d655d8..64d19ed469b0 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -365,34 +365,191 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static bool __engines_queried;
+static unsigned int __num_engines;
+static struct i915_engine_class_instance *__engines;
+
+static int
+__i915_query(int i915, struct drm_i915_query *q)
+{
+	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
+		return -errno;
+	return 0;
+}
+
+static int
+__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	struct drm_i915_query q = {
+		.num_items = n_items,
+		.items_ptr = to_user_pointer(items),
+	};
+	return __i915_query(i915, &q);
+}
+
+static void
+i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
+}
+
+static bool has_engine_query(int i915)
+{
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_INFO,
+	};
+
+	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
+}
+
+static void query_engines(void)
+{
+	struct i915_engine_class_instance *engines;
+	unsigned int num;
+
+	if (__engines_queried)
+		return;
+
+	__engines_queried = true;
+
+	if (!has_engine_query(fd)) {
+		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
+		unsigned int i = 0;
+
+		igt_assert(num_bsd);
+
+		num = 1 + num_bsd;
+
+		if (gem_has_blt(fd))
+			num++;
+
+		if (gem_has_vebox(fd))
+			num++;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		engines[i].engine_class = I915_ENGINE_CLASS_RENDER;
+		engines[i].engine_instance = 0;
+		i++;
+
+		if (gem_has_blt(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_COPY;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd2(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 1;
+			i++;
+		}
+
+		if (gem_has_vebox(fd)) {
+			engines[i].engine_class =
+				I915_ENGINE_CLASS_VIDEO_ENHANCE;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+	} else {
+		struct drm_i915_query_engine_info *engine_info;
+		struct drm_i915_query_item item = {
+			.query_id = DRM_I915_QUERY_ENGINE_INFO,
+		};
+		const unsigned int sz = 4096;
+		unsigned int i;
+
+		engine_info = malloc(sz);
+		igt_assert(engine_info);
+		memset(engine_info, 0, sz);
+
+		item.data_ptr = to_user_pointer(engine_info);
+		item.length = sz;
+
+		i915_query_items(fd, &item, 1);
+		igt_assert(item.length > 0);
+		igt_assert(item.length <= sz);
+
+		num = engine_info->num_engines;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		for (i = 0; i < num; i++) {
+			struct drm_i915_engine_info *engine =
+				(struct drm_i915_engine_info *)&engine_info->engines[i];
+
+			engines[i] = engine->engine;
+		}
+	}
+
+	__engines = engines;
+	__num_engines = num;
+}
+
 static unsigned int num_engines_in_class(enum intel_engine_id class)
 {
+	unsigned int i, count = 0;
+
 	igt_assert(class == VCS);
 
-	return 2;
+	query_engines();
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class == I915_ENGINE_CLASS_VIDEO)
+			count++;
+	}
+
+	igt_assert(count);
+	return count;
 }
 
 static void
 fill_engines_class(struct i915_engine_class_instance *ci,
 		   enum intel_engine_id class)
 {
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
 
-	ci[0].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[0].engine_instance = 0;
+	query_engines();
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
 
-	ci[1].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[1].engine_instance = 1;
+		ci[j].engine_class = __engines[i].engine_class;
+		ci[j].engine_instance = __engines[i].engine_instance;
+		j++;
+	}
 }
 
 static void
 fill_engines_id_class(enum intel_engine_id *list,
 		      enum intel_engine_id class)
 {
+	enum intel_engine_id engine = VCS1;
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
+	igt_assert(num_engines_in_class(VCS) <= 2);
+
+	query_engines();
 
-	list[0] = VCS1;
-	list[1] = VCS2;
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		list[j++] = engine++;
+	}
 }
 
 static struct i915_engine_class_instance
@@ -400,6 +557,8 @@ get_engine(enum intel_engine_id engine)
 {
 	struct i915_engine_class_instance ci;
 
+	query_engines();
+
 	switch (engine) {
 	case RCS:
 		ci.engine_class = I915_ENGINE_CLASS_RENDER;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [igt-dev] [PATCH i-g-t 12/15] gem_wsim: Discover engines
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Instead of hardcoding the VCS balancing engines, discover them, either
with the new engines query or, in the fallback case, with the legacy
get_param, so that class based addressing always works.

v2:
 * Simplify has_engine_query check. (Andi)
 * Fix assert on uninitialized variable. (Andi)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
---
 benchmarks/gem_wsim.c | 173 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 166 insertions(+), 7 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 02c3e9d655d8..64d19ed469b0 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -365,34 +365,191 @@ static int str_to_engine(const char *str)
 	return -1;
 }
 
+static bool __engines_queried;
+static unsigned int __num_engines;
+static struct i915_engine_class_instance *__engines;
+
+static int
+__i915_query(int i915, struct drm_i915_query *q)
+{
+	if (igt_ioctl(i915, DRM_IOCTL_I915_QUERY, q))
+		return -errno;
+	return 0;
+}
+
+static int
+__i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	struct drm_i915_query q = {
+		.num_items = n_items,
+		.items_ptr = to_user_pointer(items),
+	};
+	return __i915_query(i915, &q);
+}
+
+static void
+i915_query_items(int i915, struct drm_i915_query_item *items, uint32_t n_items)
+{
+	igt_assert_eq(__i915_query_items(i915, items, n_items), 0);
+}
+
+static bool has_engine_query(int i915)
+{
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_INFO,
+	};
+
+	return __i915_query_items(i915, &item, 1) == 0 && item.length > 0;
+}
+
+static void query_engines(void)
+{
+	struct i915_engine_class_instance *engines;
+	unsigned int num;
+
+	if (__engines_queried)
+		return;
+
+	__engines_queried = true;
+
+	if (!has_engine_query(fd)) {
+		unsigned int num_bsd = gem_has_bsd(fd) + gem_has_bsd2(fd);
+		unsigned int i = 0;
+
+		igt_assert(num_bsd);
+
+		num = 1 + num_bsd;
+
+		if (gem_has_blt(fd))
+			num++;
+
+		if (gem_has_vebox(fd))
+			num++;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		engines[i].engine_class = I915_ENGINE_CLASS_RENDER;
+		engines[i].engine_instance = 0;
+		i++;
+
+		if (gem_has_blt(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_COPY;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+
+		if (gem_has_bsd2(fd)) {
+			engines[i].engine_class = I915_ENGINE_CLASS_VIDEO;
+			engines[i].engine_instance = 1;
+			i++;
+		}
+
+		if (gem_has_vebox(fd)) {
+			engines[i].engine_class =
+				I915_ENGINE_CLASS_VIDEO_ENHANCE;
+			engines[i].engine_instance = 0;
+			i++;
+		}
+	} else {
+		struct drm_i915_query_engine_info *engine_info;
+		struct drm_i915_query_item item = {
+			.query_id = DRM_I915_QUERY_ENGINE_INFO,
+		};
+		const unsigned int sz = 4096;
+		unsigned int i;
+
+		engine_info = malloc(sz);
+		igt_assert(engine_info);
+		memset(engine_info, 0, sz);
+
+		item.data_ptr = to_user_pointer(engine_info);
+		item.length = sz;
+
+		i915_query_items(fd, &item, 1);
+		igt_assert(item.length > 0);
+		igt_assert(item.length <= sz);
+
+		num = engine_info->num_engines;
+
+		engines = calloc(num,
+				 sizeof(struct i915_engine_class_instance));
+		igt_assert(engines);
+
+		for (i = 0; i < num; i++) {
+			struct drm_i915_engine_info *engine =
+				(struct drm_i915_engine_info *)&engine_info->engines[i];
+
+			engines[i] = engine->engine;
+		}
+	}
+
+	__engines = engines;
+	__num_engines = num;
+}
+
 static unsigned int num_engines_in_class(enum intel_engine_id class)
 {
+	unsigned int i, count = 0;
+
 	igt_assert(class == VCS);
 
-	return 2;
+	query_engines();
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class == I915_ENGINE_CLASS_VIDEO)
+			count++;
+	}
+
+	igt_assert(count);
+	return count;
 }
 
 static void
 fill_engines_class(struct i915_engine_class_instance *ci,
 		   enum intel_engine_id class)
 {
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
 
-	ci[0].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[0].engine_instance = 0;
+	query_engines();
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
 
-	ci[1].engine_class = I915_ENGINE_CLASS_VIDEO;
-	ci[1].engine_instance = 1;
+		ci[j].engine_class = __engines[i].engine_class;
+		ci[j].engine_instance = __engines[i].engine_instance;
+		j++;
+	}
 }
 
 static void
 fill_engines_id_class(enum intel_engine_id *list,
 		      enum intel_engine_id class)
 {
+	enum intel_engine_id engine = VCS1;
+	unsigned int i, j = 0;
+
 	igt_assert(class == VCS);
+	igt_assert(num_engines_in_class(VCS) <= 2);
+
+	query_engines();
 
-	list[0] = VCS1;
-	list[1] = VCS2;
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		list[j++] = engine++;
+	}
 }
 
 static struct i915_engine_class_instance
@@ -400,6 +557,8 @@ get_engine(enum intel_engine_id engine)
 {
 	struct i915_engine_class_instance ci;
 
+	query_engines();
+
 	switch (engine) {
 	case RCS:
 		ci.engine_class = I915_ENGINE_CLASS_RENDER;
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH i-g-t 13/15] gem_wsim: Support Icelake parts
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

On Icelake the second VCS engine is vcs2 instead of vcs1, so add logical
to physical instance remapping based on engine discovery to support it.

v2:
 * Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
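
The logical to physical remapping described in the commit message can be
shown on a toy engine list. This is a sketch under stated assumptions: the
sketch_* names and SK_CLASS_VIDEO are hypothetical stand-ins for the uAPI
types, and unlike the patch (which asserts) it returns -1 when the logical
index does not exist.

```c
#include <assert.h>

#define SK_CLASS_VIDEO 2 /* stand-in for I915_ENGINE_CLASS_VIDEO */

/* Stand-in for struct i915_engine_class_instance. */
struct sketch_engine {
	unsigned int engine_class;
	unsigned int engine_instance;
};

/*
 * Return the physical instance of the logical-th engine of the given
 * class, counting in discovery order, or -1 if there is no such engine.
 */
static int
sketch_find_physical(const struct sketch_engine *engines, unsigned int num,
		     unsigned int class, unsigned int logical)
{
	unsigned int i, j = 0;

	assert(engines);

	for (i = 0; i < num; i++) {
		if (engines[i].engine_class != class)
			continue;

		/* The j-th match is logical instance j. */
		if (logical == j++)
			return (int)engines[i].engine_instance;
	}

	return -1;
}
```

With a discovered list containing video instances 0 and 2, logical index 1
resolves to physical instance 2, which is the Icelake case the patch
addresses.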

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 64d19ed469b0..0ccb271575f7 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -552,6 +552,26 @@ fill_engines_id_class(enum intel_engine_id *list,
 	}
 }
 
+static unsigned int
+find_physical_instance(enum intel_engine_id class, unsigned int logical)
+{
+	unsigned int i, j = 0;
+
+	igt_assert(class == VCS);
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		/* Map logical to physical instances. */
+		if (logical == j++)
+			return __engines[i].engine_instance;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
 static struct i915_engine_class_instance
 get_engine(enum intel_engine_id engine)
 {
@@ -571,7 +591,7 @@ get_engine(enum intel_engine_id engine)
 	case VCS1:
 	case VCS2:
 		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
-		ci.engine_instance = engine - VCS1;
+		ci.engine_instance = find_physical_instance(VCS, engine - VCS1);
 		break;
 	case VECS:
 		ci.engine_class = I915_ENGINE_CLASS_VIDEO_ENHANCE;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [igt-dev] [PATCH i-g-t 13/15] gem_wsim: Support Icelake parts
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

On Icelake the second VCS engine is vcs2 instead of vcs1, so add logical
to physical instance remapping based on engine discovery to support it.

v2:
 * Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 benchmarks/gem_wsim.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 64d19ed469b0..0ccb271575f7 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -552,6 +552,26 @@ fill_engines_id_class(enum intel_engine_id *list,
 	}
 }
 
+static unsigned int
+find_physical_instance(enum intel_engine_id class, unsigned int logical)
+{
+	unsigned int i, j = 0;
+
+	igt_assert(class == VCS);
+
+	for (i = 0; i < __num_engines; i++) {
+		if (__engines[i].engine_class != I915_ENGINE_CLASS_VIDEO)
+			continue;
+
+		/* Map logical to physical instances. */
+		if (logical == j++)
+			return __engines[i].engine_instance;
+	}
+
+	igt_assert(0);
+	return 0;
+}
+
 static struct i915_engine_class_instance
 get_engine(enum intel_engine_id engine)
 {
@@ -571,7 +591,7 @@ get_engine(enum intel_engine_id engine)
 	case VCS1:
 	case VCS2:
 		ci.engine_class = I915_ENGINE_CLASS_VIDEO;
-		ci.engine_instance = engine - VCS1;
+		ci.engine_instance = find_physical_instance(VCS, engine - VCS1);
 		break;
 	case VECS:
 		ci.engine_class = I915_ENGINE_CLASS_VIDEO_ENHANCE;
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH i-g-t 14/15] gem_wsim: Fix prng usage
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Back when gem_wsim used forking it was safe to use the common storage
prng, but after converting to threads it no longer is.

Fix by storing and using a new per workload seed for batch buffer
duration randomness.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
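
The idea of keeping generator state per workload, rather than in storage
shared between threads, can be sketched as follows. Note the assumptions:
sketch_random() is a plain xorshift32 used as a stand-in and is not the IGT
hars_petruska_f54_1_random() helper, and struct sketch_workload is a
hypothetical reduction of struct workload to the one field that matters
here.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in generator (xorshift32); the state must be non-zero. */
static uint32_t sketch_random(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;

	return *state = x;
}

/* Each workload owns its generator state, so threads never share it. */
struct sketch_workload {
	uint32_t bb_prng;
};

/* Pick a duration in [min, max] using only the workload's own state. */
static unsigned int
sketch_get_duration(struct sketch_workload *wrk,
		    unsigned int min, unsigned int max)
{
	assert(min <= max);

	if (min == max)
		return min;

	return min + sketch_random(&wrk->bb_prng) % (max + 1 - min);
}
```

Two workloads seeded with the same value produce the same duration sequence
regardless of thread scheduling, which is what the SYNCEDCLIENTS case in the
patch relies on when it hands every workload the same master seed.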

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 0ccb271575f7..c43bbbc8c94d 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -193,6 +193,7 @@ struct workload
 	unsigned int flags;
 	bool print_stats;
 
+	uint32_t bb_prng;
 	uint32_t prng;
 
 	struct timespec repeat_start;
@@ -240,6 +241,8 @@ struct workload
 static const unsigned int nop_calibration_us = 1000;
 static unsigned long nop_calibration;
 
+static unsigned int master_prng;
+
 static unsigned int context_vcs_rr;
 
 static int verbose = 1;
@@ -1067,14 +1070,14 @@ clone_workload(struct workload *_wrk)
 #define PAGE_SIZE (4096)
 #endif
 
-static unsigned int get_duration(struct w_step *w)
+static unsigned int get_duration(struct workload *wrk, struct w_step *w)
 {
 	struct duration *dur = &w->duration;
 
 	if (dur->min == dur->max)
 		return dur->min;
 	else
-		return dur->min + hars_petruska_f54_1_random_unsafe() %
+		return dur->min + hars_petruska_f54_1_random(&wrk->bb_prng) %
 		       (dur->max + 1 - dur->min);
 }
 
@@ -1448,6 +1451,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 	wrk->id = id;
 	wrk->prng = rand();
+	wrk->bb_prng = (wrk->flags & SYNCEDCLIENTS) ? master_prng : rand();
 	wrk->run = true;
 
 	ctx_vcs =  0;
@@ -2607,7 +2611,7 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	w->eb.batch_start_offset =
 		w->unbound_duration ?
 		0 :
-		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
+		ALIGN(w->bb_sz - get_bb_sz(get_duration(wrk, w)),
 		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
@@ -2676,9 +2680,6 @@ static void *run_workload(void *data)
 
 	clock_gettime(CLOCK_MONOTONIC, &t_start);
 
-	hars_petruska_f54_1_random_seed((wrk->flags & SYNCEDCLIENTS) ?
-					0 : wrk->id);
-
 	init_status_page(wrk, INIT_ALL);
 	for (count = 0; wrk->run && (wrk->background || count < wrk->repeat);
 	     count++) {
@@ -3117,6 +3118,10 @@ int main(int argc, char **argv)
 
 	init_clocks();
 
+	master_prng = time(NULL);
+	srand(master_prng);
+	master_prng = rand();
+
 	while ((c = getopt(argc, argv,
 			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
 		switch (c) {
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [igt-dev] [PATCH i-g-t 14/15] gem_wsim: Fix prng usage
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Back when gem_wsim used forking it was safe to use the common storage
prng, but after converting to threads it no longer is.

Fix by storing and using a new per workload seed for batch buffer
duration randomness.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 benchmarks/gem_wsim.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index 0ccb271575f7..c43bbbc8c94d 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -193,6 +193,7 @@ struct workload
 	unsigned int flags;
 	bool print_stats;
 
+	uint32_t bb_prng;
 	uint32_t prng;
 
 	struct timespec repeat_start;
@@ -240,6 +241,8 @@ struct workload
 static const unsigned int nop_calibration_us = 1000;
 static unsigned long nop_calibration;
 
+static unsigned int master_prng;
+
 static unsigned int context_vcs_rr;
 
 static int verbose = 1;
@@ -1067,14 +1070,14 @@ clone_workload(struct workload *_wrk)
 #define PAGE_SIZE (4096)
 #endif
 
-static unsigned int get_duration(struct w_step *w)
+static unsigned int get_duration(struct workload *wrk, struct w_step *w)
 {
 	struct duration *dur = &w->duration;
 
 	if (dur->min == dur->max)
 		return dur->min;
 	else
-		return dur->min + hars_petruska_f54_1_random_unsafe() %
+		return dur->min + hars_petruska_f54_1_random(&wrk->bb_prng) %
 		       (dur->max + 1 - dur->min);
 }
 
@@ -1448,6 +1451,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags)
 
 	wrk->id = id;
 	wrk->prng = rand();
+	wrk->bb_prng = (wrk->flags & SYNCEDCLIENTS) ? master_prng : rand();
 	wrk->run = true;
 
 	ctx_vcs =  0;
@@ -2607,7 +2611,7 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine,
 	w->eb.batch_start_offset =
 		w->unbound_duration ?
 		0 :
-		ALIGN(w->bb_sz - get_bb_sz(get_duration(w)),
+		ALIGN(w->bb_sz - get_bb_sz(get_duration(wrk, w)),
 		      2 * sizeof(uint32_t));
 
 	for (i = 0; i < w->fence_deps.nr; i++) {
@@ -2676,9 +2680,6 @@ static void *run_workload(void *data)
 
 	clock_gettime(CLOCK_MONOTONIC, &t_start);
 
-	hars_petruska_f54_1_random_seed((wrk->flags & SYNCEDCLIENTS) ?
-					0 : wrk->id);
-
 	init_status_page(wrk, INIT_ALL);
 	for (count = 0; wrk->run && (wrk->background || count < wrk->repeat);
 	     count++) {
@@ -3117,6 +3118,10 @@ int main(int argc, char **argv)
 
 	init_clocks();
 
+	master_prng = time(NULL);
+	srand(master_prng);
+	master_prng = rand();
+
 	while ((c = getopt(argc, argv,
 			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
 		switch (c) {
-- 
2.20.1

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH i-g-t 15/15] gem_wsim: Allow random seed control
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Simon Ser, Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

New command line option to allow controlling the initial pseudo-random
generator seed in order to allow repeatable runs.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Simon Ser <simon.ser@intel.com>
---
 benchmarks/gem_wsim.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)
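
As an aside, a stricter way to parse a seed argument could look like the
sketch below. This is not what the patch does (the patch calls strtol()
without checking the result); sketch_parse_seed() is a hypothetical helper
shown only to illustrate how malformed input could be rejected.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a seed in decimal, octal or hex notation (base 0).
 * Returns 0 and stores the value on success, -1 on empty, malformed or
 * out of range input.
 */
static int sketch_parse_seed(const char *arg, unsigned int *seed)
{
	unsigned long val;
	char *end;

	assert(seed);

	if (!arg || *arg == '\0')
		return -1;

	errno = 0;
	val = strtoul(arg, &end, 0);
	if (errno || *end != '\0' || val > UINT_MAX)
		return -1;

	*seed = (unsigned int)val;
	return 0;
}
```

Running with the same accepted seed twice is then expected to give the same
srand() starting point, which is the repeatability the new option is after.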

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index c43bbbc8c94d..e2ffb93a94d5 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -2946,6 +2946,7 @@ static void print_help(void)
 "  -t <n>          Nop calibration tolerance percentage.\n"
 "                  Use when there is a difficulty obtaining calibration with the\n"
 "                  default settings.\n"
+"  -I <n>          Initial randomness seed.\n"
 "  -p <n>          Context priority to use for the following workload on the\n"
 "                  command line.\n"
 "  -w <desc|path>  Filename or a workload descriptor.\n"
@@ -3119,11 +3120,9 @@ int main(int argc, char **argv)
 	init_clocks();
 
 	master_prng = time(NULL);
-	srand(master_prng);
-	master_prng = rand();
 
 	while ((c = getopt(argc, argv,
-			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
+			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:I:")) != -1) {
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
@@ -3210,6 +3209,9 @@ int main(int argc, char **argv)
 				return 1;
 			}
 			break;
+		case 'I':
+			master_prng = strtol(optarg, NULL, 0);
+			break;
 		case 'h':
 			print_help();
 			return 0;
@@ -3294,6 +3296,7 @@ int main(int argc, char **argv)
 		clients = nr_w_args;
 
 	if (verbose > 1) {
+		printf("Random seed is %u.\n", master_prng);
 		printf("Using %lu nop calibration for %uus delay.\n",
 		       nop_calibration, nop_calibration_us);
 		printf("%u client%s.\n", clients, clients > 1 ? "s" : "");
@@ -3312,6 +3315,9 @@ int main(int argc, char **argv)
 		}
 	}
 
+	srand(master_prng);
+	master_prng = rand();
+
 	if (master_workload >= 0 && clients == 1)
 		master_workload = -1;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [Intel-gfx] [PATCH i-g-t 15/15] gem_wsim: Allow random seed control
@ 2019-05-22 15:57   ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-22 15:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Simon Ser, Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

New command line option to allow controlling the initial pseudo-random
generator seed in order to allow repeatable runs.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Simon Ser <simon.ser@intel.com>
---
 benchmarks/gem_wsim.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c
index c43bbbc8c94d..e2ffb93a94d5 100644
--- a/benchmarks/gem_wsim.c
+++ b/benchmarks/gem_wsim.c
@@ -2946,6 +2946,7 @@ static void print_help(void)
 "  -t <n>          Nop calibration tolerance percentage.\n"
 "                  Use when there is a difficulty obtaining calibration with the\n"
 "                  default settings.\n"
+"  -I <n>          Initial randomness seed.\n"
 "  -p <n>          Context priority to use for the following workload on the\n"
 "                  command line.\n"
 "  -w <desc|path>  Filename or a workload descriptor.\n"
@@ -3119,11 +3120,9 @@ int main(int argc, char **argv)
 	init_clocks();
 
 	master_prng = time(NULL);
-	srand(master_prng);
-	master_prng = rand();
 
 	while ((c = getopt(argc, argv,
-			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:")) != -1) {
+			   "hqv2RsSHxGdc:n:r:w:W:a:t:b:p:I:")) != -1) {
 		switch (c) {
 		case 'W':
 			if (master_workload >= 0) {
@@ -3210,6 +3209,9 @@ int main(int argc, char **argv)
 				return 1;
 			}
 			break;
+		case 'I':
+			master_prng = strtol(optarg, NULL, 0);
+			break;
 		case 'h':
 			print_help();
 			return 0;
@@ -3294,6 +3296,7 @@ int main(int argc, char **argv)
 		clients = nr_w_args;
 
 	if (verbose > 1) {
+		printf("Random seed is %u.\n", master_prng);
 		printf("Using %lu nop calibration for %uus delay.\n",
 		       nop_calibration, nop_calibration_us);
 		printf("%u client%s.\n", clients, clients > 1 ? "s" : "");
@@ -3312,6 +3315,9 @@ int main(int argc, char **argv)
 		}
 	}
 
+	srand(master_prng);
+	master_prng = rand();
+
 	if (master_workload >= 0 && clients == 1)
 		master_workload = -1;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 14/15] gem_wsim: Fix prng usage
  2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-22 16:51     ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-22 16:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-22 16:57:19)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Back when gem_wsim used forking it was safe to use the common storage
> prng, but after converting to threads it no longer is.
> 
> Fix by storing and using a new per workload seed for batch buffer
> duration randomness.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

But I suggest squashing this with the following patch as this introduces
a variable random seed; and the next patch allows it to be
pre-determined again.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 14/15] gem_wsim: Fix prng usage
@ 2019-05-22 16:51     ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-22 16:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-22 16:57:19)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Back when gem_wsim used forking it was safe to use the common storage
> prng, but after converting to threads it no longer is.
> 
> Fix by storing and using a new per workload seed for batch buffer
> duration randomness.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

But I suggest squashing this with the following patch as this introduces
a variable random seed; and the next patch allows it to be
pre-determined again.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH i-g-t 15/15] gem_wsim: Allow random seed control
  2019-05-22 15:57   ` [Intel-gfx] " Tvrtko Ursulin
@ 2019-05-22 16:52     ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-22 16:52 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Simon Ser, Intel-gfx

Quoting Tvrtko Ursulin (2019-05-22 16:57:20)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> New command line option to allow controlling the initial pseudo-random
> generator seed in order to allow repeatable runs.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Suggested-by: Simon Ser <simon.ser@intel.com>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

And squash maybe?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [igt-dev] [PATCH i-g-t 15/15] gem_wsim: Allow random seed control
@ 2019-05-22 16:52     ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-22 16:52 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-22 16:57:20)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> New command line option to allow controlling the initial pseudo-random
> generator seed in order to allow repeatable runs.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Suggested-by: Simon Ser <simon.ser@intel.com>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

And squash maybe?
-Chris

* [igt-dev] ✓ Fi.CI.BAT: success for Remaining bits of Virtual Engine tooling
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
                   ` (15 preceding siblings ...)
  (?)
@ 2019-05-22 17:21 ` Patchwork
  -1 siblings, 0 replies; 52+ messages in thread
From: Patchwork @ 2019-05-22 17:21 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Remaining bits of Virtual Engine tooling
URL   : https://patchwork.freedesktop.org/series/60992/
State : success

== Summary ==

CI Bug Log - changes from IGT_5005 -> IGTPW_3030
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/60992/revisions/1/mbox/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_3030:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@runner@aborted:
    - {fi-icl-y}:         NOTRUN -> [FAIL][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/fi-icl-y/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in IGTPW_3030 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       [PASS][2] -> [DMESG-WARN][3] ([fdo#102614])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s4-devices:
    - {fi-icl-u3}:        [DMESG-WARN][4] -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/fi-icl-u3/igt@gem_exec_suspend@basic-s4-devices.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/fi-icl-u3/igt@gem_exec_suspend@basic-s4-devices.html

  * igt@i915_selftest@live_hangcheck:
    - fi-skl-iommu:       [INCOMPLETE][6] ([fdo#108602] / [fdo#108744]) -> [PASS][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/fi-skl-iommu/igt@i915_selftest@live_hangcheck.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/fi-skl-iommu/igt@i915_selftest@live_hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102614]: https://bugs.freedesktop.org/show_bug.cgi?id=102614
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#108602]: https://bugs.freedesktop.org/show_bug.cgi?id=108602
  [fdo#108744]: https://bugs.freedesktop.org/show_bug.cgi?id=108744


Participating hosts (53 -> 46)
------------------------------

  Missing    (7): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * IGT: IGT_5005 -> IGTPW_3030

  CI_DRM_6121: 0a029524f22ca287ec7e515edc1258e7f806750c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_3030: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/
  IGT_5005: adf9f435a795d692e30cd6eafe26eddf4993c8ff @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/

* Re: [PATCH i-g-t 01/15] gem_wsim: Engine map support
  2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-23 13:23     ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 13:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-22 16:57:06)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Support new i915 uAPI for configuring contexts with engine maps.
> 
> Please refer to the README file for more detailed explanation.
> 
> v2:
>  * Allow defining engine maps by class.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Keeping the TODO in mind, this does what it says on the tin... I am just
left wanting more.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [Intel-gfx] [PATCH i-g-t 01/15] gem_wsim: Engine map support
@ 2019-05-23 13:23     ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 13:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-22 16:57:06)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Support new i915 uAPI for configuring contexts with engine maps.
> 
> Please refer to the README file for more detailed explanation.
> 
> v2:
>  * Allow defining engine maps by class.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Keeping the TODO in mind, this does what it says on the tin... I am just
left wanting more.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [PATCH i-g-t 04/15] gem_wsim: Engine map load balance command
  2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-23 13:25     ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 13:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-22 16:57:09)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A new workload command for enabling a load balanced context map (aka
> Virtual Engine). Example usage:
> 
>   B.1
> 
> This turns on load balancing for context one, assuming it has already been
> configured with an engine map. Only the DEFAULT engine specifier can be used
> with load balanced engine maps.
> 
> v2:
>  * Lift restriction to only use load balancer when enabled in context map.
>    (Chris)

You didn't fancy going all out and say:
B.1.DEFAULT.VCS1|VCS2
?

If you are happy with the current code, it looks to do what you want,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
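
To make the command syntax under discussion concrete, here is a minimal C
sketch of parsing the short form "B.<ctx>". It is an illustration only, not
the parser from the patch: parse_balance_cmd() is a hypothetical name, and
the rule that context numbers start at 1 is an assumption. The longer form
suggested above (B.1.DEFAULT.VCS1|VCS2) is deliberately rejected, since the
patch as posted takes only the context.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical parser for the load balance command "B.<ctx>".
 * Returns the context number, or -1 if the string is malformed.
 * Assumption: valid context numbers start at 1.
 */
int parse_balance_cmd(const char *cmd)
{
	char *end;
	long ctx;

	assert(cmd);

	/* The command must start with the literal "B." prefix. */
	if (cmd[0] != 'B' || cmd[1] != '.')
		return -1;

	/* Require a digit so that strtol cannot skip whitespace or a sign. */
	if (!isdigit((unsigned char)cmd[2]))
		return -1;

	ctx = strtol(cmd + 2, &end, 10);
	if (*end != '\0' || ctx < 1 || ctx > INT_MAX)
		return -1;

	return (int)ctx;
}
```

With this sketch "B.1" yields context 1, while "B." or "B.x" are reported
as errors.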

* Re: [igt-dev] [PATCH i-g-t 04/15] gem_wsim: Engine map load balance command
@ 2019-05-23 13:25     ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 13:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-22 16:57:09)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> A new workload command for enabling a load balanced context map (aka
> Virtual Engine). Example usage:
> 
>   B.1
> 
> This turns on load balancing for context one, assuming it has already been
> configured with an engine map. Only the DEFAULT engine specifier can be used
> with load balanced engine maps.
> 
> v2:
>  * Lift restriction to only use load balancer when enabled in context map.
>    (Chris)

You didn't fancy going all out and say:
B.1.DEFAULT.VCS1|VCS2
?

If you are happy with the current code, it looks to do what you want,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [PATCH i-g-t 11/15] gem_wsim: Consolidate engine assignments into helpers
  2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-23 13:26     ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 13:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-22 16:57:16)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> This will allow applying the discovered engine configuration from a single
> place.
> 
> v2:
>  * Consolidate enum to ci conversion.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [PATCH i-g-t 11/15] gem_wsim: Consolidate engine assignments into helpers
@ 2019-05-23 13:26     ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 13:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-22 16:57:16)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> This will allow applying the discovered engine configuration from a single
> place.
> 
> v2:
>  * Consolidate enum to ci conversion.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [igt-dev] [PATCH i-g-t 04/15] gem_wsim: Engine map load balance command
  2019-05-23 13:25     ` Chris Wilson
@ 2019-05-23 13:58       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-23 13:58 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 23/05/2019 14:25, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-22 16:57:09)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A new workload command for enabling a load balanced context map (aka
>> Virtual Engine). Example usage:
>>
>>    B.1
>>
>> This turns on load balancing for context one, assuming it has already been
>> configured with an engine map. Only the DEFAULT engine specifier can be used
>> with load balanced engine maps.
>>
>> v2:
>>   * Lift restriction to only use load balancer when enabled in context map.
>>     (Chris)
> 
> You didn't fancy going all out and say:
> B.1.DEFAULT.VCS1|VCS2
> ?
> 
> If you are happy with the current code, it looks to do what you want,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks. I wouldn't say I am happy in the sense of being really proud of
it. But as a tool by hackers for hackers, which grew organically and
always as a second-priority thing, it seems to work for now and is able
to exercise the new uAPI and scheduling paths.

So there is scope to tidy, and it will certainly need more work in the
future (not least per-engine calibration), but I need to call it done
for a while at some reasonable point, and it feels like that should be now.

The proof of the pudding is that I think you found it useful when
benchmarking the semaphore code and related issues. So it at least
continues to provide the same simulated workloads over the new uAPI.

Regards,

Tvrtko

* Re: [igt-dev] [PATCH i-g-t 04/15] gem_wsim: Engine map load balance command
@ 2019-05-23 13:58       ` Tvrtko Ursulin
  0 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-23 13:58 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin


On 23/05/2019 14:25, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-05-22 16:57:09)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> A new workload command for enabling a load balanced context map (aka
>> Virtual Engine). Example usage:
>>
>>    B.1
>>
>> This turns on load balancing for context one, assuming it has already been
>> configured with an engine map. Only the DEFAULT engine specifier can be used
>> with load balanced engine maps.
>>
>> v2:
>>   * Lift restriction to only use load balancer when enabled in context map.
>>     (Chris)
> 
> You didn't fancy going all out and say:
> B.1.DEFAULT.VCS1|VCS2
> ?
> 
> If you are happy with the current code, it looks to do what you want,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks. I wouldn't say I am happy in the sense of being really proud of
it. But as a tool by hackers for hackers, which grew organically and
always as a second-priority thing, it seems to work for now and is able
to exercise the new uAPI and scheduling paths.

So there is scope to tidy, and it will certainly need more work in the
future (not least per-engine calibration), but I need to call it done
for a while at some reasonable point, and it feels like that should be now.

The proof of the pudding is that I think you found it useful when
benchmarking the semaphore code and related issues. So it at least
continues to provide the same simulated workloads over the new uAPI.

Regards,

Tvrtko

* Re: [PATCH i-g-t 07/15] gem_wsim: Infinite batch support
  2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
@ 2019-05-23 14:05     ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 14:05 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Tvrtko Ursulin (2019-05-22 16:57:12)
> -static void
> +static unsigned int
>  terminate_bb(struct w_step *w, unsigned int flags)
>  {
>         const uint32_t bbe = 0xa << 23;
>         unsigned long mmap_start, mmap_len;
>         unsigned long batch_start = w->bb_sz;
> +       unsigned int r = 0;
>         uint32_t *ptr, *cs;
>  
>         igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
> @@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         if (flags & RT)
>                 batch_start -= 12 * sizeof(uint32_t);
>  
> +       if (w->unbound_duration)
> +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +
>         mmap_start = rounddown(batch_start, PAGE_SIZE);
>         mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
>  
> @@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
>         cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
>  
> +       if (w->unbound_duration) {
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
> +               batch_start += 4 * sizeof(uint32_t);
> +
> +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
> +               w->recursive_bb_start = cs;
> +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
> +               *cs++ = 0;
> +               *cs++ = 0;

delta is zero, and mmap_len is consistent, so yup this gives a page of
nops before looping.

> +       }
> +
>         if (flags & SEQNO) {
> -               w->reloc[0].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -860,7 +890,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         }
>  
>         if (flags & RT) {
> -               w->reloc[1].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -870,7 +900,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 w->rt0_value = cs;
>                 *cs++ = 0;
>  
> -               w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
> @@ -879,7 +909,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 *cs++ = 0;
>                 *cs++ = 0;
>  
> -               w->reloc[3].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -891,6 +921,8 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         }
>  
>         *cs = bbe;
> +
> +       return r;
>  }
>  
>  static const unsigned int eb_engine_map[NUM_ENGINES] = {
> @@ -1011,19 +1043,22 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
>                 }
>         }
>  
> -       w->bb_sz = get_bb_sz(w->duration.max);
> -       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
> +       if (w->unbound_duration)
> +               /* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +               w->bb_sz = max(PAGE_SIZE, get_bb_sz(w->preempt_us)) +
> +                          (1 + 3) * sizeof(uint32_t);
> +       else
> +               w->bb_sz = get_bb_sz(w->duration.max);
> +       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
>         init_bb(w, flags);
> -       terminate_bb(w, flags);
> +       w->obj[j].relocation_count = terminate_bb(w, flags);
>  
> -       if (flags & SEQNO) {
> +       if (w->obj[j].relocation_count) {
>                 w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
> -               if (flags & RT)
> -                       w->obj[j].relocation_count = 4;
> -               else
> -                       w->obj[j].relocation_count = 1;
>                 for (i = 0; i < w->obj[j].relocation_count; i++)
>                         w->reloc[i].target_handle = 1;
> +               if (w->unbound_duration)
> +                       w->reloc[0].target_handle = j;
>         }

That flows much better.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
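
The reason the running counter "flows much better" can be seen in a small
standalone C sketch of the relocation bookkeeping in the quoted diff. It is
a simplified model, not the patch itself: count_relocs() is a hypothetical
name and the SEQNO/RT bit values below are made up for the example. The
counts follow the diff: one relocation for the looping
MI_BATCH_BUFFER_START, one for the seqno write, and three for the RT block.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag bits for this sketch only; the real values are not shown. */
#define SEQNO (1u << 0)
#define RT    (1u << 1)

/*
 * Model of the "r" counter in terminate_bb(): each optional tail section
 * claims the next free relocation slot instead of a hard-coded index, so
 * the returned total is correct for every combination of options.
 */
unsigned int count_relocs(unsigned int flags, bool unbound_duration)
{
	unsigned int r = 0;

	/* Same precondition as the igt_assert in the diff: RT needs SEQNO. */
	assert(!(flags & RT) || (flags & SEQNO));

	if (unbound_duration)
		r++;	/* address of the looping MI_BATCH_BUFFER_START */
	if (flags & SEQNO)
		r++;	/* seqno MI_STORE_DWORD_IMM */
	if (flags & RT)
		r += 3;	/* two MI_STORE_DWORD_IMM + one MI_STORE_REG_MEM */

	return r;
}
```

Without the infinite batch the totals are 1 (SEQNO) and 4 (SEQNO plus RT),
matching the constants the old code hard-coded; with it, every later slot
simply shifts up by one.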

* Re: [igt-dev] [PATCH i-g-t 07/15] gem_wsim: Infinite batch support
@ 2019-05-23 14:05     ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 14:05 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Tvrtko Ursulin (2019-05-22 16:57:12)
> -static void
> +static unsigned int
>  terminate_bb(struct w_step *w, unsigned int flags)
>  {
>         const uint32_t bbe = 0xa << 23;
>         unsigned long mmap_start, mmap_len;
>         unsigned long batch_start = w->bb_sz;
> +       unsigned int r = 0;
>         uint32_t *ptr, *cs;
>  
>         igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
> @@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         if (flags & RT)
>                 batch_start -= 12 * sizeof(uint32_t);
>  
> +       if (w->unbound_duration)
> +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +
>         mmap_start = rounddown(batch_start, PAGE_SIZE);
>         mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
>  
> @@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
>         cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
>  
> +       if (w->unbound_duration) {
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
> +               batch_start += 4 * sizeof(uint32_t);
> +
> +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
> +               w->recursive_bb_start = cs;
> +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
> +               *cs++ = 0;
> +               *cs++ = 0;

delta is zero, and mmap_len is consistent, so yup this gives a page of
nops before looping.

> +       }
> +
>         if (flags & SEQNO) {
> -               w->reloc[0].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -860,7 +890,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         }
>  
>         if (flags & RT) {
> -               w->reloc[1].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -870,7 +900,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 w->rt0_value = cs;
>                 *cs++ = 0;
>  
> -               w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */
> @@ -879,7 +909,7 @@ terminate_bb(struct w_step *w, unsigned int flags)
>                 *cs++ = 0;
>                 *cs++ = 0;
>  
> -               w->reloc[3].offset = batch_start + sizeof(uint32_t);
> +               w->reloc[r++].offset = batch_start + sizeof(uint32_t);
>                 batch_start += 4 * sizeof(uint32_t);
>  
>                 *cs++ = MI_STORE_DWORD_IMM;
> @@ -891,6 +921,8 @@ terminate_bb(struct w_step *w, unsigned int flags)
>         }
>  
>         *cs = bbe;
> +
> +       return r;
>  }
>  
>  static const unsigned int eb_engine_map[NUM_ENGINES] = {
> @@ -1011,19 +1043,22 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags)
>                 }
>         }
>  
> -       w->bb_sz = get_bb_sz(w->duration.max);
> -       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz);
> +       if (w->unbound_duration)
> +               /* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */
> +               w->bb_sz = max(PAGE_SIZE, get_bb_sz(w->preempt_us)) +
> +                          (1 + 3) * sizeof(uint32_t);
> +       else
> +               w->bb_sz = get_bb_sz(w->duration.max);
> +       w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0));
>         init_bb(w, flags);
> -       terminate_bb(w, flags);
> +       w->obj[j].relocation_count = terminate_bb(w, flags);
>  
> -       if (flags & SEQNO) {
> +       if (w->obj[j].relocation_count) {
>                 w->obj[j].relocs_ptr = to_user_pointer(&w->reloc);
> -               if (flags & RT)
> -                       w->obj[j].relocation_count = 4;
> -               else
> -                       w->obj[j].relocation_count = 1;
>                 for (i = 0; i < w->obj[j].relocation_count; i++)
>                         w->reloc[i].target_handle = 1;
> +               if (w->unbound_duration)
> +                       w->reloc[0].target_handle = j;
>         }

That flows much better.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

* Re: [PATCH i-g-t 07/15] gem_wsim: Infinite batch support
  2019-05-23 14:05     ` [igt-dev] " Chris Wilson
@ 2019-05-23 14:09       ` Chris Wilson
  -1 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 14:09 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx

Quoting Chris Wilson (2019-05-23 15:05:05)
> Quoting Tvrtko Ursulin (2019-05-22 16:57:12)
> > -static void
> > +static unsigned int
> >  terminate_bb(struct w_step *w, unsigned int flags)
> >  {
> >         const uint32_t bbe = 0xa << 23;
> >         unsigned long mmap_start, mmap_len;
> >         unsigned long batch_start = w->bb_sz;
> > +       unsigned int r = 0;
> >         uint32_t *ptr, *cs;
> >  
> >         igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
> > @@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
> >         if (flags & RT)
> >                 batch_start -= 12 * sizeof(uint32_t);
> >  
> > +       if (w->unbound_duration)
> > +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
> > +
> >         mmap_start = rounddown(batch_start, PAGE_SIZE);
> >         mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
> >  
> > @@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
> >         ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
> >         cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
> >  
> > +       if (w->unbound_duration) {
> > +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
> > +               batch_start += 4 * sizeof(uint32_t);
> > +
> > +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
> > +               w->recursive_bb_start = cs;
> > +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
> > +               *cs++ = 0;
> > +               *cs++ = 0;
> 
> delta is zero, and mmap_len is consistent, so yup this gives a page of
> nops before looping.

Waitasec... Only emitting ARB_CHK if preempt_us is set. You want an
unbounded unpreemptible batch?

I suppose it's not without use (although I plan for it to be killed
quickly and reset), but I would not advise for preempt_us to be 0 by
default then.
-Chris
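
The concern can be restated as a small C sketch of how the first dword of
the loop tail is chosen in the quoted hunk. The helper names are
hypothetical. The MI_ARB_CHK encoding is taken from the diff (0x5 << 23),
and MI_NOOP is assumed to encode as 0. With preempt_us left at zero the loop
contains no arbitration check, so nothing inside the batch gives the
hardware a point at which to preempt it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MI_NOOP    0u		/* assumed encoding */
#define MI_ARB_CHK (0x5u << 23)	/* encoding as written in the diff */

/*
 * Write the dword that precedes the looping MI_BATCH_BUFFER_START and
 * return the number of dwords emitted (always 1).
 */
unsigned int emit_loop_head(uint32_t *cs, unsigned int preempt_us)
{
	assert(cs);
	cs[0] = preempt_us ? MI_ARB_CHK : MI_NOOP;
	return 1;
}

/* True if the loop built for this setting contains an arbitration check. */
bool loop_is_preemptible(unsigned int preempt_us)
{
	uint32_t head;

	emit_loop_head(&head, preempt_us);
	return head == MI_ARB_CHK;
}
```

In this model a zero preempt_us produces a plain no-op, which is the
unbounded, unpreemptible case questioned above.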

* Re: [igt-dev] [PATCH i-g-t 07/15] gem_wsim: Infinite batch support
@ 2019-05-23 14:09       ` Chris Wilson
  0 siblings, 0 replies; 52+ messages in thread
From: Chris Wilson @ 2019-05-23 14:09 UTC (permalink / raw)
  To: Tvrtko Ursulin, igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin

Quoting Chris Wilson (2019-05-23 15:05:05)
> Quoting Tvrtko Ursulin (2019-05-22 16:57:12)
> > -static void
> > +static unsigned int
> >  terminate_bb(struct w_step *w, unsigned int flags)
> >  {
> >         const uint32_t bbe = 0xa << 23;
> >         unsigned long mmap_start, mmap_len;
> >         unsigned long batch_start = w->bb_sz;
> > +       unsigned int r = 0;
> >         uint32_t *ptr, *cs;
> >  
> >         igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
> > @@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
> >         if (flags & RT)
> >                 batch_start -= 12 * sizeof(uint32_t);
> >  
> > +       if (w->unbound_duration)
> > +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
> > +
> >         mmap_start = rounddown(batch_start, PAGE_SIZE);
> >         mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
> >  
> > @@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
> >         ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
> >         cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
> >  
> > +       if (w->unbound_duration) {
> > +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
> > +               batch_start += 4 * sizeof(uint32_t);
> > +
> > +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
> > +               w->recursive_bb_start = cs;
> > +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
> > +               *cs++ = 0;
> > +               *cs++ = 0;
> 
> delta is zero, and mmap_len is consistent, so yup this gives a page of
> nops before looping.

Waitasec... Only emitting ARB_CHK if preempt_us is set. You want an
unbounded unpreemptible batch?

I suppose it's not without use (although I plan for it to be killed
quickly and reset), but I would not advise for preempt_us to be 0 by
default then.
-Chris

* [igt-dev] ✓ Fi.CI.IGT: success for Remaining bits of Virtual Engine tooling
  2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
                   ` (16 preceding siblings ...)
  (?)
@ 2019-05-23 14:15 ` Patchwork
  -1 siblings, 0 replies; 52+ messages in thread
From: Patchwork @ 2019-05-23 14:15 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: igt-dev

== Series Details ==

Series: Remaining bits of Virtual Engine tooling
URL   : https://patchwork.freedesktop.org/series/60992/
State : success

== Summary ==

CI Bug Log - changes from IGT_5005_full -> IGTPW_3030_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/60992/revisions/1/mbox/

Known issues
------------

  Here are the changes found in IGTPW_3030_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_schedule@preempt-queue-contexts-chain-bsd:
    - shard-glk:          [PASS][1] -> [INCOMPLETE][2] ([fdo#103359] / [k.org#198133])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-glk4/igt@gem_exec_schedule@preempt-queue-contexts-chain-bsd.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-glk2/igt@gem_exec_schedule@preempt-queue-contexts-chain-bsd.html

  * igt@gem_tiled_swapping@non-threaded:
    - shard-hsw:          [PASS][3] -> [FAIL][4] ([fdo#108686])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-hsw6/igt@gem_tiled_swapping@non-threaded.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-hsw2/igt@gem_tiled_swapping@non-threaded.html
    - shard-kbl:          [PASS][5] -> [FAIL][6] ([fdo#108686])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-kbl1/igt@gem_tiled_swapping@non-threaded.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-kbl4/igt@gem_tiled_swapping@non-threaded.html

  * igt@gem_workarounds@suspend-resume:
    - shard-apl:          [PASS][7] -> [DMESG-WARN][8] ([fdo#108566]) +4 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-apl4/igt@gem_workarounds@suspend-resume.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-apl3/igt@gem_workarounds@suspend-resume.html

  * igt@gem_workarounds@suspend-resume-context:
    - shard-kbl:          [PASS][9] -> [INCOMPLETE][10] ([fdo#103665])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-kbl1/igt@gem_workarounds@suspend-resume-context.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-kbl4/igt@gem_workarounds@suspend-resume-context.html

  * igt@kms_cursor_edge_walk@pipe-a-128x128-left-edge:
    - shard-snb:          [PASS][11] -> [SKIP][12] ([fdo#109271] / [fdo#109278])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-snb5/igt@kms_cursor_edge_walk@pipe-a-128x128-left-edge.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-snb7/igt@kms_cursor_edge_walk@pipe-a-128x128-left-edge.html

  * igt@kms_flip@2x-flip-vs-expired-vblank:
    - shard-glk:          [PASS][13] -> [FAIL][14] ([fdo#105363])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-glk8/igt@kms_flip@2x-flip-vs-expired-vblank.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-glk1/igt@kms_flip@2x-flip-vs-expired-vblank.html

  * igt@kms_frontbuffer_tracking@fbc-indfb-scaledprimary:
    - shard-iclb:         [PASS][15] -> [FAIL][16] ([fdo#103167]) +6 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-iclb1/igt@kms_frontbuffer_tracking@fbc-indfb-scaledprimary.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-iclb5/igt@kms_frontbuffer_tracking@fbc-indfb-scaledprimary.html

  * igt@kms_plane_lowres@pipe-a-tiling-x:
    - shard-iclb:         [PASS][17] -> [FAIL][18] ([fdo#103166])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-iclb2/igt@kms_plane_lowres@pipe-a-tiling-x.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-iclb4/igt@kms_plane_lowres@pipe-a-tiling-x.html

  * igt@kms_psr@psr2_dpms:
    - shard-iclb:         [PASS][19] -> [SKIP][20] ([fdo#109441]) +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-iclb2/igt@kms_psr@psr2_dpms.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-iclb5/igt@kms_psr@psr2_dpms.html

  * igt@kms_setmode@basic:
    - shard-apl:          [PASS][21] -> [FAIL][22] ([fdo#99912])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-apl4/igt@kms_setmode@basic.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-apl8/igt@kms_setmode@basic.html

  * igt@perf_pmu@rc6-runtime-pm-long:
    - shard-iclb:         [PASS][23] -> [FAIL][24] ([fdo#105010])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-iclb1/igt@perf_pmu@rc6-runtime-pm-long.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-iclb7/igt@perf_pmu@rc6-runtime-pm-long.html

  
#### Possible fixes ####

  * igt@i915_suspend@debugfs-reader:
    - shard-apl:          [DMESG-WARN][25] ([fdo#108566]) -> [PASS][26] +8 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-apl3/igt@i915_suspend@debugfs-reader.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-apl8/igt@i915_suspend@debugfs-reader.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-kbl:          [FAIL][27] ([fdo#102670]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-kbl1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-kbl3/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
    - shard-hsw:          [FAIL][29] ([fdo#102670]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-hsw6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-hsw1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-snb:          [INCOMPLETE][31] ([fdo#105411]) -> [PASS][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-snb1/igt@kms_flip@flip-vs-suspend-interruptible.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-snb7/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_flip@modeset-vs-vblank-race-interruptible:
    - shard-glk:          [FAIL][33] ([fdo#103060]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-glk8/igt@kms_flip@modeset-vs-vblank-race-interruptible.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-glk6/igt@kms_flip@modeset-vs-vblank-race-interruptible.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite:
    - shard-iclb:         [FAIL][35] ([fdo#103167]) -> [PASS][36] +4 similar issues
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html

  * igt@kms_psr@psr2_basic:
    - shard-iclb:         [SKIP][37] ([fdo#109441]) -> [PASS][38] +1 similar issue
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-iclb7/igt@kms_psr@psr2_basic.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-iclb2/igt@kms_psr@psr2_basic.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [FAIL][39] ([fdo#99912]) -> [PASS][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-kbl1/igt@kms_setmode@basic.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-kbl3/igt@kms_setmode@basic.html

  
#### Warnings ####

  * igt@gem_ctx_isolation@bcs0-s3:
    - shard-apl:          [WARN][41] ([fdo#110738]) -> [DMESG-WARN][42] ([fdo#108566])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-apl8/igt@gem_ctx_isolation@bcs0-s3.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-apl6/igt@gem_ctx_isolation@bcs0-s3.html

  * igt@gem_ctx_isolation@vcs0-s3:
    - shard-apl:          [DMESG-WARN][43] ([fdo#108566]) -> [WARN][44] ([fdo#110738])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5005/shard-apl3/igt@gem_ctx_isolation@vcs0-s3.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/shard-apl4/igt@gem_ctx_isolation@vcs0-s3.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102670]: https://bugs.freedesktop.org/show_bug.cgi?id=102670
  [fdo#103060]: https://bugs.freedesktop.org/show_bug.cgi?id=103060
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#105010]: https://bugs.freedesktop.org/show_bug.cgi?id=105010
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#105411]: https://bugs.freedesktop.org/show_bug.cgi?id=105411
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#108686]: https://bugs.freedesktop.org/show_bug.cgi?id=108686
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#110738]: https://bugs.freedesktop.org/show_bug.cgi?id=110738
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (7 -> 6)
------------------------------

  Missing    (1): shard-skl 


Build changes
-------------

  * IGT: IGT_5005 -> IGTPW_3030

  CI_DRM_6121: 0a029524f22ca287ec7e515edc1258e7f806750c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_3030: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/
  IGT_5005: adf9f435a795d692e30cd6eafe26eddf4993c8ff @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3030/

* Re: [PATCH i-g-t 07/15] gem_wsim: Infinite batch support
  2019-05-23 14:09       ` [igt-dev] " Chris Wilson
@ 2019-05-23 14:17         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 52+ messages in thread
From: Tvrtko Ursulin @ 2019-05-23 14:17 UTC (permalink / raw)
  To: Chris Wilson, igt-dev; +Cc: Intel-gfx


On 23/05/2019 15:09, Chris Wilson wrote:
> Quoting Chris Wilson (2019-05-23 15:05:05)
>> Quoting Tvrtko Ursulin (2019-05-22 16:57:12)
>>> -static void
>>> +static unsigned int
>>>   terminate_bb(struct w_step *w, unsigned int flags)
>>>   {
>>>          const uint32_t bbe = 0xa << 23;
>>>          unsigned long mmap_start, mmap_len;
>>>          unsigned long batch_start = w->bb_sz;
>>> +       unsigned int r = 0;
>>>          uint32_t *ptr, *cs;
>>>   
>>>          igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT));
>>> @@ -838,6 +854,9 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>>          if (flags & RT)
>>>                  batch_start -= 12 * sizeof(uint32_t);
>>>   
>>> +       if (w->unbound_duration)
>>> +               batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */
>>> +
>>>          mmap_start = rounddown(batch_start, PAGE_SIZE);
>>>          mmap_len = ALIGN(w->bb_sz - mmap_start, PAGE_SIZE);
>>>   
>>> @@ -847,8 +866,19 @@ terminate_bb(struct w_step *w, unsigned int flags)
>>>          ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE);
>>>          cs = (uint32_t *)((char *)ptr + batch_start - mmap_start);
>>>   
>>> +       if (w->unbound_duration) {
>>> +               w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t);
>>> +               batch_start += 4 * sizeof(uint32_t);
>>> +
>>> +               *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP;
>>> +               w->recursive_bb_start = cs;
>>> +               *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
>>> +               *cs++ = 0;
>>> +               *cs++ = 0;
>>
>> delta is zero, and mmap_len is consistent, so yup this gives a page of
>> nops before looping.
> 
> Waitasec... Only emitting ARB_CHK if preempt_us is set. You want an
> unbounded unpreemptible batch?
> 
> I suppose it's not without use (although I plan for it to be killed
> quickly and reset), but I would not advise for preempt_us to be 0 by
> default then.

The default is 100us (search for "w->preempt_us = 100;"), so by default
the infinite batch does get the MI_ARB_CHK. Individual workloads can then
turn it off; for instance frame-split-60fps.wsim disables preemption
("X.1.0", "X.2.0") to simulate media better. AFAIR you actually asked for
preemptible by default.

Now that I think of it... I really need a command line switch to control
it globally. Or better, per engine class, to simulate media better.
Blast... feature creep never ends.
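
For reference, here is a minimal standalone sketch of the tail layout
discussed in the quoted hunk. It is not the gem_wsim code: emit_loop_tail,
struct tail_layout and SKETCH_PAGE_SIZE are hypothetical names, the
4096-byte page size and the 0x31 opcode for MI_BATCH_BUFFER_START are
assumptions, and it ignores the extra dwords reserved for RT/SEQNO and for
the batch buffer end. It only shows where the four loop dwords land, which
dword would need the relocation, and that a zero preempt_us swaps
MI_ARB_CHK for MI_NOOP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Command opcodes; MI_ARB_CHK matches the quoted patch, the rest are assumed. */
#define MI_NOOP               0u
#define MI_ARB_CHK            (0x5u << 23)
#define MI_BATCH_BUFFER_START (0x31u << 23)

#define SKETCH_PAGE_SIZE 4096ul /* assumption: 4 KiB pages */

/* Hypothetical result type: byte offsets relative to the start of the batch. */
struct tail_layout {
	unsigned long batch_start;  /* first dword of the loop tail */
	unsigned long reloc_offset; /* address dword that needs a relocation */
	unsigned long mmap_start;   /* page-aligned start of the mapping */
	unsigned long mmap_len;     /* page-aligned length of the mapping */
};

/*
 * Write the four-dword "loop back to the start" tail at the end of a batch
 * of bb_sz bytes and report where everything ended up.
 */
static struct tail_layout
emit_loop_tail(uint32_t *bb, unsigned long bb_sz, unsigned int preempt_us)
{
	struct tail_layout t;
	uint32_t *cs;

	assert(bb_sz >= 4 * sizeof(uint32_t));
	assert(bb_sz % sizeof(uint32_t) == 0);

	/* Reserve MI_ARB_CHK/MI_NOOP + MI_BATCH_BUFFER_START + 2 address dwords. */
	t.batch_start = bb_sz - 4 * sizeof(uint32_t);

	/* Same arithmetic as rounddown()/ALIGN() in the quoted hunk. */
	t.mmap_start = t.batch_start / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
	t.mmap_len = (bb_sz - t.mmap_start + SKETCH_PAGE_SIZE - 1) /
		     SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;

	/* The address follows the arbitration dword and the command dword. */
	t.reloc_offset = t.batch_start + 2 * sizeof(uint32_t);

	cs = bb + t.batch_start / sizeof(uint32_t);
	*cs++ = preempt_us ? MI_ARB_CHK : MI_NOOP; /* preemption point or not */
	*cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1;
	*cs++ = 0; /* address low, patched via the relocation (delta 0) */
	*cs++ = 0; /* address high */

	return t;
}
```

With an 8192-byte batch and 4096-byte pages the tail starts at byte 8176,
the relocation lands at byte 8184, and only the last page gets mapped.
Everything before the tail is left untouched, which is why the batch runs
through nops before it loops.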

Regards,

Tvrtko


end of thread, other threads:[~2019-05-23 14:17 UTC | newest]

Thread overview: 52+ messages
2019-05-22 15:57 [PATCH i-g-t 00/15] Remaining bits of Virtual Engine tooling Tvrtko Ursulin
2019-05-22 15:57 ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 01/15] gem_wsim: Engine map support Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-23 13:23   ` Chris Wilson
2019-05-23 13:23     ` [igt-dev] [Intel-gfx] " Chris Wilson
2019-05-22 15:57 ` [PATCH i-g-t 02/15] gem_wsim: Save some lines by changing to implicit NULL checking Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 03/15] gem_wsim: Compact int command parsing with a macro Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 04/15] gem_wsim: Engine map load balance command Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-23 13:25   ` Chris Wilson
2019-05-23 13:25     ` Chris Wilson
2019-05-23 13:58     ` Tvrtko Ursulin
2019-05-23 13:58       ` Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 05/15] gem_wsim: Engine bond command Tvrtko Ursulin
2019-05-22 15:57   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 06/15] gem_wsim: Some more example workloads Tvrtko Ursulin
2019-05-22 15:57   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 07/15] gem_wsim: Infinite batch support Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-23 14:05   ` Chris Wilson
2019-05-23 14:05     ` [igt-dev] " Chris Wilson
2019-05-23 14:09     ` Chris Wilson
2019-05-23 14:09       ` [igt-dev] " Chris Wilson
2019-05-23 14:17       ` Tvrtko Ursulin
2019-05-23 14:17         ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 08/15] gem_wsim: Command line switch for specifying low slice count workloads Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 09/15] gem_wsim: Per context SSEU control Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 10/15] gem_wsim: Allow RCS virtual engine with " Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 11/15] gem_wsim: Consolidate engine assignments into helpers Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-23 13:26   ` Chris Wilson
2019-05-23 13:26     ` Chris Wilson
2019-05-22 15:57 ` [PATCH i-g-t 12/15] gem_wsim: Discover engines Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 13/15] gem_wsim: Support Icelake parts Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 15:57 ` [PATCH i-g-t 14/15] gem_wsim: Fix prng usage Tvrtko Ursulin
2019-05-22 15:57   ` [igt-dev] " Tvrtko Ursulin
2019-05-22 16:51   ` Chris Wilson
2019-05-22 16:51     ` Chris Wilson
2019-05-22 15:57 ` [PATCH i-g-t 15/15] gem_wsim: Allow random seed control Tvrtko Ursulin
2019-05-22 15:57   ` [Intel-gfx] " Tvrtko Ursulin
2019-05-22 16:52   ` Chris Wilson
2019-05-22 16:52     ` [igt-dev] " Chris Wilson
2019-05-22 17:21 ` [igt-dev] ✓ Fi.CI.BAT: success for Remaining bits of Virtual Engine tooling Patchwork
2019-05-23 14:15 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
