* [PATCH 00/21] drm/i915/gem: ioctl clean-ups
@ 2021-04-23 22:31 ` Jason Ekstrand
  0 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Overview:
---------

This patch series attempts to clean up some of the IOCTL mess we've created
over the last few years, the most egregious bit being context mutability.
In summary, this series:

 1. Drops two never-used context params: RINGSIZE and NO_ZEROMAP
 2. Drops the entire CONTEXT_CLONE API
 3. Implements SINGLE_TIMELINE with a syncobj instead of actually sharing
    intel_timeline between engines.
 4. Adds a few sanity restrictions to the balancing/bonding API.
 5. Implements a proto-ctx mechanism so that the engine set and VM can only
    be set early on in the lifetime of a context, before anything ever
    executes on it.  This effectively makes the VM and engine set
    immutable.

This series has been tested with IGT as well as with Iris, ANV, and the
Intel media driver doing an 8K decode (which uses bonding/balancing).  I've
also done quite a bit of git archeology to ensure that nothing in here will
break anything that's already shipped at some point in history.  It's
possible I've missed something, but I've dug quite a bit.


Details and motivation:
-----------------------

In very broad strokes, there's an effort going on right now within Intel to
try and clean up and simplify i915 anywhere we can.  We obviously don't
want to break any shipping userspace but, as can be seen by this series,
there's a lot i915 theoretically supports which userspace doesn't actually
need.  Some of it, like the two context params dropped here, was simply an
oversight: we went through the usual API review process and merged the
i915 bits, but the userspace bits never landed for some reason.

Not all are so innocent, however.  For instance, there's an entire context
cloning API which allows one to create a context with certain parameters
"cloned" from some other context.  This entire API has never been used by
any userspace except IGT and there were never patches to any other
userspace to use it.  It never should have landed.  Also, when we added
support for setting explicit engine sets and sharing VMs across contexts,
people decided to do so via SET_CONTEXT_PARAM.  While this allowed them to
re-use the existing API, it did so at the cost of making those states
mutable, which leads to a plethora of potential race conditions.  There
were even IGT tests merged to cover some of these:

 - gem_vm_create@async-destroy and gem_vm_create@destroy-race which test
   swapping out the VM on a running context.

 - gem_ctx_persistence@replace* which test whether a client can escape a
   non-persistent context by submitting a hanging batch and then swapping
   out the engine set before the hang is detected.

 - api_intel_bb@bb-with-vm which tests that intel_bb_assign_vm works
   properly.  This API is never used by any other IGT test.

There is also an entire deferred flush and set-state framework in
i915_gem_context.c which exists for safely swapping out the VM while there
is work in-flight on a context.

So, clearly people knew that this API was inherently racy and difficult to
implement, but they landed it anyway.  Why?  The best explanation I've been
given is that it makes the API more "unified" or "symmetric" for this
stuff to go through SET_CONTEXT_PARAM.  It's not because any userspace
actually wants to be able to swap out the VM or the set of engines on a
running context.  That would be utterly insane.

This patch series cleans up this particular mess by introducing the concept
of an i915_gem_proto_context data structure which contains context creation
information.  When you initially call GEM_CONTEXT_CREATE, a proto-context
is created instead of an actual context.  Then, the first time something is
done on the context besides SET_CONTEXT_PARAM, an actual context is
created.  This allows us to keep supporting the old drivers which use
SET_CONTEXT_PARAM to set up the engine set (see also media) while ensuring
that, once you have an i915_gem_context, the VM and the engine set are
immutable state.

Eventually, there are more clean-ups I'd like to do on top of this which
should make working with contexts inside i915 simpler and safer:

 1. Move the GEM handle -> vma LUT from i915_gem_context into either
    i915_ppgtt or drm_i915_file_private depending on whether or not the
    hardware has a full PPGTT.

 2. Move the delayed context destruction code into intel_context or a
    per-engine wrapper struct rather than i915_gem_context.

 3. Get rid of the separation between context close and context destroy

 4. Get rid of the RCU on i915_gem_context

However, these should probably be done as a separate patch series as this
one is already starting to get longish, especially if you consider the 89
IGT patches that go along with it.

Test-with: 20210423214853.876911-1-jason@jlekstrand.net

Jason Ekstrand (21):
  drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE
  drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP
  drm/i915/gem: Set the watchdog timeout directly in
    intel_context_set_gem
  drm/i915/gem: Return void from context_apply_all
  drm/i915: Drop the CONTEXT_CLONE API
  drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  drm/i915: Drop getparam support for I915_CONTEXT_PARAM_ENGINES
  drm/i915/gem: Disallow bonding of virtual engines
  drm/i915/gem: Disallow creating contexts with too many engines
  drm/i915/request: Remove the hook from await_execution
  drm/i915: Stop manually RCU banging in reset_stats_ioctl
  drm/i915/gem: Add a separate validate_priority helper
  drm/i915/gem: Add an intermediate proto_context struct
  drm/i915/gem: Return an error ptr from context_lookup
  drm/i915/gt: Drop i915_address_space::file
  drm/i915/gem: Delay context creation
  drm/i915/gem: Don't allow changing the VM on running contexts
  drm/i915/gem: Don't allow changing the engine set on running contexts
  drm/i915/selftests: Take a VM in kernel_context()
  i915/gem/selftests: Assign the VM at context creation in
    igt_shared_ctx_exec
  drm/i915/gem: Roll all of context creation together

 drivers/gpu/drm/i915/Makefile                 |    1 -
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 2967 +++++++----------
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |    3 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   68 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   31 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |  127 +-
 .../gpu/drm/i915/gem/selftests/mock_context.c |   62 +-
 .../gpu/drm/i915/gem/selftests/mock_context.h |    4 +-
 drivers/gpu/drm/i915/gt/intel_context_param.c |   63 -
 drivers/gpu/drm/i915/gt/intel_context_param.h |    6 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |    7 -
 .../drm/i915/gt/intel_execlists_submission.c  |  100 -
 .../drm/i915/gt/intel_execlists_submission.h  |    4 -
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   10 -
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  249 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |    2 +-
 drivers/gpu/drm/i915/i915_drv.h               |   23 +-
 drivers/gpu/drm/i915/i915_perf.c              |    4 +-
 drivers/gpu/drm/i915/i915_request.c           |   42 +-
 drivers/gpu/drm/i915/i915_request.h           |    4 +-
 .../drm/i915/selftests/i915_mock_selftests.h  |    1 -
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |    1 -
 include/uapi/drm/i915_drm.h                   |   40 +-
 23 files changed, 1438 insertions(+), 2381 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/gt/intel_context_param.c

-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [PATCH 01/21] drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

This reverts commit 88be76cdafc7 ("drm/i915: Allow userspace to specify
ringsize on construction").  This API was originally added for OpenCL
but the compute-runtime PR has sat open for a year without action so we
can still pull it out if we want.  I argue we should drop it for three
reasons:

 1. If the compute-runtime PR has sat open for a year, this clearly
    isn't that important.

 2. It's a very leaky API.  Ring size is an implementation detail of the
    current execlist scheduler and really only makes sense there.  It
    can't apply to the older ring-buffer scheduler on pre-execlist
    hardware because that's shared across all contexts and it won't
    apply to the GuC scheduler that's in the pipeline.

 3. Having userspace set a ring size in bytes is a bad solution to the
    problem of having too small a ring.  There is no way that userspace
    has the information to know how to properly set the ring size so
    it's just going to detect the feature and always set it to the
    maximum of 512K.  This is what the compute-runtime PR does.  The
    scheduler in i915, on the other hand, does have the information to
    make an informed choice.  It could detect if the ring size is a
    problem and grow it itself.  Or, if that's too hard, we could just
    increase the default size from 16K to 32K or even 64K instead of
    relying on userspace to do it.

Let's drop this API for now and, if someone decides they really care
about solving this problem, they can do it properly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/Makefile                 |  1 -
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 85 +------------------
 drivers/gpu/drm/i915/gt/intel_context_param.c | 63 --------------
 drivers/gpu/drm/i915/gt/intel_context_param.h |  3 -
 include/uapi/drm/i915_drm.h                   | 20 +----
 5 files changed, 4 insertions(+), 168 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/gt/intel_context_param.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index d0d936d9137bc..afa22338fa343 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -88,7 +88,6 @@ gt-y += \
 	gt/gen8_ppgtt.o \
 	gt/intel_breadcrumbs.o \
 	gt/intel_context.o \
-	gt/intel_context_param.o \
 	gt/intel_context_sseu.o \
 	gt/intel_engine_cs.o \
 	gt/intel_engine_heartbeat.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index fd8ee52e17a47..e52b85b8f923d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1335,63 +1335,6 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
 	return err;
 }
 
-static int __apply_ringsize(struct intel_context *ce, void *sz)
-{
-	return intel_context_set_ring_size(ce, (unsigned long)sz);
-}
-
-static int set_ringsize(struct i915_gem_context *ctx,
-			struct drm_i915_gem_context_param *args)
-{
-	if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
-		return -ENODEV;
-
-	if (args->size)
-		return -EINVAL;
-
-	if (!IS_ALIGNED(args->value, I915_GTT_PAGE_SIZE))
-		return -EINVAL;
-
-	if (args->value < I915_GTT_PAGE_SIZE)
-		return -EINVAL;
-
-	if (args->value > 128 * I915_GTT_PAGE_SIZE)
-		return -EINVAL;
-
-	return context_apply_all(ctx,
-				 __apply_ringsize,
-				 __intel_context_ring_size(args->value));
-}
-
-static int __get_ringsize(struct intel_context *ce, void *arg)
-{
-	long sz;
-
-	sz = intel_context_get_ring_size(ce);
-	GEM_BUG_ON(sz > INT_MAX);
-
-	return sz; /* stop on first engine */
-}
-
-static int get_ringsize(struct i915_gem_context *ctx,
-			struct drm_i915_gem_context_param *args)
-{
-	int sz;
-
-	if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
-		return -ENODEV;
-
-	if (args->size)
-		return -EINVAL;
-
-	sz = context_apply_all(ctx, __get_ringsize, NULL);
-	if (sz < 0)
-		return sz;
-
-	args->value = sz;
-	return 0;
-}
-
 int
 i915_gem_user_to_context_sseu(struct intel_gt *gt,
 			      const struct drm_i915_gem_context_param_sseu *user,
@@ -2037,11 +1980,8 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 		ret = set_persistence(ctx, args);
 		break;
 
-	case I915_CONTEXT_PARAM_RINGSIZE:
-		ret = set_ringsize(ctx, args);
-		break;
-
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
+	case I915_CONTEXT_PARAM_RINGSIZE:
 	default:
 		ret = -EINVAL;
 		break;
@@ -2069,18 +2009,6 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
 }
 
-static int copy_ring_size(struct intel_context *dst,
-			  struct intel_context *src)
-{
-	long sz;
-
-	sz = intel_context_get_ring_size(src);
-	if (sz < 0)
-		return sz;
-
-	return intel_context_set_ring_size(dst, sz);
-}
-
 static int clone_engines(struct i915_gem_context *dst,
 			 struct i915_gem_context *src)
 {
@@ -2125,12 +2053,6 @@ static int clone_engines(struct i915_gem_context *dst,
 		}
 
 		intel_context_set_gem(clone->engines[n], dst);
-
-		/* Copy across the preferred ringsize */
-		if (copy_ring_size(clone->engines[n], e->engines[n])) {
-			__free_engines(clone, n + 1);
-			goto err_unlock;
-		}
 	}
 	clone->num_engines = n;
 	i915_sw_fence_complete(&e->fence);
@@ -2490,11 +2412,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		args->value = i915_gem_context_is_persistent(ctx);
 		break;
 
-	case I915_CONTEXT_PARAM_RINGSIZE:
-		ret = get_ringsize(ctx, args);
-		break;
-
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
+	case I915_CONTEXT_PARAM_RINGSIZE:
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.c b/drivers/gpu/drm/i915/gt/intel_context_param.c
deleted file mode 100644
index 65dcd090245d6..0000000000000
--- a/drivers/gpu/drm/i915/gt/intel_context_param.c
+++ /dev/null
@@ -1,63 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include "i915_active.h"
-#include "intel_context.h"
-#include "intel_context_param.h"
-#include "intel_ring.h"
-
-int intel_context_set_ring_size(struct intel_context *ce, long sz)
-{
-	int err;
-
-	if (intel_context_lock_pinned(ce))
-		return -EINTR;
-
-	err = i915_active_wait(&ce->active);
-	if (err < 0)
-		goto unlock;
-
-	if (intel_context_is_pinned(ce)) {
-		err = -EBUSY; /* In active use, come back later! */
-		goto unlock;
-	}
-
-	if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
-		struct intel_ring *ring;
-
-		/* Replace the existing ringbuffer */
-		ring = intel_engine_create_ring(ce->engine, sz);
-		if (IS_ERR(ring)) {
-			err = PTR_ERR(ring);
-			goto unlock;
-		}
-
-		intel_ring_put(ce->ring);
-		ce->ring = ring;
-
-		/* Context image will be updated on next pin */
-	} else {
-		ce->ring = __intel_context_ring_size(sz);
-	}
-
-unlock:
-	intel_context_unlock_pinned(ce);
-	return err;
-}
-
-long intel_context_get_ring_size(struct intel_context *ce)
-{
-	long sz = (unsigned long)READ_ONCE(ce->ring);
-
-	if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
-		if (intel_context_lock_pinned(ce))
-			return -EINTR;
-
-		sz = ce->ring->size;
-		intel_context_unlock_pinned(ce);
-	}
-
-	return sz;
-}
diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
index 3ecacc675f414..dffedd983693d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_param.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
@@ -10,9 +10,6 @@
 
 #include "intel_context.h"
 
-int intel_context_set_ring_size(struct intel_context *ce, long sz);
-long intel_context_get_ring_size(struct intel_context *ce);
-
 static inline int
 intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
 {
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 6a34243a7646a..6eefbc6dec01f 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1721,24 +1721,8 @@ struct drm_i915_gem_context_param {
  */
 #define I915_CONTEXT_PARAM_PERSISTENCE	0xb
 
-/*
- * I915_CONTEXT_PARAM_RINGSIZE:
- *
- * Sets the size of the CS ringbuffer to use for logical ring contexts. This
- * applies a limit of how many batches can be queued to HW before the caller
- * is blocked due to lack of space for more commands.
- *
- * Only reliably possible to be set prior to first use, i.e. during
- * construction. At any later point, the current execution must be flushed as
- * the ring can only be changed while the context is idle. Note, the ringsize
- * can be specified as a constructor property, see
- * I915_CONTEXT_CREATE_EXT_SETPARAM, but can also be set later if required.
- *
- * Only applies to the current set of engine and lost when those engines
- * are replaced by a new mapping (see I915_CONTEXT_PARAM_ENGINES).
- *
- * Must be between 4 - 512 KiB, in intervals of page size [4 KiB].
- * Default is 16 KiB.
+/* This API has been removed.  On the off chance someone somewhere has
+ * attempted to use it, never re-use this context param number.
  */
 #define I915_CONTEXT_PARAM_RINGSIZE	0xc
 /* Must be kept compact -- no holes and well documented */
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

- * the ring can only be changed while the context is idle. Note, the ringsize
- * can be specified as a constructor property, see
- * I915_CONTEXT_CREATE_EXT_SETPARAM, but can also be set later if required.
- *
- * Only applies to the current set of engine and lost when those engines
- * are replaced by a new mapping (see I915_CONTEXT_PARAM_ENGINES).
- *
- * Must be between 4 - 512 KiB, in intervals of page size [4 KiB].
- * Default is 16 KiB.
+/* This API has been removed.  On the off chance someone somewhere has
+ * attempted to use it, never re-use this context param number.
  */
 #define I915_CONTEXT_PARAM_RINGSIZE	0xc
 /* Must be kept compact -- no holes and well documented */
-- 
2.31.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 226+ messages in thread

* [PATCH 02/21] drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

The idea behind this param is to support OpenCL drivers with relocations
because OpenCL reserves 0x0 for NULL and, if we placed memory there, it
would confuse CL kernels.  It was originally sent out as part of a patch
series including libdrm [1] and Beignet [2] support.  However, the
libdrm and Beignet patches never landed in their respective upstream
projects so this API has never been used.  It's never been used in Mesa
or any other driver, either.

Dropping this API allows us to delete a small bit of code.

[1]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067030.html
[2]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067031.html

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c      | 16 ++--------------
 .../gpu/drm/i915/gem/i915_gem_context_types.h    |  1 -
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   |  8 --------
 include/uapi/drm/i915_drm.h                      |  4 ++++
 4 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e52b85b8f923d..35bcdeddfbf3f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1922,15 +1922,6 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 	int ret = 0;
 
 	switch (args->param) {
-	case I915_CONTEXT_PARAM_NO_ZEROMAP:
-		if (args->size)
-			ret = -EINVAL;
-		else if (args->value)
-			set_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
-		else
-			clear_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
-		break;
-
 	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
 		if (args->size)
 			ret = -EINVAL;
@@ -1980,6 +1971,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 		ret = set_persistence(ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_NO_ZEROMAP:
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	case I915_CONTEXT_PARAM_RINGSIZE:
 	default:
@@ -2360,11 +2352,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	switch (args->param) {
-	case I915_CONTEXT_PARAM_NO_ZEROMAP:
-		args->size = 0;
-		args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
-		break;
-
 	case I915_CONTEXT_PARAM_GTT_SIZE:
 		args->size = 0;
 		rcu_read_lock();
@@ -2412,6 +2399,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		args->value = i915_gem_context_is_persistent(ctx);
 		break;
 
+	case I915_CONTEXT_PARAM_NO_ZEROMAP:
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	case I915_CONTEXT_PARAM_RINGSIZE:
 	default:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 340473aa70de0..5ae71ec936f7c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -129,7 +129,6 @@ struct i915_gem_context {
 	 * @user_flags: small set of booleans controlled by the user
 	 */
 	unsigned long user_flags;
-#define UCONTEXT_NO_ZEROMAP		0
 #define UCONTEXT_NO_ERROR_CAPTURE	1
 #define UCONTEXT_BANNABLE		2
 #define UCONTEXT_RECOVERABLE		3
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 297143511f99b..b812f313422a9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -290,7 +290,6 @@ struct i915_execbuffer {
 	struct intel_context *reloc_context;
 
 	u64 invalid_flags; /** Set of execobj.flags that are invalid */
-	u32 context_flags; /** Set of execobj.flags to insert from the ctx */
 
 	u64 batch_len; /** Length of batch within object */
 	u32 batch_start_offset; /** Location within object of batch */
@@ -541,9 +540,6 @@ eb_validate_vma(struct i915_execbuffer *eb,
 			entry->flags |= EXEC_OBJECT_NEEDS_GTT | __EXEC_OBJECT_NEEDS_MAP;
 	}
 
-	if (!(entry->flags & EXEC_OBJECT_PINNED))
-		entry->flags |= eb->context_flags;
-
 	return 0;
 }
 
@@ -750,10 +746,6 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	if (rcu_access_pointer(ctx->vm))
 		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
 
-	eb->context_flags = 0;
-	if (test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags))
-		eb->context_flags |= __EXEC_OBJECT_NEEDS_BIAS;
-
 	return 0;
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 6eefbc6dec01f..a0aaa8298f28d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1637,6 +1637,10 @@ struct drm_i915_gem_context_param {
 	__u32 size;
 	__u64 param;
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
+/* I915_CONTEXT_PARAM_NO_ZEROMAP has been removed.  On the off chance
+ * someone somewhere has attempted to use it, never re-use this context
+ * param number.
+ */
 #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
 #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
 #define I915_CONTEXT_PARAM_NO_ERROR_CAPTURE	0x4
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel




* [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Instead of handling it like a context param, unconditionally set it when
intel_contexts are created.  This doesn't fix anything but does simplify
the code a bit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
 drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
 3 files changed, 6 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 35bcdeddfbf3f..1091cc04a242a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
 	    intel_engine_has_timeslices(ce->engine))
 		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
 
-	intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
+	if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
+	    ctx->i915->params.request_timeout_ms) {
+		unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
+		intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
+	}
 }
 
 static void __free_engines(struct i915_gem_engines *e, unsigned int count)
@@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
 	context_apply_all(ctx, __apply_timeline, timeline);
 }
 
-static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
-{
-	return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
-}
-
-static int
-__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
-{
-	int ret;
-
-	ret = context_apply_all(ctx, __apply_watchdog,
-				(void *)(uintptr_t)timeout_us);
-	if (!ret)
-		ctx->watchdog.timeout_us = timeout_us;
-
-	return ret;
-}
-
-static void __set_default_fence_expiry(struct i915_gem_context *ctx)
-{
-	struct drm_i915_private *i915 = ctx->i915;
-	int ret;
-
-	if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
-	    !i915->params.request_timeout_ms)
-		return;
-
-	/* Default expiry for user fences. */
-	ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
-	if (ret)
-		drm_notice(&i915->drm,
-			   "Failed to configure default fence expiry! (%d)",
-			   ret);
-}
-
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 {
@@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 		intel_timeline_put(timeline);
 	}
 
-	__set_default_fence_expiry(ctx);
-
 	trace_i915_context_create(ctx);
 
 	return ctx;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 5ae71ec936f7c..676592e27e7d2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -153,10 +153,6 @@ struct i915_gem_context {
 	 */
 	atomic_t active_count;
 
-	struct {
-		u64 timeout_us;
-	} watchdog;
-
 	/**
 	 * @hang_timestamp: The last time(s) this context caused a GPU hang
 	 */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
index dffedd983693d..0c69cb42d075c 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_param.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
@@ -10,11 +10,10 @@
 
 #include "intel_context.h"
 
-static inline int
+static inline void
 intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
 {
 	ce->watchdog.timeout_us = timeout_us;
-	return 0;
 }
 
 #endif /* INTEL_CONTEXT_PARAM_H */
-- 
2.31.1



* [PATCH 04/21] drm/i915/gem: Return void from context_apply_all
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

None of the callbacks we use with it return an error code anymore; they
all return 0 unconditionally.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 26 +++++++--------------
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 1091cc04a242a..8a77855123cec 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -718,32 +718,25 @@ __context_engines_await(const struct i915_gem_context *ctx,
 	return engines;
 }
 
-static int
+static void
 context_apply_all(struct i915_gem_context *ctx,
-		  int (*fn)(struct intel_context *ce, void *data),
+		  void (*fn)(struct intel_context *ce, void *data),
 		  void *data)
 {
 	struct i915_gem_engines_iter it;
 	struct i915_gem_engines *e;
 	struct intel_context *ce;
-	int err = 0;
 
 	e = __context_engines_await(ctx, NULL);
-	for_each_gem_engine(ce, e, it) {
-		err = fn(ce, data);
-		if (err)
-			break;
-	}
+	for_each_gem_engine(ce, e, it)
+		fn(ce, data);
 	i915_sw_fence_complete(&e->fence);
-
-	return err;
 }
 
-static int __apply_ppgtt(struct intel_context *ce, void *vm)
+static void __apply_ppgtt(struct intel_context *ce, void *vm)
 {
 	i915_vm_put(ce->vm);
 	ce->vm = i915_vm_get(vm);
-	return 0;
 }
 
 static struct i915_address_space *
@@ -783,10 +776,9 @@ static void __set_timeline(struct intel_timeline **dst,
 		intel_timeline_put(old);
 }
 
-static int __apply_timeline(struct intel_context *ce, void *timeline)
+static void __apply_timeline(struct intel_context *ce, void *timeline)
 {
 	__set_timeline(&ce->timeline, timeline);
-	return 0;
 }
 
 static void __assign_timeline(struct i915_gem_context *ctx,
@@ -1842,19 +1834,17 @@ set_persistence(struct i915_gem_context *ctx,
 	return __context_set_persistence(ctx, args->value);
 }
 
-static int __apply_priority(struct intel_context *ce, void *arg)
+static void __apply_priority(struct intel_context *ce, void *arg)
 {
 	struct i915_gem_context *ctx = arg;
 
 	if (!intel_engine_has_timeslices(ce->engine))
-		return 0;
+		return;
 
 	if (ctx->sched.priority >= I915_PRIORITY_NORMAL)
 		intel_context_set_use_semaphores(ce);
 	else
 		intel_context_clear_use_semaphores(ce);
-
-	return 0;
 }
 
 static int set_priority(struct i915_gem_context *ctx,
-- 
2.31.1



* [PATCH 05/21] drm/i915: Drop the CONTEXT_CLONE API
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand, Tvrtko Ursulin

This API allows one context to grab bits out of another context upon
creation.  It can be used as a short-cut for setparam(getparam()) for
things like I915_CONTEXT_PARAM_VM.  However, it's never been used by any
real userspace.  It's used by a few IGT tests and that's it.  Since it
doesn't add any real value (most of the stuff you can CLONE you can copy
in other ways), drop it.

There is one thing that this API allows you to clone which you cannot
clone via getparam/setparam: timelines.  However, timelines are an
implementation detail of i915 and not really something that needs to be
exposed to userspace.  Also, sharing timelines between contexts isn't
obviously useful and supporting it has the potential to complicate i915
internally.  It also doesn't add any functionality that the client can't
get in other ways.  If a client really wants a shared timeline, they can
use a syncobj and set it as an in and out fence on every submit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 199 +-------------------
 include/uapi/drm/i915_drm.h                 |  16 +-
 2 files changed, 6 insertions(+), 209 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 8a77855123cec..2c2fefa912805 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1958,207 +1958,14 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
 }
 
-static int clone_engines(struct i915_gem_context *dst,
-			 struct i915_gem_context *src)
+static int invalid_ext(struct i915_user_extension __user *ext, void *data)
 {
-	struct i915_gem_engines *clone, *e;
-	bool user_engines;
-	unsigned long n;
-
-	e = __context_engines_await(src, &user_engines);
-	if (!e)
-		return -ENOENT;
-
-	clone = alloc_engines(e->num_engines);
-	if (!clone)
-		goto err_unlock;
-
-	for (n = 0; n < e->num_engines; n++) {
-		struct intel_engine_cs *engine;
-
-		if (!e->engines[n]) {
-			clone->engines[n] = NULL;
-			continue;
-		}
-		engine = e->engines[n]->engine;
-
-		/*
-		 * Virtual engines are singletons; they can only exist
-		 * inside a single context, because they embed their
-		 * HW context... As each virtual context implies a single
-		 * timeline (each engine can only dequeue a single request
-		 * at any time), it would be surprising for two contexts
-		 * to use the same engine. So let's create a copy of
-		 * the virtual engine instead.
-		 */
-		if (intel_engine_is_virtual(engine))
-			clone->engines[n] =
-				intel_execlists_clone_virtual(engine);
-		else
-			clone->engines[n] = intel_context_create(engine);
-		if (IS_ERR_OR_NULL(clone->engines[n])) {
-			__free_engines(clone, n);
-			goto err_unlock;
-		}
-
-		intel_context_set_gem(clone->engines[n], dst);
-	}
-	clone->num_engines = n;
-	i915_sw_fence_complete(&e->fence);
-
-	/* Serialised by constructor */
-	engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
-	if (user_engines)
-		i915_gem_context_set_user_engines(dst);
-	else
-		i915_gem_context_clear_user_engines(dst);
-	return 0;
-
-err_unlock:
-	i915_sw_fence_complete(&e->fence);
-	return -ENOMEM;
-}
-
-static int clone_flags(struct i915_gem_context *dst,
-		       struct i915_gem_context *src)
-{
-	dst->user_flags = src->user_flags;
-	return 0;
-}
-
-static int clone_schedattr(struct i915_gem_context *dst,
-			   struct i915_gem_context *src)
-{
-	dst->sched = src->sched;
-	return 0;
-}
-
-static int clone_sseu(struct i915_gem_context *dst,
-		      struct i915_gem_context *src)
-{
-	struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
-	struct i915_gem_engines *clone;
-	unsigned long n;
-	int err;
-
-	/* no locking required; sole access under constructor*/
-	clone = __context_engines_static(dst);
-	if (e->num_engines != clone->num_engines) {
-		err = -EINVAL;
-		goto unlock;
-	}
-
-	for (n = 0; n < e->num_engines; n++) {
-		struct intel_context *ce = e->engines[n];
-
-		if (clone->engines[n]->engine->class != ce->engine->class) {
-			/* Must have compatible engine maps! */
-			err = -EINVAL;
-			goto unlock;
-		}
-
-		/* serialises with set_sseu */
-		err = intel_context_lock_pinned(ce);
-		if (err)
-			goto unlock;
-
-		clone->engines[n]->sseu = ce->sseu;
-		intel_context_unlock_pinned(ce);
-	}
-
-	err = 0;
-unlock:
-	i915_gem_context_unlock_engines(src);
-	return err;
-}
-
-static int clone_timeline(struct i915_gem_context *dst,
-			  struct i915_gem_context *src)
-{
-	if (src->timeline)
-		__assign_timeline(dst, src->timeline);
-
-	return 0;
-}
-
-static int clone_vm(struct i915_gem_context *dst,
-		    struct i915_gem_context *src)
-{
-	struct i915_address_space *vm;
-	int err = 0;
-
-	if (!rcu_access_pointer(src->vm))
-		return 0;
-
-	rcu_read_lock();
-	vm = context_get_vm_rcu(src);
-	rcu_read_unlock();
-
-	if (!mutex_lock_interruptible(&dst->mutex)) {
-		__assign_ppgtt(dst, vm);
-		mutex_unlock(&dst->mutex);
-	} else {
-		err = -EINTR;
-	}
-
-	i915_vm_put(vm);
-	return err;
-}
-
-static int create_clone(struct i915_user_extension __user *ext, void *data)
-{
-	static int (* const fn[])(struct i915_gem_context *dst,
-				  struct i915_gem_context *src) = {
-#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
-		MAP(ENGINES, clone_engines),
-		MAP(FLAGS, clone_flags),
-		MAP(SCHEDATTR, clone_schedattr),
-		MAP(SSEU, clone_sseu),
-		MAP(TIMELINE, clone_timeline),
-		MAP(VM, clone_vm),
-#undef MAP
-	};
-	struct drm_i915_gem_context_create_ext_clone local;
-	const struct create_ext *arg = data;
-	struct i915_gem_context *dst = arg->ctx;
-	struct i915_gem_context *src;
-	int err, bit;
-
-	if (copy_from_user(&local, ext, sizeof(local)))
-		return -EFAULT;
-
-	BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
-		     I915_CONTEXT_CLONE_UNKNOWN);
-
-	if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
-		return -EINVAL;
-
-	if (local.rsvd)
-		return -EINVAL;
-
-	rcu_read_lock();
-	src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
-	rcu_read_unlock();
-	if (!src)
-		return -ENOENT;
-
-	GEM_BUG_ON(src == dst);
-
-	for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
-		if (!(local.flags & BIT(bit)))
-			continue;
-
-		err = fn[bit](dst, src);
-		if (err)
-			return err;
-	}
-
-	return 0;
+	return -EINVAL;
 }
 
 static const i915_user_extension_fn create_extensions[] = {
 	[I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
-	[I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
+	[I915_CONTEXT_CREATE_EXT_CLONE] = invalid_ext,
 };
 
 static bool client_is_banned(struct drm_i915_file_private *file_priv)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a0aaa8298f28d..75a71b6756ed8 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1887,20 +1887,10 @@ struct drm_i915_gem_context_create_ext_setparam {
 	struct drm_i915_gem_context_param param;
 };
 
-struct drm_i915_gem_context_create_ext_clone {
+/* This API has been removed.  On the off chance someone somewhere has
+ * attempted to use it, never re-use this extension number.
+ */
 #define I915_CONTEXT_CREATE_EXT_CLONE 1
-	struct i915_user_extension base;
-	__u32 clone_id;
-	__u32 flags;
-#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
-#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
-#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
-#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
-#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
-#define I915_CONTEXT_CLONE_VM		(1u << 5)
-#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
-	__u64 rsvd;
-};
 
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel



* [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Matthew Brost, Jason Ekstrand

This API is entirely unnecessary and I'd love to get rid of it.  If
userspace wants a single timeline across multiple contexts, they can
either use implicit synchronization or a syncobj, both of which existed
at the time this feature landed.  The justification given at the time
was that it would help GL drivers which are inherently single-timeline.
However, neither of our GL drivers actually wanted the feature.  i965
was already in maintenance mode at the time and iris uses syncobj for
everything.

Unfortunately, as much as I'd love to get rid of it, it is used by the
media driver so we can't do that.  We can, however, do the next-best
thing which is to embed a syncobj in the context and do exactly what
we'd expect from userspace internally.  This isn't an entirely identical
implementation because it's no longer atomic if userspace races with
itself by calling execbuffer2 twice simultaneously from different
threads.  It won't crash in that case; it just doesn't guarantee any
ordering between those two submits.

Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
advantages beyond removing an annoyance.  One is that intel_timeline is
no longer an API-visible object and can remain entirely an
implementation detail.  This may be advantageous as we make scheduler
changes going forward.  The second is that, together with deleting the
CONTEXT_CLONE API, we should now have a 1:1 mapping between
intel_context and intel_timeline, which may help us reduce locking.
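The emulation described above — await the fence currently held by the context's syncobj, submit, then replace it with the new request's fence — can be sketched in miniature (plain C with no drm or i915 dependencies; `fake_fence`, `fake_ctx`, and `fake_submit` are illustrative stand-ins, not kernel APIs, and a fence is modeled as a bare sequence number):

```c
#include <assert.h>

/* Illustrative stand-in: a "fence" is just a submit sequence number. */
struct fake_fence { unsigned seqno; };

struct fake_ctx {
	struct fake_fence last;	/* plays the role of ctx->syncobj */
	unsigned next_seqno;
};

/*
 * Model of the emulated single timeline: every submit first awaits the
 * fence currently held by the context (the in-fence), then installs its
 * own completion fence (the out-fence).  Returns the seqno this submit
 * waited on, so callers can observe the chaining.
 */
static unsigned fake_submit(struct fake_ctx *ctx)
{
	unsigned waited_on = ctx->last.seqno;	/* drm_syncobj_fence_get() + await */

	ctx->last.seqno = ++ctx->next_seqno;	/* drm_syncobj_replace_fence() */
	return waited_on;
}
```

Each call is ordered after the previous one, mirroring what i915_gem_do_execbuffer() does in the patch below; two truly concurrent callers could read the same "last" fence before either replaces it, which is exactly the race the commit message chooses to accept.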

v2 (Jason Ekstrand):
 - Update the comment on i915_gem_context::syncobj to mention that it's
   an emulation and the possible race if userspace calls execbuffer2
   twice on the same context concurrently.
 - Wrap the checks for eb.gem_context->syncobj in unlikely()
 - Drop the dma_fence reference
 - Improved commit message

v3 (Jason Ekstrand):
 - Move the dma_fence_put() to before the error exit

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +++++--------------
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +++++-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 16 ++++++
 3 files changed, 40 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 2c2fefa912805..a72c9b256723b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -67,6 +67,8 @@
 #include <linux/log2.h>
 #include <linux/nospec.h>
 
+#include <drm/drm_syncobj.h>
+
 #include "gt/gen6_ppgtt.h"
 #include "gt/intel_context.h"
 #include "gt/intel_context_param.h"
@@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context *ce,
 		ce->vm = vm;
 	}
 
-	GEM_BUG_ON(ce->timeline);
-	if (ctx->timeline)
-		ce->timeline = intel_timeline_get(ctx->timeline);
-
 	if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
 	    intel_engine_has_timeslices(ce->engine))
 		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
@@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
 	mutex_destroy(&ctx->engines_mutex);
 	mutex_destroy(&ctx->lut_mutex);
 
-	if (ctx->timeline)
-		intel_timeline_put(ctx->timeline);
-
 	put_pid(ctx->pid);
 	mutex_destroy(&ctx->mutex);
 
@@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
 	if (vm)
 		i915_vm_close(vm);
 
+	if (ctx->syncobj)
+		drm_syncobj_put(ctx->syncobj);
+
 	ctx->file_priv = ERR_PTR(-EBADF);
 
 	/*
@@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
 		i915_vm_close(vm);
 }
 
-static void __set_timeline(struct intel_timeline **dst,
-			   struct intel_timeline *src)
-{
-	struct intel_timeline *old = *dst;
-
-	*dst = src ? intel_timeline_get(src) : NULL;
-
-	if (old)
-		intel_timeline_put(old);
-}
-
-static void __apply_timeline(struct intel_context *ce, void *timeline)
-{
-	__set_timeline(&ce->timeline, timeline);
-}
-
-static void __assign_timeline(struct i915_gem_context *ctx,
-			      struct intel_timeline *timeline)
-{
-	__set_timeline(&ctx->timeline, timeline);
-	context_apply_all(ctx, __apply_timeline, timeline);
-}
-
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 {
 	struct i915_gem_context *ctx;
+	int ret;
 
 	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
 	    !HAS_EXECLISTS(i915))
@@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 	}
 
 	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
-		struct intel_timeline *timeline;
-
-		timeline = intel_timeline_create(&i915->gt);
-		if (IS_ERR(timeline)) {
+		ret = drm_syncobj_create(&ctx->syncobj,
+					 DRM_SYNCOBJ_CREATE_SIGNALED,
+					 NULL);
+		if (ret) {
 			context_close(ctx);
-			return ERR_CAST(timeline);
+			return ERR_PTR(ret);
 		}
-
-		__assign_timeline(ctx, timeline);
-		intel_timeline_put(timeline);
 	}
 
 	trace_i915_context_create(ctx);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 676592e27e7d2..df76767f0c41b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -83,7 +83,19 @@ struct i915_gem_context {
 	struct i915_gem_engines __rcu *engines;
 	struct mutex engines_mutex; /* guards writes to engines */
 
-	struct intel_timeline *timeline;
+	/**
+	 * @syncobj: Shared timeline syncobj
+	 *
+	 * When the SINGLE_TIMELINE flag is set on context creation, we
+	 * emulate a single timeline across all engines using this syncobj.
+	 * For every execbuffer2 call, this syncobj is used as both an in-
+	 * and out-fence.  Unlike the real intel_timeline, this doesn't
+	 * provide perfect atomic in-order guarantees if the client races
+	 * with itself by calling execbuffer2 twice concurrently.  However,
+	 * if userspace races with itself, that's not likely to yield well-
+	 * defined results anyway so we choose to not care.
+	 */
+	struct drm_syncobj *syncobj;
 
 	/**
 	 * @vm: unique address space (GTT)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index b812f313422a9..d640bba6ad9ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		goto err_vma;
 	}
 
+	if (unlikely(eb.gem_context->syncobj)) {
+		struct dma_fence *fence;
+
+		fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
+		err = i915_request_await_dma_fence(eb.request, fence);
+		dma_fence_put(fence);
+		if (err)
+			goto err_ext;
+	}
+
 	if (in_fence) {
 		if (args->flags & I915_EXEC_FENCE_SUBMIT)
 			err = i915_request_await_execution(eb.request,
@@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			fput(out_fence->file);
 		}
 	}
+
+	if (unlikely(eb.gem_context->syncobj)) {
+		drm_syncobj_replace_fence(eb.gem_context->syncobj,
+					  &eb.request->fence);
+	}
+
 	i915_request_put(eb.request);
 
 err_vma:
-- 
2.31.1



* [PATCH 07/21] drm/i915: Drop getparam support for I915_CONTEXT_PARAM_ENGINES
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

This has never been used by any userspace except IGT and provides no
real functionality beyond parroting back parameters userspace passed in
as part of context creation or via setparam.  If the context is in
legacy mode (where you use I915_EXEC_RENDER and friends), it returns
success with zero data so it's not useful for discovering what engines
are in the context.  It's also not a replacement for the recently
removed I915_CONTEXT_CLONE_ENGINES because it doesn't return any of the
balancing or bonding information.
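The get_engines() being removed below implemented the usual two-step size negotiation for variable-length getparam data: call once with size == 0 to learn the required buffer size, then call again with a buffer at least that large. A minimal model of that handshake (plain C; `query_engines` is an illustrative stand-in where "engines" are just bytes, not the kernel function):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAKE_EINVAL 22	/* stand-in for the kernel's -EINVAL */

/*
 * Two-step size negotiation: *size == 0 reports the required size and
 * succeeds; a nonzero *size smaller than required fails; otherwise the
 * data is copied and *size is set to the amount written.
 */
static int query_engines(const char *engines, size_t count,
			 char *buf, size_t *size)
{
	if (*size == 0) {		/* first call: report required size */
		*size = count;
		return 0;
	}
	if (*size < count)		/* buffer too small */
		return -FAKE_EINVAL;
	memcpy(buf, engines, count);
	*size = count;
	return 0;
}
```

The same protocol survives elsewhere in the i915 getparam interface; this patch only removes the ENGINES instance of it.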

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 77 +--------------------
 1 file changed, 1 insertion(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index a72c9b256723b..e8179918fa306 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1725,78 +1725,6 @@ set_engines(struct i915_gem_context *ctx,
 	return 0;
 }
 
-static int
-get_engines(struct i915_gem_context *ctx,
-	    struct drm_i915_gem_context_param *args)
-{
-	struct i915_context_param_engines __user *user;
-	struct i915_gem_engines *e;
-	size_t n, count, size;
-	bool user_engines;
-	int err = 0;
-
-	e = __context_engines_await(ctx, &user_engines);
-	if (!e)
-		return -ENOENT;
-
-	if (!user_engines) {
-		i915_sw_fence_complete(&e->fence);
-		args->size = 0;
-		return 0;
-	}
-
-	count = e->num_engines;
-
-	/* Be paranoid in case we have an impedance mismatch */
-	if (!check_struct_size(user, engines, count, &size)) {
-		err = -EINVAL;
-		goto err_free;
-	}
-	if (overflows_type(size, args->size)) {
-		err = -EINVAL;
-		goto err_free;
-	}
-
-	if (!args->size) {
-		args->size = size;
-		goto err_free;
-	}
-
-	if (args->size < size) {
-		err = -EINVAL;
-		goto err_free;
-	}
-
-	user = u64_to_user_ptr(args->value);
-	if (put_user(0, &user->extensions)) {
-		err = -EFAULT;
-		goto err_free;
-	}
-
-	for (n = 0; n < count; n++) {
-		struct i915_engine_class_instance ci = {
-			.engine_class = I915_ENGINE_CLASS_INVALID,
-			.engine_instance = I915_ENGINE_CLASS_INVALID_NONE,
-		};
-
-		if (e->engines[n]) {
-			ci.engine_class = e->engines[n]->engine->uabi_class;
-			ci.engine_instance = e->engines[n]->engine->uabi_instance;
-		}
-
-		if (copy_to_user(&user->engines[n], &ci, sizeof(ci))) {
-			err = -EFAULT;
-			goto err_free;
-		}
-	}
-
-	args->size = size;
-
-err_free:
-	i915_sw_fence_complete(&e->fence);
-	return err;
-}
-
 static int
 set_persistence(struct i915_gem_context *ctx,
 		const struct drm_i915_gem_context_param *args)
@@ -2127,10 +2055,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		ret = get_ppgtt(file_priv, ctx, args);
 		break;
 
-	case I915_CONTEXT_PARAM_ENGINES:
-		ret = get_engines(ctx, args);
-		break;
-
 	case I915_CONTEXT_PARAM_PERSISTENCE:
 		args->size = 0;
 		args->value = i915_gem_context_is_persistent(ctx);
@@ -2138,6 +2062,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 
 	case I915_CONTEXT_PARAM_NO_ZEROMAP:
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
+	case I915_CONTEXT_PARAM_ENGINES:
 	case I915_CONTEXT_PARAM_RINGSIZE:
 	default:
 		ret = -EINVAL;
-- 
2.31.1


* [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

This adds a bunch of complexity which the media driver has never
actually used.  The media driver does technically bond a balanced engine
to another engine but the balanced engine only has one engine in the
sibling set.  This doesn't actually result in a virtual engine.

Unless some userspace badly wants it, there's no good reason to support
this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
leave the validation code in place in case we ever decide we want to do
something interesting with the bonding information.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
 .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
 .../drm/i915/gt/intel_execlists_submission.h  |   4 -
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
 6 files changed, 7 insertions(+), 353 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e8179918fa306..5f8d0faf783aa 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
 	}
 	virtual = set->engines->engines[idx]->engine;
 
+	if (intel_engine_is_virtual(virtual)) {
+		drm_dbg(&i915->drm,
+			"Bonding with virtual engines not allowed\n");
+		return -EINVAL;
+	}
+
 	err = check_user_mbz(&ext->flags);
 	if (err)
 		return err;
@@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
 				n, ci.engine_class, ci.engine_instance);
 			return -EINVAL;
 		}
-
-		/*
-		 * A non-virtual engine has no siblings to choose between; and
-		 * a submit fence will always be directed to the one engine.
-		 */
-		if (intel_engine_is_virtual(virtual)) {
-			err = intel_virtual_engine_attach_bond(virtual,
-							       master,
-							       bond);
-			if (err)
-				return err;
-		}
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d640bba6ad9ab..efb2fa3522a42 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		if (args->flags & I915_EXEC_FENCE_SUBMIT)
 			err = i915_request_await_execution(eb.request,
 							   in_fence,
-							   eb.engine->bond_execute);
+							   NULL);
 		else
 			err = i915_request_await_dma_fence(eb.request,
 							   in_fence);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 883bafc449024..68cfe5080325c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -446,13 +446,6 @@ struct intel_engine_cs {
 	 */
 	void		(*submit_request)(struct i915_request *rq);
 
-	/*
-	 * Called on signaling of a SUBMIT_FENCE, passing along the signaling
-	 * request down to the bonded pairs.
-	 */
-	void            (*bond_execute)(struct i915_request *rq,
-					struct dma_fence *signal);
-
 	/*
 	 * Call when the priority on a request has changed and it and its
 	 * dependencies may need rescheduling. Note the request itself may
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index de124870af44d..b6e2b59f133b7 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -181,18 +181,6 @@ struct virtual_engine {
 		int prio;
 	} nodes[I915_NUM_ENGINES];
 
-	/*
-	 * Keep track of bonded pairs -- restrictions upon on our selection
-	 * of physical engines any particular request may be submitted to.
-	 * If we receive a submit-fence from a master engine, we will only
-	 * use one of sibling_mask physical engines.
-	 */
-	struct ve_bond {
-		const struct intel_engine_cs *master;
-		intel_engine_mask_t sibling_mask;
-	} *bonds;
-	unsigned int num_bonds;
-
 	/* And finally, which physical engines this virtual engine maps onto. */
 	unsigned int num_siblings;
 	struct intel_engine_cs *siblings[];
@@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
 	intel_breadcrumbs_free(ve->base.breadcrumbs);
 	intel_engine_free_request_pool(&ve->base);
 
-	kfree(ve->bonds);
 	kfree(ve);
 }
 
@@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
 	spin_unlock_irqrestore(&ve->base.active.lock, flags);
 }
 
-static struct ve_bond *
-virtual_find_bond(struct virtual_engine *ve,
-		  const struct intel_engine_cs *master)
-{
-	int i;
-
-	for (i = 0; i < ve->num_bonds; i++) {
-		if (ve->bonds[i].master == master)
-			return &ve->bonds[i];
-	}
-
-	return NULL;
-}
-
-static void
-virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
-{
-	struct virtual_engine *ve = to_virtual_engine(rq->engine);
-	intel_engine_mask_t allowed, exec;
-	struct ve_bond *bond;
-
-	allowed = ~to_request(signal)->engine->mask;
-
-	bond = virtual_find_bond(ve, to_request(signal)->engine);
-	if (bond)
-		allowed &= bond->sibling_mask;
-
-	/* Restrict the bonded request to run on only the available engines */
-	exec = READ_ONCE(rq->execution_mask);
-	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
-		;
-
-	/* Prevent the master from being re-run on the bonded engines */
-	to_request(signal)->execution_mask &= ~allowed;
-}
-
 struct intel_context *
 intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 			       unsigned int count)
@@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 
 	ve->base.schedule = i915_schedule;
 	ve->base.submit_request = virtual_submit_request;
-	ve->base.bond_execute = virtual_bond_execute;
 
 	INIT_LIST_HEAD(virtual_queue(ve));
 	ve->base.execlists.queue_priority_hint = INT_MIN;
@@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
 	if (IS_ERR(dst))
 		return dst;
 
-	if (se->num_bonds) {
-		struct virtual_engine *de = to_virtual_engine(dst->engine);
-
-		de->bonds = kmemdup(se->bonds,
-				    sizeof(*se->bonds) * se->num_bonds,
-				    GFP_KERNEL);
-		if (!de->bonds) {
-			intel_context_put(dst);
-			return ERR_PTR(-ENOMEM);
-		}
-
-		de->num_bonds = se->num_bonds;
-	}
-
 	return dst;
 }
 
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
-				     const struct intel_engine_cs *master,
-				     const struct intel_engine_cs *sibling)
-{
-	struct virtual_engine *ve = to_virtual_engine(engine);
-	struct ve_bond *bond;
-	int n;
-
-	/* Sanity check the sibling is part of the virtual engine */
-	for (n = 0; n < ve->num_siblings; n++)
-		if (sibling == ve->siblings[n])
-			break;
-	if (n == ve->num_siblings)
-		return -EINVAL;
-
-	bond = virtual_find_bond(ve, master);
-	if (bond) {
-		bond->sibling_mask |= sibling->mask;
-		return 0;
-	}
-
-	bond = krealloc(ve->bonds,
-			sizeof(*bond) * (ve->num_bonds + 1),
-			GFP_KERNEL);
-	if (!bond)
-		return -ENOMEM;
-
-	bond[ve->num_bonds].master = master;
-	bond[ve->num_bonds].sibling_mask = sibling->mask;
-
-	ve->bonds = bond;
-	ve->num_bonds++;
-
-	return 0;
-}
-
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
index fd61dae820e9e..80cec37a56ba9 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
@@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 struct intel_context *
 intel_execlists_clone_virtual(struct intel_engine_cs *src);
 
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
-				     const struct intel_engine_cs *master,
-				     const struct intel_engine_cs *sibling);
-
 bool
 intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 1081cd36a2bd3..f03446d587160 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
 	return 0;
 }
 
-static int bond_virtual_engine(struct intel_gt *gt,
-			       unsigned int class,
-			       struct intel_engine_cs **siblings,
-			       unsigned int nsibling,
-			       unsigned int flags)
-#define BOND_SCHEDULE BIT(0)
-{
-	struct intel_engine_cs *master;
-	struct i915_request *rq[16];
-	enum intel_engine_id id;
-	struct igt_spinner spin;
-	unsigned long n;
-	int err;
-
-	/*
-	 * A set of bonded requests is intended to be run concurrently
-	 * across a number of engines. We use one request per-engine
-	 * and a magic fence to schedule each of the bonded requests
-	 * at the same time. A consequence of our current scheduler is that
-	 * we only move requests to the HW ready queue when the request
-	 * becomes ready, that is when all of its prerequisite fences have
-	 * been signaled. As one of those fences is the master submit fence,
-	 * there is a delay on all secondary fences as the HW may be
-	 * currently busy. Equally, as all the requests are independent,
-	 * they may have other fences that delay individual request
-	 * submission to HW. Ergo, we do not guarantee that all requests are
-	 * immediately submitted to HW at the same time, just that if the
-	 * rules are abided by, they are ready at the same time as the
-	 * first is submitted. Userspace can embed semaphores in its batch
-	 * to ensure parallel execution of its phases as it requires.
-	 * Though naturally it gets requested that perhaps the scheduler should
-	 * take care of parallel execution, even across preemption events on
-	 * different HW. (The proper answer is of course "lalalala".)
-	 *
-	 * With the submit-fence, we have identified three possible phases
-	 * of synchronisation depending on the master fence: queued (not
-	 * ready), executing, and signaled. The first two are quite simple
-	 * and checked below. However, the signaled master fence handling is
-	 * contentious. Currently we do not distinguish between a signaled
-	 * fence and an expired fence, as once signaled it does not convey
-	 * any information about the previous execution. It may even be freed
-	 * and hence checking later it may not exist at all. Ergo we currently
-	 * do not apply the bonding constraint for an already signaled fence,
-	 * as our expectation is that it should not constrain the secondaries
-	 * and is outside of the scope of the bonded request API (i.e. all
-	 * userspace requests are meant to be running in parallel). As
-	 * it imposes no constraint, and is effectively a no-op, we do not
-	 * check below as normal execution flows are checked extensively above.
-	 *
-	 * XXX Is the degenerate handling of signaled submit fences the
-	 * expected behaviour for userpace?
-	 */
-
-	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
-
-	if (igt_spinner_init(&spin, gt))
-		return -ENOMEM;
-
-	err = 0;
-	rq[0] = ERR_PTR(-ENOMEM);
-	for_each_engine(master, gt, id) {
-		struct i915_sw_fence fence = {};
-		struct intel_context *ce;
-
-		if (master->class == class)
-			continue;
-
-		ce = intel_context_create(master);
-		if (IS_ERR(ce)) {
-			err = PTR_ERR(ce);
-			goto out;
-		}
-
-		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
-
-		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
-		intel_context_put(ce);
-		if (IS_ERR(rq[0])) {
-			err = PTR_ERR(rq[0]);
-			goto out;
-		}
-		i915_request_get(rq[0]);
-
-		if (flags & BOND_SCHEDULE) {
-			onstack_fence_init(&fence);
-			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
-							       &fence,
-							       GFP_KERNEL);
-		}
-
-		i915_request_add(rq[0]);
-		if (err < 0)
-			goto out;
-
-		if (!(flags & BOND_SCHEDULE) &&
-		    !igt_wait_for_spinner(&spin, rq[0])) {
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			struct intel_context *ve;
-
-			ve = intel_execlists_create_virtual(siblings, nsibling);
-			if (IS_ERR(ve)) {
-				err = PTR_ERR(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_virtual_engine_attach_bond(ve->engine,
-							       master,
-							       siblings[n]);
-			if (err) {
-				intel_context_put(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_context_pin(ve);
-			intel_context_put(ve);
-			if (err) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			rq[n + 1] = i915_request_create(ve);
-			intel_context_unpin(ve);
-			if (IS_ERR(rq[n + 1])) {
-				err = PTR_ERR(rq[n + 1]);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-			i915_request_get(rq[n + 1]);
-
-			err = i915_request_await_execution(rq[n + 1],
-							   &rq[0]->fence,
-							   ve->engine->bond_execute);
-			i915_request_add(rq[n + 1]);
-			if (err < 0) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-		}
-		onstack_fence_fini(&fence);
-		intel_engine_flush_submission(master);
-		igt_spinner_end(&spin);
-
-		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
-			pr_err("Master request did not execute (on %s)!\n",
-			       rq[0]->engine->name);
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			if (i915_request_wait(rq[n + 1], 0,
-					      MAX_SCHEDULE_TIMEOUT) < 0) {
-				err = -EIO;
-				goto out;
-			}
-
-			if (rq[n + 1]->engine != siblings[n]) {
-				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
-				       siblings[n]->name,
-				       rq[n + 1]->engine->name,
-				       rq[0]->engine->name);
-				err = -EINVAL;
-				goto out;
-			}
-		}
-
-		for (n = 0; !IS_ERR(rq[n]); n++)
-			i915_request_put(rq[n]);
-		rq[0] = ERR_PTR(-ENOMEM);
-	}
-
-out:
-	for (n = 0; !IS_ERR(rq[n]); n++)
-		i915_request_put(rq[n]);
-	if (igt_flush_test(gt->i915))
-		err = -EIO;
-
-	igt_spinner_fini(&spin);
-	return err;
-}
-
-static int live_virtual_bond(void *arg)
-{
-	static const struct phase {
-		const char *name;
-		unsigned int flags;
-	} phases[] = {
-		{ "", 0 },
-		{ "schedule", BOND_SCHEDULE },
-		{ },
-	};
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	unsigned int class;
-	int err;
-
-	if (intel_uc_uses_guc_submission(&gt->uc))
-		return 0;
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		const struct phase *p;
-		int nsibling;
-
-		nsibling = select_siblings(gt, class, siblings);
-		if (nsibling < 2)
-			continue;
-
-		for (p = phases; p->name; p++) {
-			err = bond_virtual_engine(gt,
-						  class, siblings, nsibling,
-						  p->flags);
-			if (err) {
-				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
-				       __func__, p->name, class, nsibling, err);
-				return err;
-			}
-		}
-	}
-
-	return 0;
-}
-
 static int reset_virtual_engine(struct intel_gt *gt,
 				struct intel_engine_cs **siblings,
 				unsigned int nsibling)
@@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_virtual_mask),
 		SUBTEST(live_virtual_preserved),
 		SUBTEST(live_virtual_slice),
-		SUBTEST(live_virtual_bond),
 		SUBTEST(live_virtual_reset),
 	};
 
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 226+ messages in thread
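The core of the removed virtual_bond_execute() is a small mask computation: when the master's submit fence signals, the bonded request's execution mask is narrowed to the allowed siblings (excluding the engine the master ran on), and the master is prevented from being re-run on those engines. A Python sketch of that mask arithmetic, with engine masks as plain bit sets (this simplifies the real code, which atomically updates rq->execution_mask with try_cmpxchg):

```python
ALL_ENGINES = 0xff  # illustrative width of the engine mask

def bond_execute(rq_mask, master_engine_mask, sibling_mask=None):
    """Model of the removed virtual_bond_execute() mask logic.

    rq_mask:            engines the bonded request may currently run on
    master_engine_mask: bit of the engine the signalling (master) request used
    sibling_mask:       optional per-master bond restriction, or None

    Returns (new_rq_mask, new_master_mask).
    """
    # Never run the bonded request on the master's own engine.
    allowed = ~master_engine_mask & ALL_ENGINES
    if sibling_mask is not None:
        # A bond was registered for this master: restrict to its siblings.
        allowed &= sibling_mask
    # Restrict the bonded request to the available engines.
    new_rq = rq_mask & allowed
    # Prevent the master from being re-run on the bonded engines.
    new_master = master_engine_mask & ~allowed
    return new_rq, new_master

# Master ran on engine 0; the bond restricts the sibling to engines 1-2.
rq, master = bond_execute(rq_mask=0b1110, master_engine_mask=0b0001,
                          sibling_mask=0b0110)
assert rq == 0b0110      # bonded request limited to engines 1 and 2
assert master == 0b0001  # master's engine was never in `allowed`
```

With the media driver's single-sibling "virtual" engines there is only ever one bit to choose from, which is why the whole mechanism reduces to a no-op.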

-		{ "", 0 },
-		{ "schedule", BOND_SCHEDULE },
-		{ },
-	};
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	unsigned int class;
-	int err;
-
-	if (intel_uc_uses_guc_submission(&gt->uc))
-		return 0;
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		const struct phase *p;
-		int nsibling;
-
-		nsibling = select_siblings(gt, class, siblings);
-		if (nsibling < 2)
-			continue;
-
-		for (p = phases; p->name; p++) {
-			err = bond_virtual_engine(gt,
-						  class, siblings, nsibling,
-						  p->flags);
-			if (err) {
-				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
-				       __func__, p->name, class, nsibling, err);
-				return err;
-			}
-		}
-	}
-
-	return 0;
-}
-
 static int reset_virtual_engine(struct intel_gt *gt,
 				struct intel_engine_cs **siblings,
 				unsigned int nsibling)
@@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_virtual_mask),
 		SUBTEST(live_virtual_preserved),
 		SUBTEST(live_virtual_slice),
-		SUBTEST(live_virtual_bond),
 		SUBTEST(live_virtual_reset),
 	};
 
-- 
2.31.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

There's no sense in allowing userspace to create more engines than it
can possibly access via execbuf.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5f8d0faf783aa..ecb3bf5369857 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
 		return -EINVAL;
 	}
 
-	/*
-	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
-	 * first 64 engines defined here.
-	 */
 	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
+	if (num_engines > I915_EXEC_RING_MASK + 1)
+		return -EINVAL;
+
 	set.engines = alloc_engines(num_engines);
 	if (!set.engines)
 		return -ENOMEM;
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [PATCH 10/21] drm/i915/request: Remove the hook from await_execution
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

This was only ever used for bonded virtual engine execution.  Since
that's no longer allowed, this is dead code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  3 +-
 drivers/gpu/drm/i915/i915_request.c           | 42 ++++---------------
 drivers/gpu/drm/i915/i915_request.h           |  4 +-
 3 files changed, 9 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index efb2fa3522a42..7024adcd5cf15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3473,8 +3473,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (in_fence) {
 		if (args->flags & I915_EXEC_FENCE_SUBMIT)
 			err = i915_request_await_execution(eb.request,
-							   in_fence,
-							   NULL);
+							   in_fence);
 		else
 			err = i915_request_await_dma_fence(eb.request,
 							   in_fence);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index bec9c3652188b..7e00218b8c105 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -49,7 +49,6 @@
 struct execute_cb {
 	struct irq_work work;
 	struct i915_sw_fence *fence;
-	void (*hook)(struct i915_request *rq, struct dma_fence *signal);
 	struct i915_request *signal;
 };
 
@@ -180,17 +179,6 @@ static void irq_execute_cb(struct irq_work *wrk)
 	kmem_cache_free(global.slab_execute_cbs, cb);
 }
 
-static void irq_execute_cb_hook(struct irq_work *wrk)
-{
-	struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
-
-	cb->hook(container_of(cb->fence, struct i915_request, submit),
-		 &cb->signal->fence);
-	i915_request_put(cb->signal);
-
-	irq_execute_cb(wrk);
-}
-
 static __always_inline void
 __notify_execute_cb(struct i915_request *rq, bool (*fn)(struct irq_work *wrk))
 {
@@ -517,17 +505,12 @@ static bool __request_in_flight(const struct i915_request *signal)
 static int
 __await_execution(struct i915_request *rq,
 		  struct i915_request *signal,
-		  void (*hook)(struct i915_request *rq,
-			       struct dma_fence *signal),
 		  gfp_t gfp)
 {
 	struct execute_cb *cb;
 
-	if (i915_request_is_active(signal)) {
-		if (hook)
-			hook(rq, &signal->fence);
+	if (i915_request_is_active(signal))
 		return 0;
-	}
 
 	cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
 	if (!cb)
@@ -537,12 +520,6 @@ __await_execution(struct i915_request *rq,
 	i915_sw_fence_await(cb->fence);
 	init_irq_work(&cb->work, irq_execute_cb);
 
-	if (hook) {
-		cb->hook = hook;
-		cb->signal = i915_request_get(signal);
-		cb->work.func = irq_execute_cb_hook;
-	}
-
 	/*
 	 * Register the callback first, then see if the signaler is already
 	 * active. This ensures that if we race with the
@@ -1253,7 +1230,7 @@ emit_semaphore_wait(struct i915_request *to,
 		goto await_fence;
 
 	/* Only submit our spinner after the signaler is running! */
-	if (__await_execution(to, from, NULL, gfp))
+	if (__await_execution(to, from, gfp))
 		goto await_fence;
 
 	if (__emit_semaphore_wait(to, from, from->fence.seqno))
@@ -1284,16 +1261,14 @@ static int intel_timeline_sync_set_start(struct intel_timeline *tl,
 
 static int
 __i915_request_await_execution(struct i915_request *to,
-			       struct i915_request *from,
-			       void (*hook)(struct i915_request *rq,
-					    struct dma_fence *signal))
+			       struct i915_request *from)
 {
 	int err;
 
 	GEM_BUG_ON(intel_context_is_barrier(from->context));
 
 	/* Submit both requests at the same time */
-	err = __await_execution(to, from, hook, I915_FENCE_GFP);
+	err = __await_execution(to, from, I915_FENCE_GFP);
 	if (err)
 		return err;
 
@@ -1406,9 +1381,7 @@ i915_request_await_external(struct i915_request *rq, struct dma_fence *fence)
 
 int
 i915_request_await_execution(struct i915_request *rq,
-			     struct dma_fence *fence,
-			     void (*hook)(struct i915_request *rq,
-					  struct dma_fence *signal))
+			     struct dma_fence *fence)
 {
 	struct dma_fence **child = &fence;
 	unsigned int nchild = 1;
@@ -1441,8 +1414,7 @@ i915_request_await_execution(struct i915_request *rq,
 
 		if (dma_fence_is_i915(fence))
 			ret = __i915_request_await_execution(rq,
-							     to_request(fence),
-							     hook);
+							     to_request(fence));
 		else
 			ret = i915_request_await_external(rq, fence);
 		if (ret < 0)
@@ -1468,7 +1440,7 @@ await_request_submit(struct i915_request *to, struct i915_request *from)
 							&from->submit,
 							I915_FENCE_GFP);
 	else
-		return __i915_request_await_execution(to, from, NULL);
+		return __i915_request_await_execution(to, from);
 }
 
 static int
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 270f6cd37650c..63b087a7f5707 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -352,9 +352,7 @@ int i915_request_await_object(struct i915_request *to,
 int i915_request_await_dma_fence(struct i915_request *rq,
 				 struct dma_fence *fence);
 int i915_request_await_execution(struct i915_request *rq,
-				 struct dma_fence *fence,
-				 void (*hook)(struct i915_request *rq,
-					      struct dma_fence *signal));
+				 struct dma_fence *fence);
 
 void i915_request_add(struct i915_request *rq);
 
-- 
2.31.1

* [PATCH 11/21] drm/i915: Stop manually RCU banging in reset_stats_ioctl
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

As far as I can tell, the only real reason for this is to avoid taking a
reference to the i915_gem_context.  The cost of those two atomics
probably pales in comparison to the cost of the ioctl itself so we're
really not buying ourselves anything here.  We're about to make context
lookup a tiny bit more complicated, so let's get rid of the one hand-
rolled case.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++---------
 drivers/gpu/drm/i915/i915_drv.h             |  8 +-------
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index ecb3bf5369857..941fbf78267b4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_reset_stats *args = data;
 	struct i915_gem_context *ctx;
-	int ret;
 
 	if (args->flags || args->pad)
 		return -EINVAL;
 
-	ret = -ENOENT;
-	rcu_read_lock();
-	ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id);
+	ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
 	if (!ctx)
-		goto out;
+		return -ENOENT;
 
 	/*
 	 * We opt for unserialised reads here. This may result in tearing
@@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 	args->batch_active = atomic_read(&ctx->guilty_count);
 	args->batch_pending = atomic_read(&ctx->active_count);
 
-	ret = 0;
-out:
-	rcu_read_unlock();
-	return ret;
+	i915_gem_context_put(ctx);
+	return 0;
 }
 
 /* GEM context-engines iterator: for_each_gem_engine() */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0b44333eb7033..8571c5c1509a7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 
 struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
 
-static inline struct i915_gem_context *
-__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id)
-{
-	return xa_load(&file_priv->context_xa, id);
-}
-
 static inline struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
 	struct i915_gem_context *ctx;
 
 	rcu_read_lock();
-	ctx = __i915_gem_context_lookup_rcu(file_priv, id);
+	ctx = xa_load(&file_priv->context_xa, id);
 	if (ctx && !kref_get_unless_zero(&ctx->ref))
 		ctx = NULL;
 	rcu_read_unlock();
-- 
2.31.1

* [PATCH 12/21] drm/i915/gem: Add a separate validate_priority helper
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 42 +++++++++++++--------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 941fbf78267b4..e5efd22c89ba2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -169,6 +169,28 @@ lookup_user_engine(struct i915_gem_context *ctx,
 	return i915_gem_context_get_engine(ctx, idx);
 }
 
+static int validate_priority(struct drm_i915_private *i915,
+			     const struct drm_i915_gem_context_param *args)
+{
+	s64 priority = args->value;
+
+	if (args->size)
+		return -EINVAL;
+
+	if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
+		return -ENODEV;
+
+	if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
+	    priority < I915_CONTEXT_MIN_USER_PRIORITY)
+		return -EINVAL;
+
+	if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
+	    !capable(CAP_SYS_NICE))
+		return -EPERM;
+
+	return 0;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context *ctx)
 {
@@ -1744,23 +1766,13 @@ static void __apply_priority(struct intel_context *ce, void *arg)
 static int set_priority(struct i915_gem_context *ctx,
 			const struct drm_i915_gem_context_param *args)
 {
-	s64 priority = args->value;
-
-	if (args->size)
-		return -EINVAL;
-
-	if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
-		return -ENODEV;
-
-	if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
-	    priority < I915_CONTEXT_MIN_USER_PRIORITY)
-		return -EINVAL;
+	int err;
 
-	if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
-	    !capable(CAP_SYS_NICE))
-		return -EPERM;
+	err = validate_priority(ctx->i915, args);
+	if (err)
+		return err;
 
-	ctx->sched.priority = priority;
+	ctx->sched.priority = args->value;
 	context_apply_all(ctx, __apply_priority, ctx);
 
 	return 0;
-- 
2.31.1


* [PATCH 13/21] drm/i915/gem: Add an intermediate proto_context struct
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 143 ++++++++++++++----
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  21 +++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  16 +-
 3 files changed, 150 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e5efd22c89ba2..3e883daab93bf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -191,6 +191,95 @@ static int validate_priority(struct drm_i915_private *i915,
 	return 0;
 }
 
+static void proto_context_close(struct i915_gem_proto_context *pc)
+{
+	if (pc->vm)
+		i915_vm_put(pc->vm);
+	kfree(pc);
+}
+
+static int proto_context_set_persistence(struct drm_i915_private *i915,
+					 struct i915_gem_proto_context *pc,
+					 bool persist)
+{
+	if (test_bit(UCONTEXT_PERSISTENCE, &pc->user_flags) == persist)
+		return 0;
+
+	if (persist) {
+		/*
+		 * Only contexts that are short-lived [that will expire or be
+		 * reset] are allowed to survive past termination. We require
+		 * hangcheck to ensure that the persistent requests are healthy.
+		 */
+		if (!i915->params.enable_hangcheck)
+			return -EINVAL;
+
+		set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
+	} else {
+		/* To cancel a context we use "preempt-to-idle" */
+		if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+			return -ENODEV;
+
+		/*
+		 * If the cancel fails, we then need to reset, cleanly!
+		 *
+		 * If the per-engine reset fails, all hope is lost! We resort
+		 * to a full GPU reset in that unlikely case, but realistically
+		 * if the engine could not reset, the full reset does not fare
+		 * much better. The damage has been done.
+		 *
+		 * However, if we cannot reset an engine by itself, we cannot
+		 * cleanup a hanging persistent context without causing
+		 * colateral damage, and we should not pretend we can by
+		 * exposing the interface.
+		 */
+		if (!intel_has_reset_engine(&i915->gt))
+			return -ENODEV;
+
+		clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
+	}
+
+	return 0;
+}
+
+static struct i915_gem_proto_context *
+proto_context_create(struct drm_i915_private *i915, unsigned int flags)
+{
+	struct i915_gem_proto_context *pc;
+
+	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
+	    !HAS_EXECLISTS(i915))
+		return ERR_PTR(-EINVAL);
+
+	pc = kzalloc(sizeof(*pc), GFP_KERNEL);
+	if (!pc)
+		return ERR_PTR(-ENOMEM);
+
+	if (HAS_FULL_PPGTT(i915)) {
+		struct i915_ppgtt *ppgtt;
+
+		ppgtt = i915_ppgtt_create(&i915->gt);
+		if (IS_ERR(ppgtt)) {
+			drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
+				PTR_ERR(ppgtt));
+			proto_context_close(pc);
+			return ERR_CAST(ppgtt);
+		}
+		pc->vm = &ppgtt->vm;
+	}
+
+	pc->user_flags = 0;
+	set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
+	set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
+	proto_context_set_persistence(i915, pc, true);
+	pc->sched.priority = I915_PRIORITY_NORMAL;
+
+	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
+		pc->single_timeline = true;
+
+	return pc;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context *ctx)
 {
@@ -660,7 +749,8 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
 }
 
 static struct i915_gem_context *
-__create_context(struct drm_i915_private *i915)
+__create_context(struct drm_i915_private *i915,
+		 const struct i915_gem_proto_context *pc)
 {
 	struct i915_gem_context *ctx;
 	struct i915_gem_engines *e;
@@ -673,7 +763,7 @@ __create_context(struct drm_i915_private *i915)
 
 	kref_init(&ctx->ref);
 	ctx->i915 = i915;
-	ctx->sched.priority = I915_PRIORITY_NORMAL;
+	ctx->sched = pc->sched;
 	mutex_init(&ctx->mutex);
 	INIT_LIST_HEAD(&ctx->link);
 
@@ -696,9 +786,7 @@ __create_context(struct drm_i915_private *i915)
 	 * is no remap info, it will be a NOP. */
 	ctx->remap_slice = ALL_L3_SLICES(i915);
 
-	i915_gem_context_set_bannable(ctx);
-	i915_gem_context_set_recoverable(ctx);
-	__context_set_persistence(ctx, true /* cgroup hook? */);
+	ctx->user_flags = pc->user_flags;
 
 	for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
 		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
@@ -786,38 +874,23 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
 }
 
 static struct i915_gem_context *
-i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
+i915_gem_create_context(struct drm_i915_private *i915,
+			const struct i915_gem_proto_context *pc)
 {
 	struct i915_gem_context *ctx;
 	int ret;
 
-	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
-	    !HAS_EXECLISTS(i915))
-		return ERR_PTR(-EINVAL);
-
-	ctx = __create_context(i915);
+	ctx = __create_context(i915, pc);
 	if (IS_ERR(ctx))
 		return ctx;
 
-	if (HAS_FULL_PPGTT(i915)) {
-		struct i915_ppgtt *ppgtt;
-
-		ppgtt = i915_ppgtt_create(&i915->gt);
-		if (IS_ERR(ppgtt)) {
-			drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
-				PTR_ERR(ppgtt));
-			context_close(ctx);
-			return ERR_CAST(ppgtt);
-		}
-
+	if (pc->vm) {
 		mutex_lock(&ctx->mutex);
-		__assign_ppgtt(ctx, &ppgtt->vm);
+		__assign_ppgtt(ctx, pc->vm);
 		mutex_unlock(&ctx->mutex);
-
-		i915_vm_put(&ppgtt->vm);
 	}
 
-	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
+	if (pc->single_timeline) {
 		ret = drm_syncobj_create(&ctx->syncobj,
 					 DRM_SYNCOBJ_CREATE_SIGNALED,
 					 NULL);
@@ -883,6 +956,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 			  struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 	int err;
 	u32 id;
@@ -892,7 +966,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	/* 0 reserved for invalid/unassigned ppgtt */
 	xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
 
-	ctx = i915_gem_create_context(i915, 0);
+	pc = proto_context_create(i915, 0);
+	if (IS_ERR(pc)) {
+		err = PTR_ERR(pc);
+		goto err;
+	}
+
+	ctx = i915_gem_create_context(i915, pc);
+	proto_context_close(pc);
 	if (IS_ERR(ctx)) {
 		err = PTR_ERR(ctx);
 		goto err;
@@ -1884,6 +1965,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_context_create_ext *args = data;
+	struct i915_gem_proto_context *pc;
 	struct create_ext ext_data;
 	int ret;
 	u32 id;
@@ -1906,7 +1988,12 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}
 
-	ext_data.ctx = i915_gem_create_context(i915, args->flags);
+	pc = proto_context_create(i915, args->flags);
+	if (IS_ERR(pc))
+		return PTR_ERR(pc);
+
+	ext_data.ctx = i915_gem_create_context(i915, pc);
+	proto_context_close(pc);
 	if (IS_ERR(ext_data.ctx))
 		return PTR_ERR(ext_data.ctx);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index df76767f0c41b..a42c429f94577 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -46,6 +46,27 @@ struct i915_gem_engines_iter {
 	const struct i915_gem_engines *engines;
 };
 
+/**
+ * struct i915_gem_proto_context - prototype context
+ *
+ * The struct i915_gem_proto_context represents the creation parameters for
+ * an i915_gem_context.  This is used to gather parameters provided either
+ * through creation flags or via SET_CONTEXT_PARAM so that, when we create
+ * the final i915_gem_context, those parameters can be immutable.
+ */
+struct i915_gem_proto_context {
+	/** @vm: See i915_gem_context::vm */
+	struct i915_address_space *vm;
+
+	/** @user_flags: See i915_gem_context::user_flags */
+	unsigned long user_flags;
+
+	/** @sched: See i915_gem_context::sched */
+	struct i915_sched_attr sched;
+
+	bool single_timeline;
+};
+
 /**
  * struct i915_gem_context - client state
  *
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index 51b5a3421b400..e0f512ef7f3c6 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -80,11 +80,17 @@ void mock_init_contexts(struct drm_i915_private *i915)
 struct i915_gem_context *
 live_context(struct drm_i915_private *i915, struct file *file)
 {
+	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 	int err;
 	u32 id;
 
-	ctx = i915_gem_create_context(i915, 0);
+	pc = proto_context_create(i915, 0);
+	if (IS_ERR(pc))
+		return ERR_CAST(pc);
+
+	ctx = i915_gem_create_context(i915, pc);
+	proto_context_close(pc);
 	if (IS_ERR(ctx))
 		return ctx;
 
@@ -142,8 +148,14 @@ struct i915_gem_context *
 kernel_context(struct drm_i915_private *i915)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_proto_context *pc;
+
+	pc = proto_context_create(i915, 0);
+	if (IS_ERR(pc))
+		return ERR_CAST(pc);
 
-	ctx = i915_gem_create_context(i915, 0);
+	ctx = i915_gem_create_context(i915, pc);
+	proto_context_close(pc);
 	if (IS_ERR(ctx))
 		return ctx;
 
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [PATCH 14/21] drm/i915/gem: Return an error ptr from context_lookup
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

We're about to start doing lazy context creation, which means contexts
get created in i915_gem_context_lookup and we may start seeing errors
other than -ENOENT.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c    | 12 ++++++------
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h                |  2 +-
 drivers/gpu/drm/i915/i915_perf.c               |  4 ++--
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 3e883daab93bf..7929d5a8be449 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -2105,8 +2105,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	int ret = 0;
 
 	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (!ctx)
-		return -ENOENT;
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
 
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_GTT_SIZE:
@@ -2174,8 +2174,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 	int ret;
 
 	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (!ctx)
-		return -ENOENT;
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
 
 	ret = ctx_setparam(file_priv, ctx, args);
 
@@ -2194,8 +2194,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 		return -EINVAL;
 
 	ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
-	if (!ctx)
-		return -ENOENT;
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
 
 	/*
 	 * We opt for unserialised reads here. This may result in tearing
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 7024adcd5cf15..de14b26f3b2d5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -739,8 +739,8 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	struct i915_gem_context *ctx;
 
 	ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
-	if (unlikely(!ctx))
-		return -ENOENT;
+	if (unlikely(IS_ERR(ctx)))
+		return PTR_ERR(ctx);
 
 	eb->gem_context = ctx;
 	if (rcu_access_pointer(ctx->vm))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8571c5c1509a7..004ed0e59c999 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1851,7 +1851,7 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 		ctx = NULL;
 	rcu_read_unlock();
 
-	return ctx;
+	return ctx ? ctx : ERR_PTR(-ENOENT);
 }
 
 /* i915_gem_evict.c */
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 85ad62dbabfab..b86ed03f6a705 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
 		struct drm_i915_file_private *file_priv = file->driver_priv;
 
 		specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
-		if (!specific_ctx) {
+		if (IS_ERR(specific_ctx)) {
 			DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
 				  ctx_handle);
-			ret = -ENOENT;
+			ret = PTR_ERR(specific_ctx);
 			goto err;
 		}
 	}
-- 
2.31.1



* [PATCH 15/21] drm/i915/gt: Drop i915_address_space::file
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

There's a big comment explaining how useful this field is, but nothing
actually uses it for anything.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  9 ---------
 drivers/gpu/drm/i915/gt/intel_gtt.h         | 10 ----------
 drivers/gpu/drm/i915/selftests/mock_gtt.c   |  1 -
 3 files changed, 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 7929d5a8be449..db9153e0f85a7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -921,17 +921,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
 				u32 *id)
 {
 	struct drm_i915_private *i915 = ctx->i915;
-	struct i915_address_space *vm;
 	int ret;
 
 	ctx->file_priv = fpriv;
 
-	mutex_lock(&ctx->mutex);
-	vm = i915_gem_context_vm(ctx);
-	if (vm)
-		WRITE_ONCE(vm->file, fpriv); /* XXX */
-	mutex_unlock(&ctx->mutex);
-
 	ctx->pid = get_task_pid(current, PIDTYPE_PID);
 	snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
 		 current->comm, pid_nr(ctx->pid));
@@ -1030,8 +1023,6 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
 	if (IS_ERR(ppgtt))
 		return PTR_ERR(ppgtt);
 
-	ppgtt->vm.file = file_priv;
-
 	if (args->extensions) {
 		err = i915_user_extensions(u64_to_user_ptr(args->extensions),
 					   NULL, 0,
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index e67e34e179131..4c46068e63c9d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -217,16 +217,6 @@ struct i915_address_space {
 	struct intel_gt *gt;
 	struct drm_i915_private *i915;
 	struct device *dma;
-	/*
-	 * Every address space belongs to a struct file - except for the global
-	 * GTT that is owned by the driver (and so @file is set to NULL). In
-	 * principle, no information should leak from one context to another
-	 * (or between files/processes etc) unless explicitly shared by the
-	 * owner. Tracking the owner is important in order to free up per-file
-	 * objects along with the file, to aide resource tracking, and to
-	 * assign blame.
-	 */
-	struct drm_i915_file_private *file;
 	u64 total;		/* size addr space maps (ex. 2GB for ggtt) */
 	u64 reserved;		/* size addr space reserved */
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 5c7ae40bba634..cc047ec594f93 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -73,7 +73,6 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
 	ppgtt->vm.gt = &i915->gt;
 	ppgtt->vm.i915 = i915;
 	ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE);
-	ppgtt->vm.file = ERR_PTR(-ENODEV);
 	ppgtt->vm.dma = i915->drm.dev;
 
 	i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
-- 
2.31.1



* [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 657 ++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  26 +
 .../gpu/drm/i915/gem/selftests/mock_context.c |   5 +-
 drivers/gpu/drm/i915/i915_drv.h               |  17 +-
 5 files changed, 648 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index db9153e0f85a7..aa8e61211924f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915,
 
 static void proto_context_close(struct i915_gem_proto_context *pc)
 {
+	int i;
+
 	if (pc->vm)
 		i915_vm_put(pc->vm);
+	if (pc->user_engines) {
+		for (i = 0; i < pc->num_user_engines; i++)
+			kfree(pc->user_engines[i].siblings);
+		kfree(pc->user_engines);
+	}
 	kfree(pc);
 }
 
@@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
 	proto_context_set_persistence(i915, pc, true);
 	pc->sched.priority = I915_PRIORITY_NORMAL;
 
+	pc->num_user_engines = -1;
+	pc->user_engines = NULL;
+
 	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
 		pc->single_timeline = true;
 
 	return pc;
 }
 
+static int proto_context_register_locked(struct drm_i915_file_private *fpriv,
+					 struct i915_gem_proto_context *pc,
+					 u32 *id)
+{
+	int ret;
+	void *old;
+
+	ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL);
+	if (ret)
+		return ret;
+
+	old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL);
+	if (xa_is_err(old)) {
+		xa_erase(&fpriv->context_xa, *id);
+		return xa_err(old);
+	}
+	GEM_BUG_ON(old);
+
+	return 0;
+}
+
+static int proto_context_register(struct drm_i915_file_private *fpriv,
+				  struct i915_gem_proto_context *pc,
+				  u32 *id)
+{
+	int ret;
+
+	mutex_lock(&fpriv->proto_context_lock);
+	ret = proto_context_register_locked(fpriv, pc, id);
+	mutex_unlock(&fpriv->proto_context_lock);
+
+	return ret;
+}
+
+static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
+			    struct i915_gem_proto_context *pc,
+			    const struct drm_i915_gem_context_param *args)
+{
+	struct i915_address_space *vm;
+
+	if (args->size)
+		return -EINVAL;
+
+	if (!pc->vm)
+		return -ENODEV;
+
+	if (upper_32_bits(args->value))
+		return -ENOENT;
+
+	rcu_read_lock();
+	vm = xa_load(&fpriv->vm_xa, args->value);
+	if (vm && !kref_get_unless_zero(&vm->ref))
+		vm = NULL;
+	rcu_read_unlock();
+	if (!vm)
+		return -ENOENT;
+
+	i915_vm_put(pc->vm);
+	pc->vm = vm;
+
+	return 0;
+}
+
+struct set_proto_ctx_engines {
+	struct drm_i915_private *i915;
+	unsigned num_engines;
+	struct i915_gem_proto_engine *engines;
+};
+
+static int
+set_proto_ctx_engines_balance(struct i915_user_extension __user *base,
+			      void *data)
+{
+	struct i915_context_engines_load_balance __user *ext =
+		container_of_user(base, typeof(*ext), base);
+	const struct set_proto_ctx_engines *set = data;
+	struct drm_i915_private *i915 = set->i915;
+	struct intel_engine_cs **siblings;
+	u16 num_siblings, idx;
+	unsigned int n;
+	int err;
+
+	if (!HAS_EXECLISTS(i915))
+		return -ENODEV;
+
+	if (intel_uc_uses_guc_submission(&i915->gt.uc))
+		return -ENODEV; /* not implemented yet */
+
+	if (get_user(idx, &ext->engine_index))
+		return -EFAULT;
+
+	if (idx >= set->num_engines) {
+		drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
+			idx, set->num_engines);
+		return -EINVAL;
+	}
+
+	idx = array_index_nospec(idx, set->num_engines);
+	if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) {
+		drm_dbg(&i915->drm,
+			"Invalid placement[%d], already occupied\n", idx);
+		return -EEXIST;
+	}
+
+	if (get_user(num_siblings, &ext->num_siblings))
+		return -EFAULT;
+
+	err = check_user_mbz(&ext->flags);
+	if (err)
+		return err;
+
+	err = check_user_mbz(&ext->mbz64);
+	if (err)
+		return err;
+
+	if (num_siblings == 0)
+		return 0;
+
+	siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL);
+	if (!siblings)
+		return -ENOMEM;
+
+	for (n = 0; n < num_siblings; n++) {
+		struct i915_engine_class_instance ci;
+
+		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
+			err = -EFAULT;
+			goto err_siblings;
+		}
+
+		siblings[n] = intel_engine_lookup_user(i915,
+						       ci.engine_class,
+						       ci.engine_instance);
+		if (!siblings[n]) {
+			drm_dbg(&i915->drm,
+				"Invalid sibling[%d]: { class:%d, inst:%d }\n",
+				n, ci.engine_class, ci.engine_instance);
+			err = -EINVAL;
+			goto err_siblings;
+		}
+	}
+
+	if (num_siblings == 1) {
+		set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
+		set->engines[idx].engine = siblings[0];
+		kfree(siblings);
+	} else {
+		set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED;
+		set->engines[idx].num_siblings = num_siblings;
+		set->engines[idx].siblings = siblings;
+	}
+
+	return 0;
+
+err_siblings:
+	kfree(siblings);
+
+	return err;
+}
+
+static int
+set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data)
+{
+	struct i915_context_engines_bond __user *ext =
+		container_of_user(base, typeof(*ext), base);
+	const struct set_proto_ctx_engines *set = data;
+	struct drm_i915_private *i915 = set->i915;
+	struct i915_engine_class_instance ci;
+	struct intel_engine_cs *master;
+	u16 idx, num_bonds;
+	int err, n;
+
+	if (get_user(idx, &ext->virtual_index))
+		return -EFAULT;
+
+	if (idx >= set->num_engines) {
+		drm_dbg(&i915->drm,
+			"Invalid index for virtual engine: %d >= %d\n",
+			idx, set->num_engines);
+		return -EINVAL;
+	}
+
+	idx = array_index_nospec(idx, set->num_engines);
+	if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) {
+		drm_dbg(&i915->drm, "Invalid engine at %d\n", idx);
+		return -EINVAL;
+	}
+
+	if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) {
+		drm_dbg(&i915->drm,
+			"Bonding with virtual engines not allowed\n");
+		return -EINVAL;
+	}
+
+	err = check_user_mbz(&ext->flags);
+	if (err)
+		return err;
+
+	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+		err = check_user_mbz(&ext->mbz64[n]);
+		if (err)
+			return err;
+	}
+
+	if (copy_from_user(&ci, &ext->master, sizeof(ci)))
+		return -EFAULT;
+
+	master = intel_engine_lookup_user(i915,
+					  ci.engine_class,
+					  ci.engine_instance);
+	if (!master) {
+		drm_dbg(&i915->drm,
+			"Unrecognised master engine: { class:%u, instance:%u }\n",
+			ci.engine_class, ci.engine_instance);
+		return -EINVAL;
+	}
+
+	if (get_user(num_bonds, &ext->num_bonds))
+		return -EFAULT;
+
+	for (n = 0; n < num_bonds; n++) {
+		struct intel_engine_cs *bond;
+
+		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci)))
+			return -EFAULT;
+
+		bond = intel_engine_lookup_user(i915,
+						ci.engine_class,
+						ci.engine_instance);
+		if (!bond) {
+			drm_dbg(&i915->drm,
+				"Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n",
+				n, ci.engine_class, ci.engine_instance);
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = {
+	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance,
+	[I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond,
+};
+
+static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv,
+			         struct i915_gem_proto_context *pc,
+			         const struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_private *i915 = fpriv->dev_priv;
+	struct set_proto_ctx_engines set = { .i915 = i915 };
+	struct i915_context_param_engines __user *user =
+		u64_to_user_ptr(args->value);
+	unsigned int n;
+	u64 extensions;
+	int err;
+
+	if (!args->size) {
+		kfree(pc->user_engines);
+		pc->num_user_engines = -1;
+		pc->user_engines = NULL;
+		return 0;
+	}
+
+	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
+	if (args->size < sizeof(*user) ||
+	    !IS_ALIGNED(args->size, sizeof(*user->engines))) {
+		drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
+			args->size);
+		return -EINVAL;
+	}
+
+	set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
+	if (set.num_engines > I915_EXEC_RING_MASK + 1)
+		return -EINVAL;
+
+	set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL);
+	if (!set.engines)
+		return -ENOMEM;
+
+	for (n = 0; n < set.num_engines; n++) {
+		struct i915_engine_class_instance ci;
+		struct intel_engine_cs *engine;
+
+		if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
+			kfree(set.engines);
+			return -EFAULT;
+		}
+
+		memset(&set.engines[n], 0, sizeof(set.engines[n]));
+
+		if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
+		    ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE)
+			continue;
+
+		engine = intel_engine_lookup_user(i915,
+						  ci.engine_class,
+						  ci.engine_instance);
+		if (!engine) {
+			drm_dbg(&i915->drm,
+				"Invalid engine[%d]: { class:%d, instance:%d }\n",
+				n, ci.engine_class, ci.engine_instance);
+			kfree(set.engines);
+			return -ENOENT;
+		}
+
+		set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
+		set.engines[n].engine = engine;
+	}
+
+	err = -EFAULT;
+	if (!get_user(extensions, &user->extensions))
+		err = i915_user_extensions(u64_to_user_ptr(extensions),
+					   set_proto_ctx_engines_extensions,
+					   ARRAY_SIZE(set_proto_ctx_engines_extensions),
+					   &set);
+	if (err) {
+		kfree(set.engines);
+		return err;
+	}
+
+	kfree(pc->user_engines);
+	pc->num_user_engines = set.num_engines;
+	pc->user_engines = set.engines;
+
+	return 0;
+}
+
+static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
+			       struct i915_gem_proto_context *pc,
+			       struct drm_i915_gem_context_param *args)
+{
+	int ret = 0;
+
+	switch (args->param) {
+	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value)
+			set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_BANNABLE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (!capable(CAP_SYS_ADMIN) && !args->value)
+			ret = -EPERM;
+		else if (args->value)
+			set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_BANNABLE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_RECOVERABLE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value)
+			set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_PRIORITY:
+		ret = validate_priority(fpriv->dev_priv, args);
+		if (!ret)
+			pc->sched.priority = args->value;
+		break;
+
+	case I915_CONTEXT_PARAM_SSEU:
+		ret = -ENOTSUPP;
+		break;
+
+	case I915_CONTEXT_PARAM_VM:
+		ret = set_proto_ctx_vm(fpriv, pc, args);
+		break;
+
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = set_proto_ctx_engines(fpriv, pc, args);
+		break;
+
+	case I915_CONTEXT_PARAM_PERSISTENCE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value)
+			set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_NO_ZEROMAP:
+	case I915_CONTEXT_PARAM_BAN_PERIOD:
+	case I915_CONTEXT_PARAM_RINGSIZE:
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context *ctx)
 {
@@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 	return e;
 }
 
+static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
+					     unsigned int num_engines,
+					     struct i915_gem_proto_engine *pe)
+{
+	struct i915_gem_engines *e;
+	unsigned int n;
+
+	e = alloc_engines(num_engines);
+	for (n = 0; n < num_engines; n++) {
+		struct intel_context *ce;
+
+		switch (pe[n].type) {
+		case I915_GEM_ENGINE_TYPE_PHYSICAL:
+			ce = intel_context_create(pe[n].engine);
+			break;
+
+		case I915_GEM_ENGINE_TYPE_BALANCED:
+			ce = intel_execlists_create_virtual(pe[n].siblings,
+							    pe[n].num_siblings);
+			break;
+
+		case I915_GEM_ENGINE_TYPE_INVALID:
+		default:
+			GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID);
+			continue;
+		}
+
+		if (IS_ERR(ce)) {
+			__free_engines(e, n);
+			return ERR_CAST(ce);
+		}
+
+		intel_context_set_gem(ce, ctx);
+
+		e->engines[n] = ce;
+	}
+	e->num_engines = num_engines;
+
+	return e;
+}
+
 void i915_gem_context_release(struct kref *ref)
 {
 	struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
@@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915,
 		mutex_unlock(&ctx->mutex);
 	}
 
+	if (pc->num_user_engines >= 0) {
+		struct i915_gem_engines *engines;
+
+		engines = user_engines(ctx, pc->num_user_engines,
+				       pc->user_engines);
+		if (IS_ERR(engines)) {
+			context_close(ctx);
+			return ERR_CAST(engines);
+		}
+
+		mutex_lock(&ctx->engines_mutex);
+		i915_gem_context_set_user_engines(ctx);
+		engines = rcu_replace_pointer(ctx->engines, engines, 1);
+		mutex_unlock(&ctx->engines_mutex);
+
+		free_engines(engines);
+	}
+
 	if (pc->single_timeline) {
 		ret = drm_syncobj_create(&ctx->syncobj,
 					 DRM_SYNCOBJ_CREATE_SIGNALED,
@@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915)
 	init_contexts(&i915->gem.contexts);
 }
 
-static int gem_context_register(struct i915_gem_context *ctx,
-				struct drm_i915_file_private *fpriv,
-				u32 *id)
+static void gem_context_register(struct i915_gem_context *ctx,
+				 struct drm_i915_file_private *fpriv,
+				 u32 id)
 {
 	struct drm_i915_private *i915 = ctx->i915;
-	int ret;
+	void *old;
 
 	ctx->file_priv = fpriv;
 
@@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx,
 		 current->comm, pid_nr(ctx->pid));
 
 	/* And finally expose ourselves to userspace via the idr */
-	ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
-	if (ret)
-		goto err_pid;
+	old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
+	GEM_BUG_ON(old);
 
 	spin_lock(&i915->gem.contexts.lock);
 	list_add_tail(&ctx->link, &i915->gem.contexts.list);
 	spin_unlock(&i915->gem.contexts.lock);
-
-	return 0;
-
-err_pid:
-	put_pid(fetch_and_zero(&ctx->pid));
-	return ret;
 }
 
 int i915_gem_context_open(struct drm_i915_private *i915,
@@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 	int err;
-	u32 id;
 
-	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC);
+	mutex_init(&file_priv->proto_context_lock);
+	xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC);
+
+	/* 0 reserved for the default context */
+	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1);
 
 	/* 0 reserved for invalid/unassigned ppgtt */
 	xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
@@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 		goto err;
 	}
 
-	err = gem_context_register(ctx, file_priv, &id);
-	if (err < 0)
-		goto err_ctx;
+	gem_context_register(ctx, file_priv, 0);
 
-	GEM_BUG_ON(id);
 	return 0;
 
-err_ctx:
-	context_close(ctx);
 err:
 	xa_destroy(&file_priv->vm_xa);
 	xa_destroy(&file_priv->context_xa);
+	xa_destroy(&file_priv->proto_context_xa);
+	mutex_destroy(&file_priv->proto_context_lock);
 	return err;
 }
 
 void i915_gem_context_close(struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_gem_proto_context *pc;
 	struct i915_address_space *vm;
 	struct i915_gem_context *ctx;
 	unsigned long idx;
 
+	xa_for_each(&file_priv->proto_context_xa, idx, pc)
+		proto_context_close(pc);
+	xa_destroy(&file_priv->proto_context_xa);
+	mutex_destroy(&file_priv->proto_context_lock);
+
 	xa_for_each(&file_priv->context_xa, idx, ctx)
 		context_close(ctx);
 	xa_destroy(&file_priv->context_xa);
@@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 }
 
 struct create_ext {
-	struct i915_gem_context *ctx;
+	struct i915_gem_proto_context *pc;
 	struct drm_i915_file_private *fpriv;
 };
 
@@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	if (local.param.ctx_id)
 		return -EINVAL;
 
-	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
+	return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param);
 }
 
 static int invalid_ext(struct i915_user_extension __user *ext, void *data)
@@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
 	return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
 }
 
+static inline struct i915_gem_context *
+__context_lookup(struct drm_i915_file_private *file_priv, u32 id)
+{
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	ctx = xa_load(&file_priv->context_xa, id);
+	if (ctx && !kref_get_unless_zero(&ctx->ref))
+		ctx = NULL;
+	rcu_read_unlock();
+
+	return ctx;
+}
+
+static struct i915_gem_context *
+lazy_create_context_locked(struct drm_i915_file_private *file_priv,
+			   struct i915_gem_proto_context *pc, u32 id)
+{
+	struct i915_gem_context *ctx;
+	void *old;
+
+	ctx = i915_gem_create_context(file_priv->dev_priv, pc);
+	if (IS_ERR(ctx))
+		return ctx;
+
+	gem_context_register(ctx, file_priv, id);
+
+	old = xa_erase(&file_priv->proto_context_xa, id);
+	GEM_BUG_ON(old != pc);
+	proto_context_close(pc);
+
+	/* One for the xarray and one for the caller */
+	return i915_gem_context_get(ctx);
+}
+
+struct i915_gem_context *
+i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
+{
+	struct i915_gem_proto_context *pc;
+	struct i915_gem_context *ctx;
+
+	ctx = __context_lookup(file_priv, id);
+	if (ctx)
+		return ctx;
+
+	mutex_lock(&file_priv->proto_context_lock);
+	/* Try one more time under the lock */
+	ctx = __context_lookup(file_priv, id);
+	if (!ctx) {
+		pc = xa_load(&file_priv->proto_context_xa, id);
+		if (!pc)
+			ctx = ERR_PTR(-ENOENT);
+		else
+			ctx = lazy_create_context_locked(file_priv, pc, id);
+	}
+	mutex_unlock(&file_priv->proto_context_lock);
+
+	return ctx;
+}
+
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file)
 {
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_context_create_ext *args = data;
-	struct i915_gem_proto_context *pc;
 	struct create_ext ext_data;
 	int ret;
 	u32 id;
@@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}
 
-	pc = proto_context_create(i915, args->flags);
-	if (IS_ERR(pc))
-		return PTR_ERR(pc);
-
-	ext_data.ctx = i915_gem_create_context(i915, pc);
-	proto_context_close(pc);
-	if (IS_ERR(ext_data.ctx))
-		return PTR_ERR(ext_data.ctx);
+	ext_data.pc = proto_context_create(i915, args->flags);
+	if (IS_ERR(ext_data.pc))
+		return PTR_ERR(ext_data.pc);
 
 	if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) {
 		ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
@@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 					   ARRAY_SIZE(create_extensions),
 					   &ext_data);
 		if (ret)
-			goto err_ctx;
+			goto err_pc;
 	}
 
-	ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id);
+	ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id);
 	if (ret < 0)
-		goto err_ctx;
+		goto err_pc;
 
 	args->ctx_id = id;
 	drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id);
 
 	return 0;
 
-err_ctx:
-	context_close(ext_data.ctx);
+err_pc:
+	proto_context_close(ext_data.pc);
 	return ret;
 }
 
@@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_context_destroy *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 
 	if (args->pad != 0)
@@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	if (!args->ctx_id)
 		return -ENOENT;
 
+	mutex_lock(&file_priv->proto_context_lock);
 	ctx = xa_erase(&file_priv->context_xa, args->ctx_id);
-	if (!ctx)
+	pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id);
+	mutex_unlock(&file_priv->proto_context_lock);
+
+	if (!ctx && !pc)
 		return -ENOENT;
+	GEM_WARN_ON(ctx && pc);
+
+	if (pc)
+		proto_context_close(pc);
+
+	if (ctx)
+		context_close(ctx);
 
-	context_close(ctx);
 	return 0;
 }
 
@@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct drm_i915_gem_context_param *args = data;
+	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
-	int ret;
+	int ret = 0;
 
-	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
+	ctx = __context_lookup(file_priv, args->ctx_id);
+	if (ctx)
+		goto set_ctx_param;
 
-	ret = ctx_setparam(file_priv, ctx, args);
+	mutex_lock(&file_priv->proto_context_lock);
+	ctx = __context_lookup(file_priv, args->ctx_id);
+	if (ctx)
+		goto unlock;
+
+	pc = xa_load(&file_priv->proto_context_xa, args->ctx_id);
+	if (!pc) {
+		ret = -ENOENT;
+		goto unlock;
+	}
+
+	ret = set_proto_ctx_param(file_priv, pc, args);
+	if (ret == -ENOTSUPP) {
+		/* Some params, specifically SSEU, can only be set on fully
+		 * created contexts.
+		 */
+		ret = 0;
+		ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id);
+		if (IS_ERR(ctx)) {
+			ret = PTR_ERR(ctx);
+			ctx = NULL;
+		}
+	}
+
+unlock:
+	mutex_unlock(&file_priv->proto_context_lock);
+
+set_ctx_param:
+	if (!ret && ctx)
+		ret = ctx_setparam(file_priv, ctx, args);
+
+	if (ctx)
+		i915_gem_context_put(ctx);
 
-	i915_gem_context_put(ctx);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index b5c908f3f4f22..20411db84914a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
+struct i915_gem_context *
+i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
+
 static inline struct i915_gem_context *
 i915_gem_context_get(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index a42c429f94577..067ea3030ac91 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -46,6 +46,26 @@ struct i915_gem_engines_iter {
 	const struct i915_gem_engines *engines;
 };
 
+enum i915_gem_engine_type {
+	I915_GEM_ENGINE_TYPE_INVALID = 0,
+	I915_GEM_ENGINE_TYPE_PHYSICAL,
+	I915_GEM_ENGINE_TYPE_BALANCED,
+};
+
+struct i915_gem_proto_engine {
+	/** @type: Type of this engine */
+	enum i915_gem_engine_type type;
+
+	/** @engine: Engine to use, for physical engines */
+	struct intel_engine_cs *engine;
+
+	/** @num_siblings: Number of balanced siblings */
+	unsigned int num_siblings;
+
+	/** @siblings: Balanced siblings */
+	struct intel_engine_cs **siblings;
+};
+
 /**
  * struct i915_gem_proto_context - prototype context
  *
@@ -64,6 +84,12 @@ struct i915_gem_proto_context {
 	/** @sched: See i915_gem_context::sched */
 	struct i915_sched_attr sched;
 
+	/** @num_user_engines: Number of user-specified engines or -1 */
+	int num_user_engines;
+
+	/** @user_engines: User-specified engines */
+	struct i915_gem_proto_engine *user_engines;
+
 	bool single_timeline;
 };
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index e0f512ef7f3c6..32cf2103828f9 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915)
 struct i915_gem_context *
 live_context(struct drm_i915_private *i915, struct file *file)
 {
+	struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv;
 	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 	int err;
@@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file)
 
 	i915_gem_context_set_no_error_capture(ctx);
 
-	err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id);
+	err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL);
 	if (err < 0)
 		goto err_ctx;
 
+	gem_context_register(ctx, fpriv, id);
+
 	return ctx;
 
 err_ctx:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 004ed0e59c999..365c042529d72 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -200,6 +200,9 @@ struct drm_i915_file_private {
 		struct rcu_head rcu;
 	};
 
+	struct mutex proto_context_lock;
+	struct xarray proto_context_xa;
+
 	struct xarray context_xa;
 	struct xarray vm_xa;
 
@@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 
 struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
 
-static inline struct i915_gem_context *
-i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
-{
-	struct i915_gem_context *ctx;
-
-	rcu_read_lock();
-	ctx = xa_load(&file_priv->context_xa, id);
-	if (ctx && !kref_get_unless_zero(&ctx->ref))
-		ctx = NULL;
-	rcu_read_unlock();
-
-	return ctx ? ctx : ERR_PTR(-ENOENT);
-}
-
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  u64 min_size, u64 alignment,
-- 
2.31.1


+	if (!set.engines)
+		return -ENOMEM;
+
+	for (n = 0; n < set.num_engines; n++) {
+		struct i915_engine_class_instance ci;
+		struct intel_engine_cs *engine;
+
+		if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
+			kfree(set.engines);
+			return -EFAULT;
+		}
+
+		memset(&set.engines[n], 0, sizeof(set.engines[n]));
+
+		if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
+		    ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE)
+			continue;
+
+		engine = intel_engine_lookup_user(i915,
+						  ci.engine_class,
+						  ci.engine_instance);
+		if (!engine) {
+			drm_dbg(&i915->drm,
+				"Invalid engine[%d]: { class:%d, instance:%d }\n",
+				n, ci.engine_class, ci.engine_instance);
+			kfree(set.engines);
+			return -ENOENT;
+		}
+
+		set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
+		set.engines[n].engine = engine;
+	}
+
+	err = -EFAULT;
+	if (!get_user(extensions, &user->extensions))
+		err = i915_user_extensions(u64_to_user_ptr(extensions),
+					   set_proto_ctx_engines_extensions,
+					   ARRAY_SIZE(set_proto_ctx_engines_extensions),
+					   &set);
+	if (err) {
+		kfree(set.engines);
+		return err;
+	}
+
+	kfree(pc->user_engines);
+	pc->num_user_engines = set.num_engines;
+	pc->user_engines = set.engines;
+
+	return 0;
+}
+
+static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
+			       struct i915_gem_proto_context *pc,
+			       struct drm_i915_gem_context_param *args)
+{
+	int ret = 0;
+
+	switch (args->param) {
+	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value)
+			set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_BANNABLE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (!capable(CAP_SYS_ADMIN) && !args->value)
+			ret = -EPERM;
+		else if (args->value)
+			set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_BANNABLE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_RECOVERABLE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value)
+			set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_PRIORITY:
+		ret = validate_priority(fpriv->dev_priv, args);
+		if (!ret)
+			pc->sched.priority = args->value;
+		break;
+
+	case I915_CONTEXT_PARAM_SSEU:
+		ret = -ENOTSUPP;
+		break;
+
+	case I915_CONTEXT_PARAM_VM:
+		ret = set_proto_ctx_vm(fpriv, pc, args);
+		break;
+
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = set_proto_ctx_engines(fpriv, pc, args);
+		break;
+
+	case I915_CONTEXT_PARAM_PERSISTENCE:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value)
+			set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
+		else
+			clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_NO_ZEROMAP:
+	case I915_CONTEXT_PARAM_BAN_PERIOD:
+	case I915_CONTEXT_PARAM_RINGSIZE:
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context *ctx)
 {
@@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 	return e;
 }
 
+static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
+					     unsigned int num_engines,
+					     struct i915_gem_proto_engine *pe)
+{
+	struct i915_gem_engines *e;
+	unsigned int n;
+
+	e = alloc_engines(num_engines);
+	for (n = 0; n < num_engines; n++) {
+		struct intel_context *ce;
+
+		switch (pe[n].type) {
+		case I915_GEM_ENGINE_TYPE_PHYSICAL:
+			ce = intel_context_create(pe[n].engine);
+			break;
+
+		case I915_GEM_ENGINE_TYPE_BALANCED:
+			ce = intel_execlists_create_virtual(pe[n].siblings,
+							    pe[n].num_siblings);
+			break;
+
+		case I915_GEM_ENGINE_TYPE_INVALID:
+		default:
+			GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID);
+			continue;
+		}
+
+		if (IS_ERR(ce)) {
+			__free_engines(e, n);
+			return ERR_CAST(ce);
+		}
+
+		intel_context_set_gem(ce, ctx);
+
+		e->engines[n] = ce;
+	}
+	e->num_engines = num_engines;
+
+	return e;
+}
+
 void i915_gem_context_release(struct kref *ref)
 {
 	struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
@@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915,
 		mutex_unlock(&ctx->mutex);
 	}
 
+	if (pc->num_user_engines >= 0) {
+		struct i915_gem_engines *engines;
+
+		engines = user_engines(ctx, pc->num_user_engines,
+				       pc->user_engines);
+		if (IS_ERR(engines)) {
+			context_close(ctx);
+			return ERR_CAST(engines);
+		}
+
+		mutex_lock(&ctx->engines_mutex);
+		i915_gem_context_set_user_engines(ctx);
+		engines = rcu_replace_pointer(ctx->engines, engines, 1);
+		mutex_unlock(&ctx->engines_mutex);
+
+		free_engines(engines);
+	}
+
 	if (pc->single_timeline) {
 		ret = drm_syncobj_create(&ctx->syncobj,
 					 DRM_SYNCOBJ_CREATE_SIGNALED,
@@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915)
 	init_contexts(&i915->gem.contexts);
 }
 
-static int gem_context_register(struct i915_gem_context *ctx,
-				struct drm_i915_file_private *fpriv,
-				u32 *id)
+static void gem_context_register(struct i915_gem_context *ctx,
+				 struct drm_i915_file_private *fpriv,
+				 u32 id)
 {
 	struct drm_i915_private *i915 = ctx->i915;
-	int ret;
+	void *old;
 
 	ctx->file_priv = fpriv;
 
@@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx,
 		 current->comm, pid_nr(ctx->pid));
 
 	/* And finally expose ourselves to userspace via the idr */
-	ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
-	if (ret)
-		goto err_pid;
+	old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
+	GEM_BUG_ON(old);
 
 	spin_lock(&i915->gem.contexts.lock);
 	list_add_tail(&ctx->link, &i915->gem.contexts.list);
 	spin_unlock(&i915->gem.contexts.lock);
-
-	return 0;
-
-err_pid:
-	put_pid(fetch_and_zero(&ctx->pid));
-	return ret;
 }
 
 int i915_gem_context_open(struct drm_i915_private *i915,
@@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 	int err;
-	u32 id;
 
-	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC);
+	mutex_init(&file_priv->proto_context_lock);
+	xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC);
+
+	/* 0 reserved for the default context */
+	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1);
 
 	/* 0 reserved for invalid/unassigned ppgtt */
 	xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
@@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 		goto err;
 	}
 
-	err = gem_context_register(ctx, file_priv, &id);
-	if (err < 0)
-		goto err_ctx;
+	gem_context_register(ctx, file_priv, 0);
 
-	GEM_BUG_ON(id);
 	return 0;
 
-err_ctx:
-	context_close(ctx);
 err:
 	xa_destroy(&file_priv->vm_xa);
 	xa_destroy(&file_priv->context_xa);
+	xa_destroy(&file_priv->proto_context_xa);
+	mutex_destroy(&file_priv->proto_context_lock);
 	return err;
 }
 
 void i915_gem_context_close(struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_gem_proto_context *pc;
 	struct i915_address_space *vm;
 	struct i915_gem_context *ctx;
 	unsigned long idx;
 
+	xa_for_each(&file_priv->proto_context_xa, idx, pc)
+		proto_context_close(pc);
+	xa_destroy(&file_priv->proto_context_xa);
+	mutex_destroy(&file_priv->proto_context_lock);
+
 	xa_for_each(&file_priv->context_xa, idx, ctx)
 		context_close(ctx);
 	xa_destroy(&file_priv->context_xa);
@@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 }
 
 struct create_ext {
-	struct i915_gem_context *ctx;
+	struct i915_gem_proto_context *pc;
 	struct drm_i915_file_private *fpriv;
 };
 
@@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	if (local.param.ctx_id)
 		return -EINVAL;
 
-	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
+	return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param);
 }
 
 static int invalid_ext(struct i915_user_extension __user *ext, void *data)
@@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
 	return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
 }
 
+static inline struct i915_gem_context *
+__context_lookup(struct drm_i915_file_private *file_priv, u32 id)
+{
+	struct i915_gem_context *ctx;
+
+	rcu_read_lock();
+	ctx = xa_load(&file_priv->context_xa, id);
+	if (ctx && !kref_get_unless_zero(&ctx->ref))
+		ctx = NULL;
+	rcu_read_unlock();
+
+	return ctx;
+}
+
+struct i915_gem_context *
+lazy_create_context_locked(struct drm_i915_file_private *file_priv,
+			   struct i915_gem_proto_context *pc, u32 id)
+{
+	struct i915_gem_context *ctx;
+	void *old;
+
+	ctx = i915_gem_create_context(file_priv->dev_priv, pc);
+	if (IS_ERR(ctx))
+		return ctx;
+
+	gem_context_register(ctx, file_priv, id);
+
+	old = xa_erase(&file_priv->proto_context_xa, id);
+	GEM_BUG_ON(old != pc);
+	proto_context_close(pc);
+
+	/* One for the xarray and one for the caller */
+	return i915_gem_context_get(ctx);
+}
+
+struct i915_gem_context *
+i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
+{
+	struct i915_gem_proto_context *pc;
+	struct i915_gem_context *ctx;
+
+	ctx = __context_lookup(file_priv, id);
+	if (ctx)
+		return ctx;
+
+	mutex_lock(&file_priv->proto_context_lock);
+	/* Try one more time under the lock */
+	ctx = __context_lookup(file_priv, id);
+	if (!ctx) {
+		pc = xa_load(&file_priv->proto_context_xa, id);
+		if (!pc)
+			ctx = ERR_PTR(-ENOENT);
+		else
+			ctx = lazy_create_context_locked(file_priv, pc, id);
+	}
+	mutex_unlock(&file_priv->proto_context_lock);
+
+	return ctx;
+}
+
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file)
 {
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_context_create_ext *args = data;
-	struct i915_gem_proto_context *pc;
 	struct create_ext ext_data;
 	int ret;
 	u32 id;
@@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}
 
-	pc = proto_context_create(i915, args->flags);
-	if (IS_ERR(pc))
-		return PTR_ERR(pc);
-
-	ext_data.ctx = i915_gem_create_context(i915, pc);
-	proto_context_close(pc);
-	if (IS_ERR(ext_data.ctx))
-		return PTR_ERR(ext_data.ctx);
+	ext_data.pc = proto_context_create(i915, args->flags);
+	if (IS_ERR(ext_data.pc))
+		return PTR_ERR(ext_data.pc);
 
 	if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) {
 		ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
@@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 					   ARRAY_SIZE(create_extensions),
 					   &ext_data);
 		if (ret)
-			goto err_ctx;
+			goto err_pc;
 	}
 
-	ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id);
+	ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id);
 	if (ret < 0)
-		goto err_ctx;
+		goto err_pc;
 
 	args->ctx_id = id;
 	drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id);
 
 	return 0;
 
-err_ctx:
-	context_close(ext_data.ctx);
+err_pc:
+	proto_context_close(ext_data.pc);
 	return ret;
 }
 
@@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_context_destroy *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 
 	if (args->pad != 0)
@@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	if (!args->ctx_id)
 		return -ENOENT;
 
+	mutex_lock(&file_priv->proto_context_lock);
 	ctx = xa_erase(&file_priv->context_xa, args->ctx_id);
-	if (!ctx)
+	pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id);
+	mutex_unlock(&file_priv->proto_context_lock);
+
+	if (!ctx && !pc)
 		return -ENOENT;
+	GEM_WARN_ON(ctx && pc);
+
+	if (pc)
+		proto_context_close(pc);
+
+	if (ctx)
+		context_close(ctx);
 
-	context_close(ctx);
 	return 0;
 }
 
@@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct drm_i915_gem_context_param *args = data;
+	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
-	int ret;
+	int ret = 0;
 
-	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
+	ctx = __context_lookup(file_priv, args->ctx_id);
+	if (ctx)
+		goto set_ctx_param;
 
-	ret = ctx_setparam(file_priv, ctx, args);
+	mutex_lock(&file_priv->proto_context_lock);
+	ctx = __context_lookup(file_priv, args->ctx_id);
+	if (ctx)
+		goto unlock;
+
+	pc = xa_load(&file_priv->proto_context_xa, args->ctx_id);
+	if (!pc) {
+		ret = -ENOENT;
+		goto unlock;
+	}
+
+	ret = set_proto_ctx_param(file_priv, pc, args);
+	if (ret == -ENOTSUPP) {
+		/* Some params, specifically SSEU, can only be set on fully
+		 * created contexts.
+		 */
+		ret = 0;
+		ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id);
+		if (IS_ERR(ctx)) {
+			ret = PTR_ERR(ctx);
+			ctx = NULL;
+		}
+	}
+
+unlock:
+	mutex_unlock(&file_priv->proto_context_lock);
+
+set_ctx_param:
+	if (!ret && ctx)
+		ret = ctx_setparam(file_priv, ctx, args);
+
+	if (ctx)
+		i915_gem_context_put(ctx);
 
-	i915_gem_context_put(ctx);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index b5c908f3f4f22..20411db84914a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
+struct i915_gem_context *
+i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
+
 static inline struct i915_gem_context *
 i915_gem_context_get(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index a42c429f94577..067ea3030ac91 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -46,6 +46,26 @@ struct i915_gem_engines_iter {
 	const struct i915_gem_engines *engines;
 };
 
+enum i915_gem_engine_type {
+	I915_GEM_ENGINE_TYPE_INVALID = 0,
+	I915_GEM_ENGINE_TYPE_PHYSICAL,
+	I915_GEM_ENGINE_TYPE_BALANCED,
+};
+
+struct i915_gem_proto_engine {
+	/** @type: Type of this engine */
+	enum i915_gem_engine_type type;
+
+	/** @engine: Engine, for physical */
+	struct intel_engine_cs *engine;
+
+	/** @num_siblings: Number of balanced siblings */
+	unsigned int num_siblings;
+
+	/** @siblings: Balanced siblings */
+	struct intel_engine_cs **siblings;
+};
+
 /**
  * struct i915_gem_proto_context - prototype context
  *
@@ -64,6 +84,12 @@ struct i915_gem_proto_context {
 	/** @sched: See i915_gem_context::sched */
 	struct i915_sched_attr sched;
 
+	/** @num_user_engines: Number of user-specified engines or -1 */
+	int num_user_engines;
+
+	/** @user_engines: User-specified engines */
+	struct i915_gem_proto_engine *user_engines;
+
 	bool single_timeline;
 };
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index e0f512ef7f3c6..32cf2103828f9 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915)
 struct i915_gem_context *
 live_context(struct drm_i915_private *i915, struct file *file)
 {
+	struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv;
 	struct i915_gem_proto_context *pc;
 	struct i915_gem_context *ctx;
 	int err;
@@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file)
 
 	i915_gem_context_set_no_error_capture(ctx);
 
-	err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id);
+	err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL);
 	if (err < 0)
 		goto err_ctx;
 
+	gem_context_register(ctx, fpriv, id);
+
 	return ctx;
 
 err_ctx:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 004ed0e59c999..365c042529d72 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -200,6 +200,9 @@ struct drm_i915_file_private {
 		struct rcu_head rcu;
 	};
 
+	struct mutex proto_context_lock;
+	struct xarray proto_context_xa;
+
 	struct xarray context_xa;
 	struct xarray vm_xa;
 
@@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 
 struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
 
-static inline struct i915_gem_context *
-i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
-{
-	struct i915_gem_context *ctx;
-
-	rcu_read_lock();
-	ctx = xa_load(&file_priv->context_xa, id);
-	if (ctx && !kref_get_unless_zero(&ctx->ref))
-		ctx = NULL;
-	rcu_read_unlock();
-
-	return ctx ? ctx : ERR_PTR(-ENOENT);
-}
-
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  u64 min_size, u64 alignment,
-- 
2.31.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
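[Editorial aside: the two-phase lookup in i915_gem_context_lookup() above — a lock-free fast path, then a locked re-check before lazily finalizing a proto-context — is a double-checked pattern that can be sketched outside the driver. The model below is illustrative only (Python, a lock standing in for RCU plus proto_context_lock; all names are hypothetical), not driver code:]

```python
import threading

class ContextTable:
    """Toy model of the lazy-finalization lookup: a lock-free fast path
    (mirroring __context_lookup() under RCU), then a locked re-check
    before converting a proto-context into a real, immutable context."""

    def __init__(self):
        self._lock = threading.Lock()
        self._contexts = {}   # id -> finalized context (context_xa)
        self._proto = {}      # id -> unfinalized params (proto_context_xa)

    def register_proto(self, ctx_id, params):
        # Analogue of proto_context_register(): only params stored so far.
        self._proto[ctx_id] = params

    def lookup(self, ctx_id):
        # Fast path: no lock taken.
        ctx = self._contexts.get(ctx_id)
        if ctx is not None:
            return ctx
        with self._lock:
            # Re-check under the lock: a racing caller may have finalized it.
            ctx = self._contexts.get(ctx_id)
            if ctx is None:
                params = self._proto.pop(ctx_id, None)
                if params is None:
                    raise KeyError(ctx_id)  # -ENOENT: neither table has it
                # Analogue of lazy_create_context_locked(): finalize once,
                # after which the engine set and VM are fixed.
                ctx = {"id": ctx_id, "params": params, "immutable": True}
                self._contexts[ctx_id] = ctx
            return ctx
```

The locked re-check is what makes the xa_erase/proto_context_close sequence in lazy_create_context_locked() safe: a proto-context is finalized at most once, no matter how many callers race on the same id.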


* [PATCH 17/21] drm/i915/gem: Don't allow changing the VM on running contexts
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 267 ------------------
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   2 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 119 --------
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
 4 files changed, 1 insertion(+), 388 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index aa8e61211924f..3238260cffa31 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1536,121 +1536,6 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
-struct context_barrier_task {
-	struct i915_active base;
-	void (*task)(void *data);
-	void *data;
-};
-
-__i915_active_call
-static void cb_retire(struct i915_active *base)
-{
-	struct context_barrier_task *cb = container_of(base, typeof(*cb), base);
-
-	if (cb->task)
-		cb->task(cb->data);
-
-	i915_active_fini(&cb->base);
-	kfree(cb);
-}
-
-I915_SELFTEST_DECLARE(static intel_engine_mask_t context_barrier_inject_fault);
-static int context_barrier_task(struct i915_gem_context *ctx,
-				intel_engine_mask_t engines,
-				bool (*skip)(struct intel_context *ce, void *data),
-				int (*pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void *data),
-				int (*emit)(struct i915_request *rq, void *data),
-				void (*task)(void *data),
-				void *data)
-{
-	struct context_barrier_task *cb;
-	struct i915_gem_engines_iter it;
-	struct i915_gem_engines *e;
-	struct i915_gem_ww_ctx ww;
-	struct intel_context *ce;
-	int err = 0;
-
-	GEM_BUG_ON(!task);
-
-	cb = kmalloc(sizeof(*cb), GFP_KERNEL);
-	if (!cb)
-		return -ENOMEM;
-
-	i915_active_init(&cb->base, NULL, cb_retire);
-	err = i915_active_acquire(&cb->base);
-	if (err) {
-		kfree(cb);
-		return err;
-	}
-
-	e = __context_engines_await(ctx, NULL);
-	if (!e) {
-		i915_active_release(&cb->base);
-		return -ENOENT;
-	}
-
-	for_each_gem_engine(ce, e, it) {
-		struct i915_request *rq;
-
-		if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
-				       ce->engine->mask)) {
-			err = -ENXIO;
-			break;
-		}
-
-		if (!(ce->engine->mask & engines))
-			continue;
-
-		if (skip && skip(ce, data))
-			continue;
-
-		i915_gem_ww_ctx_init(&ww, true);
-retry:
-		err = intel_context_pin_ww(ce, &ww);
-		if (err)
-			goto err;
-
-		if (pin)
-			err = pin(ce, &ww, data);
-		if (err)
-			goto err_unpin;
-
-		rq = i915_request_create(ce);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			goto err_unpin;
-		}
-
-		err = 0;
-		if (emit)
-			err = emit(rq, data);
-		if (err == 0)
-			err = i915_active_add_request(&cb->base, rq);
-
-		i915_request_add(rq);
-err_unpin:
-		intel_context_unpin(ce);
-err:
-		if (err == -EDEADLK) {
-			err = i915_gem_ww_ctx_backoff(&ww);
-			if (!err)
-				goto retry;
-		}
-		i915_gem_ww_ctx_fini(&ww);
-
-		if (err)
-			break;
-	}
-	i915_sw_fence_complete(&e->fence);
-
-	cb->task = err ? NULL : task; /* caller needs to unwind instead */
-	cb->data = data;
-
-	i915_active_release(&cb->base);
-
-	return err;
-}
-
 static int get_ppgtt(struct drm_i915_file_private *file_priv,
 		     struct i915_gem_context *ctx,
 		     struct drm_i915_gem_context_param *args)
@@ -1683,154 +1568,6 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
 	return err;
 }
 
-static void set_ppgtt_barrier(void *data)
-{
-	struct i915_address_space *old = data;
-
-	if (INTEL_GEN(old->i915) < 8)
-		gen6_ppgtt_unpin_all(i915_vm_to_ppgtt(old));
-
-	i915_vm_close(old);
-}
-
-static int pin_ppgtt_update(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void *data)
-{
-	struct i915_address_space *vm = ce->vm;
-
-	if (!HAS_LOGICAL_RING_CONTEXTS(vm->i915))
-		/* ppGTT is not part of the legacy context image */
-		return gen6_ppgtt_pin(i915_vm_to_ppgtt(vm), ww);
-
-	return 0;
-}
-
-static int emit_ppgtt_update(struct i915_request *rq, void *data)
-{
-	struct i915_address_space *vm = rq->context->vm;
-	struct intel_engine_cs *engine = rq->engine;
-	u32 base = engine->mmio_base;
-	u32 *cs;
-	int i;
-
-	if (i915_vm_is_4lvl(vm)) {
-		struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
-		const dma_addr_t pd_daddr = px_dma(ppgtt->pd);
-
-		cs = intel_ring_begin(rq, 6);
-		if (IS_ERR(cs))
-			return PTR_ERR(cs);
-
-		*cs++ = MI_LOAD_REGISTER_IMM(2);
-
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 0));
-		*cs++ = upper_32_bits(pd_daddr);
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(base, 0));
-		*cs++ = lower_32_bits(pd_daddr);
-
-		*cs++ = MI_NOOP;
-		intel_ring_advance(rq, cs);
-	} else if (HAS_LOGICAL_RING_CONTEXTS(engine->i915)) {
-		struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
-		int err;
-
-		/* Magic required to prevent forcewake errors! */
-		err = engine->emit_flush(rq, EMIT_INVALIDATE);
-		if (err)
-			return err;
-
-		cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
-		if (IS_ERR(cs))
-			return PTR_ERR(cs);
-
-		*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;
-		for (i = GEN8_3LVL_PDPES; i--; ) {
-			const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
-
-			*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, i));
-			*cs++ = upper_32_bits(pd_daddr);
-			*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(base, i));
-			*cs++ = lower_32_bits(pd_daddr);
-		}
-		*cs++ = MI_NOOP;
-		intel_ring_advance(rq, cs);
-	}
-
-	return 0;
-}
-
-static bool skip_ppgtt_update(struct intel_context *ce, void *data)
-{
-	if (HAS_LOGICAL_RING_CONTEXTS(ce->engine->i915))
-		return !ce->state;
-	else
-		return !atomic_read(&ce->pin_count);
-}
-
-static int set_ppgtt(struct drm_i915_file_private *file_priv,
-		     struct i915_gem_context *ctx,
-		     struct drm_i915_gem_context_param *args)
-{
-	struct i915_address_space *vm, *old;
-	int err;
-
-	if (args->size)
-		return -EINVAL;
-
-	if (!rcu_access_pointer(ctx->vm))
-		return -ENODEV;
-
-	if (upper_32_bits(args->value))
-		return -ENOENT;
-
-	rcu_read_lock();
-	vm = xa_load(&file_priv->vm_xa, args->value);
-	if (vm && !kref_get_unless_zero(&vm->ref))
-		vm = NULL;
-	rcu_read_unlock();
-	if (!vm)
-		return -ENOENT;
-
-	err = mutex_lock_interruptible(&ctx->mutex);
-	if (err)
-		goto out;
-
-	if (i915_gem_context_is_closed(ctx)) {
-		err = -ENOENT;
-		goto unlock;
-	}
-
-	if (vm == rcu_access_pointer(ctx->vm))
-		goto unlock;
-
-	old = __set_ppgtt(ctx, vm);
-
-	/* Teardown the existing obj:vma cache, it will have to be rebuilt. */
-	lut_close(ctx);
-
-	/*
-	 * We need to flush any requests using the current ppgtt before
-	 * we release it as the requests do not hold a reference themselves,
-	 * only indirectly through the context.
-	 */
-	err = context_barrier_task(ctx, ALL_ENGINES,
-				   skip_ppgtt_update,
-				   pin_ppgtt_update,
-				   emit_ppgtt_update,
-				   set_ppgtt_barrier,
-				   old);
-	if (err) {
-		i915_vm_close(__set_ppgtt(ctx, old));
-		i915_vm_close(old);
-		lut_close(ctx); /* force a rebuild of the old obj:vma cache */
-	}
-
-unlock:
-	mutex_unlock(&ctx->mutex);
-out:
-	i915_vm_put(vm);
-	return err;
-}
-
 int
 i915_gem_user_to_context_sseu(struct intel_gt *gt,
 			      const struct drm_i915_gem_context_param_sseu *user,
@@ -2364,10 +2101,6 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 		ret = set_sseu(ctx, args);
 		break;
 
-	case I915_CONTEXT_PARAM_VM:
-		ret = set_ppgtt(fpriv, ctx, args);
-		break;
-
 	case I915_CONTEXT_PARAM_ENGINES:
 		ret = set_engines(ctx, args);
 		break;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 067ea3030ac91..4aee3667358f0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -153,7 +153,7 @@ struct i915_gem_context {
 	 * In other modes, this is a NULL pointer with the expectation that
 	 * the caller uses the shared global GTT.
 	 */
-	struct i915_address_space __rcu *vm;
+	struct i915_address_space *vm;
 
 	/**
 	 * @pid: process id of creator
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 5fef592390cb5..16ff64ab34a1b 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1882,125 +1882,6 @@ static int igt_vm_isolation(void *arg)
 	return err;
 }
 
-static bool skip_unused_engines(struct intel_context *ce, void *data)
-{
-	return !ce->state;
-}
-
-static void mock_barrier_task(void *data)
-{
-	unsigned int *counter = data;
-
-	++*counter;
-}
-
-static int mock_context_barrier(void *arg)
-{
-#undef pr_fmt
-#define pr_fmt(x) "context_barrier_task():" # x
-	struct drm_i915_private *i915 = arg;
-	struct i915_gem_context *ctx;
-	struct i915_request *rq;
-	unsigned int counter;
-	int err;
-
-	/*
-	 * The context barrier provides us with a callback after it emits
-	 * a request; useful for retiring old state after loading new.
-	 */
-
-	ctx = mock_context(i915, "mock");
-	if (!ctx)
-		return -ENOMEM;
-
-	counter = 0;
-	err = context_barrier_task(ctx, 0, NULL, NULL, NULL,
-				   mock_barrier_task, &counter);
-	if (err) {
-		pr_err("Failed at line %d, err=%d\n", __LINE__, err);
-		goto out;
-	}
-	if (counter == 0) {
-		pr_err("Did not retire immediately with 0 engines\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-	counter = 0;
-	err = context_barrier_task(ctx, ALL_ENGINES, skip_unused_engines,
-				   NULL, NULL, mock_barrier_task, &counter);
-	if (err) {
-		pr_err("Failed at line %d, err=%d\n", __LINE__, err);
-		goto out;
-	}
-	if (counter == 0) {
-		pr_err("Did not retire immediately for all unused engines\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-	rq = igt_request_alloc(ctx, i915->gt.engine[RCS0]);
-	if (IS_ERR(rq)) {
-		pr_err("Request allocation failed!\n");
-		goto out;
-	}
-	i915_request_add(rq);
-
-	counter = 0;
-	context_barrier_inject_fault = BIT(RCS0);
-	err = context_barrier_task(ctx, ALL_ENGINES, NULL, NULL, NULL,
-				   mock_barrier_task, &counter);
-	context_barrier_inject_fault = 0;
-	if (err == -ENXIO)
-		err = 0;
-	else
-		pr_err("Did not hit fault injection!\n");
-	if (counter != 0) {
-		pr_err("Invoked callback on error!\n");
-		err = -EIO;
-	}
-	if (err)
-		goto out;
-
-	counter = 0;
-	err = context_barrier_task(ctx, ALL_ENGINES, skip_unused_engines,
-				   NULL, NULL, mock_barrier_task, &counter);
-	if (err) {
-		pr_err("Failed at line %d, err=%d\n", __LINE__, err);
-		goto out;
-	}
-	mock_device_flush(i915);
-	if (counter == 0) {
-		pr_err("Did not retire on each active engines\n");
-		err = -EINVAL;
-		goto out;
-	}
-
-out:
-	mock_context_close(ctx);
-	return err;
-#undef pr_fmt
-#define pr_fmt(x) x
-}
-
-int i915_gem_context_mock_selftests(void)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(mock_context_barrier),
-	};
-	struct drm_i915_private *i915;
-	int err;
-
-	i915 = mock_gem_device();
-	if (!i915)
-		return -ENOMEM;
-
-	err = i915_subtests(tests, i915);
-
-	mock_destroy_device(i915);
-	return err;
-}
-
 int i915_gem_context_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
index 3db34d3eea58a..52aa91716dc1f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
@@ -32,6 +32,5 @@ selftest(vma, i915_vma_mock_selftests)
 selftest(evict, i915_gem_evict_mock_selftests)
 selftest(gtt, i915_gem_gtt_mock_selftests)
 selftest(hugepages, i915_gem_huge_page_mock_selftests)
-selftest(contexts, i915_gem_context_mock_selftests)
 selftest(buddy, i915_buddy_mock_selftests)
 selftest(memory_region, intel_memory_region_mock_selftests)
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
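[Editorial aside: with set_ppgtt() and the context-barrier machinery deleted above, I915_CONTEXT_PARAM_VM falls through to -EINVAL in ctx_setparam(), so the VM can only be set on the proto-context before first use. A toy model of the resulting state machine (illustrative Python, not driver code; names are hypothetical):]

```python
class ProtoContext:
    """Toy model of the lifecycle this patch enforces: the VM may be set
    on the proto-context, but becomes immutable once the context is
    finalized (i.e. once anything could have executed on it)."""

    def __init__(self):
        self.vm = None
        self.finalized = False

    def set_vm(self, vm):
        if self.finalized:
            # Post-finalization, SETPARAM(VM) now hits the default
            # -EINVAL branch instead of the removed set_ppgtt() path.
            raise ValueError("EINVAL: VM is immutable on a running context")
        self.vm = vm

    def finalize(self):
        # Analogue of i915_gem_create_context() consuming the proto-context.
        self.finalized = True
        return self
```

This is why the series can drop context_barrier_task() entirely: there is no longer any in-flight work that could observe a ppGTT swap, so no barrier is needed.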

^ permalink raw reply related	[flat|nested] 226+ messages in thread

* [PATCH 18/21] drm/i915/gem: Don't allow changing the engine set on running contexts
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 301 --------------------
 1 file changed, 301 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 3238260cffa31..ef23ab4260c24 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1722,303 +1722,6 @@ static int set_sseu(struct i915_gem_context *ctx,
 	return ret;
 }
 
-struct set_engines {
-	struct i915_gem_context *ctx;
-	struct i915_gem_engines *engines;
-};
-
-static int
-set_engines__load_balance(struct i915_user_extension __user *base, void *data)
-{
-	struct i915_context_engines_load_balance __user *ext =
-		container_of_user(base, typeof(*ext), base);
-	const struct set_engines *set = data;
-	struct drm_i915_private *i915 = set->ctx->i915;
-	struct intel_engine_cs *stack[16];
-	struct intel_engine_cs **siblings;
-	struct intel_context *ce;
-	u16 num_siblings, idx;
-	unsigned int n;
-	int err;
-
-	if (!HAS_EXECLISTS(i915))
-		return -ENODEV;
-
-	if (intel_uc_uses_guc_submission(&i915->gt.uc))
-		return -ENODEV; /* not implement yet */
-
-	if (get_user(idx, &ext->engine_index))
-		return -EFAULT;
-
-	if (idx >= set->engines->num_engines) {
-		drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
-			idx, set->engines->num_engines);
-		return -EINVAL;
-	}
-
-	idx = array_index_nospec(idx, set->engines->num_engines);
-	if (set->engines->engines[idx]) {
-		drm_dbg(&i915->drm,
-			"Invalid placement[%d], already occupied\n", idx);
-		return -EEXIST;
-	}
-
-	if (get_user(num_siblings, &ext->num_siblings))
-		return -EFAULT;
-
-	err = check_user_mbz(&ext->flags);
-	if (err)
-		return err;
-
-	err = check_user_mbz(&ext->mbz64);
-	if (err)
-		return err;
-
-	siblings = stack;
-	if (num_siblings > ARRAY_SIZE(stack)) {
-		siblings = kmalloc_array(num_siblings,
-					 sizeof(*siblings),
-					 GFP_KERNEL);
-		if (!siblings)
-			return -ENOMEM;
-	}
-
-	for (n = 0; n < num_siblings; n++) {
-		struct i915_engine_class_instance ci;
-
-		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
-			err = -EFAULT;
-			goto out_siblings;
-		}
-
-		siblings[n] = intel_engine_lookup_user(i915,
-						       ci.engine_class,
-						       ci.engine_instance);
-		if (!siblings[n]) {
-			drm_dbg(&i915->drm,
-				"Invalid sibling[%d]: { class:%d, inst:%d }\n",
-				n, ci.engine_class, ci.engine_instance);
-			err = -EINVAL;
-			goto out_siblings;
-		}
-	}
-
-	ce = intel_execlists_create_virtual(siblings, n);
-	if (IS_ERR(ce)) {
-		err = PTR_ERR(ce);
-		goto out_siblings;
-	}
-
-	intel_context_set_gem(ce, set->ctx);
-
-	if (cmpxchg(&set->engines->engines[idx], NULL, ce)) {
-		intel_context_put(ce);
-		err = -EEXIST;
-		goto out_siblings;
-	}
-
-out_siblings:
-	if (siblings != stack)
-		kfree(siblings);
-
-	return err;
-}
-
-static int
-set_engines__bond(struct i915_user_extension __user *base, void *data)
-{
-	struct i915_context_engines_bond __user *ext =
-		container_of_user(base, typeof(*ext), base);
-	const struct set_engines *set = data;
-	struct drm_i915_private *i915 = set->ctx->i915;
-	struct i915_engine_class_instance ci;
-	struct intel_engine_cs *virtual;
-	struct intel_engine_cs *master;
-	u16 idx, num_bonds;
-	int err, n;
-
-	if (get_user(idx, &ext->virtual_index))
-		return -EFAULT;
-
-	if (idx >= set->engines->num_engines) {
-		drm_dbg(&i915->drm,
-			"Invalid index for virtual engine: %d >= %d\n",
-			idx, set->engines->num_engines);
-		return -EINVAL;
-	}
-
-	idx = array_index_nospec(idx, set->engines->num_engines);
-	if (!set->engines->engines[idx]) {
-		drm_dbg(&i915->drm, "Invalid engine at %d\n", idx);
-		return -EINVAL;
-	}
-	virtual = set->engines->engines[idx]->engine;
-
-	if (intel_engine_is_virtual(virtual)) {
-		drm_dbg(&i915->drm,
-			"Bonding with virtual engines not allowed\n");
-		return -EINVAL;
-	}
-
-	err = check_user_mbz(&ext->flags);
-	if (err)
-		return err;
-
-	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
-		err = check_user_mbz(&ext->mbz64[n]);
-		if (err)
-			return err;
-	}
-
-	if (copy_from_user(&ci, &ext->master, sizeof(ci)))
-		return -EFAULT;
-
-	master = intel_engine_lookup_user(i915,
-					  ci.engine_class, ci.engine_instance);
-	if (!master) {
-		drm_dbg(&i915->drm,
-			"Unrecognised master engine: { class:%u, instance:%u }\n",
-			ci.engine_class, ci.engine_instance);
-		return -EINVAL;
-	}
-
-	if (get_user(num_bonds, &ext->num_bonds))
-		return -EFAULT;
-
-	for (n = 0; n < num_bonds; n++) {
-		struct intel_engine_cs *bond;
-
-		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci)))
-			return -EFAULT;
-
-		bond = intel_engine_lookup_user(i915,
-						ci.engine_class,
-						ci.engine_instance);
-		if (!bond) {
-			drm_dbg(&i915->drm,
-				"Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n",
-				n, ci.engine_class, ci.engine_instance);
-			return -EINVAL;
-		}
-	}
-
-	return 0;
-}
-
-static const i915_user_extension_fn set_engines__extensions[] = {
-	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_engines__load_balance,
-	[I915_CONTEXT_ENGINES_EXT_BOND] = set_engines__bond,
-};
-
-static int
-set_engines(struct i915_gem_context *ctx,
-	    const struct drm_i915_gem_context_param *args)
-{
-	struct drm_i915_private *i915 = ctx->i915;
-	struct i915_context_param_engines __user *user =
-		u64_to_user_ptr(args->value);
-	struct set_engines set = { .ctx = ctx };
-	unsigned int num_engines, n;
-	u64 extensions;
-	int err;
-
-	if (!args->size) { /* switch back to legacy user_ring_map */
-		if (!i915_gem_context_user_engines(ctx))
-			return 0;
-
-		set.engines = default_engines(ctx);
-		if (IS_ERR(set.engines))
-			return PTR_ERR(set.engines);
-
-		goto replace;
-	}
-
-	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
-	if (args->size < sizeof(*user) ||
-	    !IS_ALIGNED(args->size, sizeof(*user->engines))) {
-		drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
-			args->size);
-		return -EINVAL;
-	}
-
-	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
-	if (num_engines > I915_EXEC_RING_MASK + 1)
-		return -EINVAL;
-
-	set.engines = alloc_engines(num_engines);
-	if (!set.engines)
-		return -ENOMEM;
-
-	for (n = 0; n < num_engines; n++) {
-		struct i915_engine_class_instance ci;
-		struct intel_engine_cs *engine;
-		struct intel_context *ce;
-
-		if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
-			__free_engines(set.engines, n);
-			return -EFAULT;
-		}
-
-		if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
-		    ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE) {
-			set.engines->engines[n] = NULL;
-			continue;
-		}
-
-		engine = intel_engine_lookup_user(ctx->i915,
-						  ci.engine_class,
-						  ci.engine_instance);
-		if (!engine) {
-			drm_dbg(&i915->drm,
-				"Invalid engine[%d]: { class:%d, instance:%d }\n",
-				n, ci.engine_class, ci.engine_instance);
-			__free_engines(set.engines, n);
-			return -ENOENT;
-		}
-
-		ce = intel_context_create(engine);
-		if (IS_ERR(ce)) {
-			__free_engines(set.engines, n);
-			return PTR_ERR(ce);
-		}
-
-		intel_context_set_gem(ce, ctx);
-
-		set.engines->engines[n] = ce;
-	}
-	set.engines->num_engines = num_engines;
-
-	err = -EFAULT;
-	if (!get_user(extensions, &user->extensions))
-		err = i915_user_extensions(u64_to_user_ptr(extensions),
-					   set_engines__extensions,
-					   ARRAY_SIZE(set_engines__extensions),
-					   &set);
-	if (err) {
-		free_engines(set.engines);
-		return err;
-	}
-
-replace:
-	mutex_lock(&ctx->engines_mutex);
-	if (i915_gem_context_is_closed(ctx)) {
-		mutex_unlock(&ctx->engines_mutex);
-		free_engines(set.engines);
-		return -ENOENT;
-	}
-	if (args->size)
-		i915_gem_context_set_user_engines(ctx);
-	else
-		i915_gem_context_clear_user_engines(ctx);
-	set.engines = rcu_replace_pointer(ctx->engines, set.engines, 1);
-	mutex_unlock(&ctx->engines_mutex);
-
-	/* Keep track of old engine sets for kill_context() */
-	engines_idle_release(ctx, set.engines);
-
-	return 0;
-}
-
 static int
 set_persistence(struct i915_gem_context *ctx,
 		const struct drm_i915_gem_context_param *args)
@@ -2101,10 +1804,6 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 		ret = set_sseu(ctx, args);
 		break;
 
-	case I915_CONTEXT_PARAM_ENGINES:
-		ret = set_engines(ctx, args);
-		break;
-
 	case I915_CONTEXT_PARAM_PERSISTENCE:
 		ret = set_persistence(ctx, args);
 		break;
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 226+ messages in thread

* [PATCH 19/21] drm/i915/selftests: Take a VM in kernel_context()
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 .../drm/i915/gem/selftests/i915_gem_context.c |  4 ++--
 .../gpu/drm/i915/gem/selftests/mock_context.c |  8 +++++++-
 .../gpu/drm/i915/gem/selftests/mock_context.h |  4 +++-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +++++++++----------
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
 5 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 16ff64ab34a1b..76029d7143f6c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -680,7 +680,7 @@ static int igt_ctx_exec(void *arg)
 			struct i915_gem_context *ctx;
 			struct intel_context *ce;
 
-			ctx = kernel_context(i915);
+			ctx = kernel_context(i915, NULL);
 			if (IS_ERR(ctx)) {
 				err = PTR_ERR(ctx);
 				goto out_file;
@@ -813,7 +813,7 @@ static int igt_shared_ctx_exec(void *arg)
 			struct i915_gem_context *ctx;
 			struct intel_context *ce;
 
-			ctx = kernel_context(i915);
+			ctx = kernel_context(i915, NULL);
 			if (IS_ERR(ctx)) {
 				err = PTR_ERR(ctx);
 				goto out_test;
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index 32cf2103828f9..e4aced7eabb72 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -148,7 +148,8 @@ live_context_for_engine(struct intel_engine_cs *engine, struct file *file)
 }
 
 struct i915_gem_context *
-kernel_context(struct drm_i915_private *i915)
+kernel_context(struct drm_i915_private *i915,
+	       struct i915_address_space *vm)
 {
 	struct i915_gem_context *ctx;
 	struct i915_gem_proto_context *pc;
@@ -157,6 +158,11 @@ kernel_context(struct drm_i915_private *i915)
 	if (IS_ERR(pc))
 		return ERR_CAST(pc);
 
+	if (vm) {
+		i915_vm_put(pc->vm);
+		pc->vm = i915_vm_get(vm);
+	}
+
 	ctx = i915_gem_create_context(i915, pc);
 	proto_context_close(pc);
 	if (IS_ERR(ctx))
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.h b/drivers/gpu/drm/i915/gem/selftests/mock_context.h
index 2a6121d33352d..7a02fd9b5866a 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.h
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.h
@@ -10,6 +10,7 @@
 struct file;
 struct drm_i915_private;
 struct intel_engine_cs;
+struct i915_address_space;
 
 void mock_init_contexts(struct drm_i915_private *i915);
 
@@ -25,7 +26,8 @@ live_context(struct drm_i915_private *i915, struct file *file);
 struct i915_gem_context *
 live_context_for_engine(struct intel_engine_cs *engine, struct file *file);
 
-struct i915_gem_context *kernel_context(struct drm_i915_private *i915);
+struct i915_gem_context *kernel_context(struct drm_i915_private *i915,
+					struct i915_address_space *vm);
 void kernel_context_close(struct i915_gem_context *ctx);
 
 #endif /* !__MOCK_CONTEXT_H */
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index f03446d587160..0bb35c29ea193 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -1522,12 +1522,12 @@ static int live_busywait_preempt(void *arg)
 	 * preempt the busywaits used to synchronise between rings.
 	 */
 
-	ctx_hi = kernel_context(gt->i915);
+	ctx_hi = kernel_context(gt->i915, NULL);
 	if (!ctx_hi)
 		return -ENOMEM;
 	ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
 
-	ctx_lo = kernel_context(gt->i915);
+	ctx_lo = kernel_context(gt->i915, NULL);
 	if (!ctx_lo)
 		goto err_ctx_hi;
 	ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -1724,12 +1724,12 @@ static int live_preempt(void *arg)
 	if (igt_spinner_init(&spin_lo, gt))
 		goto err_spin_hi;
 
-	ctx_hi = kernel_context(gt->i915);
+	ctx_hi = kernel_context(gt->i915, NULL);
 	if (!ctx_hi)
 		goto err_spin_lo;
 	ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
 
-	ctx_lo = kernel_context(gt->i915);
+	ctx_lo = kernel_context(gt->i915, NULL);
 	if (!ctx_lo)
 		goto err_ctx_hi;
 	ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -1816,11 +1816,11 @@ static int live_late_preempt(void *arg)
 	if (igt_spinner_init(&spin_lo, gt))
 		goto err_spin_hi;
 
-	ctx_hi = kernel_context(gt->i915);
+	ctx_hi = kernel_context(gt->i915, NULL);
 	if (!ctx_hi)
 		goto err_spin_lo;
 
-	ctx_lo = kernel_context(gt->i915);
+	ctx_lo = kernel_context(gt->i915, NULL);
 	if (!ctx_lo)
 		goto err_ctx_hi;
 
@@ -1910,7 +1910,7 @@ struct preempt_client {
 
 static int preempt_client_init(struct intel_gt *gt, struct preempt_client *c)
 {
-	c->ctx = kernel_context(gt->i915);
+	c->ctx = kernel_context(gt->i915, NULL);
 	if (!c->ctx)
 		return -ENOMEM;
 
@@ -3367,12 +3367,12 @@ static int live_preempt_timeout(void *arg)
 	if (igt_spinner_init(&spin_lo, gt))
 		return -ENOMEM;
 
-	ctx_hi = kernel_context(gt->i915);
+	ctx_hi = kernel_context(gt->i915, NULL);
 	if (!ctx_hi)
 		goto err_spin_lo;
 	ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
 
-	ctx_lo = kernel_context(gt->i915);
+	ctx_lo = kernel_context(gt->i915, NULL);
 	if (!ctx_lo)
 		goto err_ctx_hi;
 	ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -3659,7 +3659,7 @@ static int live_preempt_smoke(void *arg)
 	}
 
 	for (n = 0; n < smoke.ncontext; n++) {
-		smoke.contexts[n] = kernel_context(smoke.gt->i915);
+		smoke.contexts[n] = kernel_context(smoke.gt->i915, NULL);
 		if (!smoke.contexts[n])
 			goto err_ctx;
 	}
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 746985971c3a6..3676eaf6b2aee 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -42,7 +42,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
 	memset(h, 0, sizeof(*h));
 	h->gt = gt;
 
-	h->ctx = kernel_context(gt->i915);
+	h->ctx = kernel_context(gt->i915, NULL);
 	if (IS_ERR(h->ctx))
 		return PTR_ERR(h->ctx);
 
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [PATCH 20/21] i915/gem/selftests: Assign the VM at context creation in igt_shared_ctx_exec
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 76029d7143f6c..76dd5cfe11b3c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -813,16 +813,12 @@ static int igt_shared_ctx_exec(void *arg)
 			struct i915_gem_context *ctx;
 			struct intel_context *ce;
 
-			ctx = kernel_context(i915, NULL);
+			ctx = kernel_context(i915, ctx_vm(parent));
 			if (IS_ERR(ctx)) {
 				err = PTR_ERR(ctx);
 				goto out_test;
 			}
 
-			mutex_lock(&ctx->mutex);
-			__assign_ppgtt(ctx, ctx_vm(parent));
-			mutex_unlock(&ctx->mutex);
-
 			ce = i915_gem_context_get_engine(ctx, engine->legacy_idx);
 			GEM_BUG_ON(IS_ERR(ce));
 
-- 
2.31.1



* [PATCH 21/21] drm/i915/gem: Roll all of context creation together
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-23 22:31   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-23 22:31 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Now that we have the whole engine set and VM at context creation time,
we can just assign those fields instead of creating first and handling
the VM and engines later.  This lets us avoid creating useless VMs and
engine sets and lets us get rid of the complex VM setting code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 159 ++++++------------
 .../gpu/drm/i915/gem/selftests/mock_context.c |  33 ++--
 2 files changed, 64 insertions(+), 128 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index ef23ab4260c24..829730d402e8a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1201,56 +1201,6 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
 	return 0;
 }
 
-static struct i915_gem_context *
-__create_context(struct drm_i915_private *i915,
-		 const struct i915_gem_proto_context *pc)
-{
-	struct i915_gem_context *ctx;
-	struct i915_gem_engines *e;
-	int err;
-	int i;
-
-	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
-	if (!ctx)
-		return ERR_PTR(-ENOMEM);
-
-	kref_init(&ctx->ref);
-	ctx->i915 = i915;
-	ctx->sched = pc->sched;
-	mutex_init(&ctx->mutex);
-	INIT_LIST_HEAD(&ctx->link);
-
-	spin_lock_init(&ctx->stale.lock);
-	INIT_LIST_HEAD(&ctx->stale.engines);
-
-	mutex_init(&ctx->engines_mutex);
-	e = default_engines(ctx);
-	if (IS_ERR(e)) {
-		err = PTR_ERR(e);
-		goto err_free;
-	}
-	RCU_INIT_POINTER(ctx->engines, e);
-
-	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
-	mutex_init(&ctx->lut_mutex);
-
-	/* NB: Mark all slices as needing a remap so that when the context first
-	 * loads it will restore whatever remap state already exists. If there
-	 * is no remap info, it will be a NOP. */
-	ctx->remap_slice = ALL_L3_SLICES(i915);
-
-	ctx->user_flags = pc->user_flags;
-
-	for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
-		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
-
-	return ctx;
-
-err_free:
-	kfree(ctx);
-	return ERR_PTR(err);
-}
-
 static inline struct i915_gem_engines *
 __context_engines_await(const struct i915_gem_context *ctx,
 			bool *user_engines)
@@ -1294,86 +1244,77 @@ context_apply_all(struct i915_gem_context *ctx,
 	i915_sw_fence_complete(&e->fence);
 }
 
-static void __apply_ppgtt(struct intel_context *ce, void *vm)
-{
-	i915_vm_put(ce->vm);
-	ce->vm = i915_vm_get(vm);
-}
-
-static struct i915_address_space *
-__set_ppgtt(struct i915_gem_context *ctx, struct i915_address_space *vm)
-{
-	struct i915_address_space *old;
-
-	old = rcu_replace_pointer(ctx->vm,
-				  i915_vm_open(vm),
-				  lockdep_is_held(&ctx->mutex));
-	GEM_BUG_ON(old && i915_vm_is_4lvl(vm) != i915_vm_is_4lvl(old));
-
-	context_apply_all(ctx, __apply_ppgtt, vm);
-
-	return old;
-}
-
-static void __assign_ppgtt(struct i915_gem_context *ctx,
-			   struct i915_address_space *vm)
-{
-	if (vm == rcu_access_pointer(ctx->vm))
-		return;
-
-	vm = __set_ppgtt(ctx, vm);
-	if (vm)
-		i915_vm_close(vm);
-}
-
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *i915,
 			const struct i915_gem_proto_context *pc)
 {
 	struct i915_gem_context *ctx;
-	int ret;
+	struct i915_gem_engines *e;
+	int err;
+	int i;
 
-	ctx = __create_context(i915, pc);
-	if (IS_ERR(ctx))
-		return ctx;
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return ERR_PTR(-ENOMEM);
 
-	if (pc->vm) {
-		mutex_lock(&ctx->mutex);
-		__assign_ppgtt(ctx, pc->vm);
-		mutex_unlock(&ctx->mutex);
-	}
+	kref_init(&ctx->ref);
+	ctx->i915 = i915;
+	ctx->sched = pc->sched;
+	mutex_init(&ctx->mutex);
+	INIT_LIST_HEAD(&ctx->link);
 
-	if (pc->num_user_engines >= 0) {
-		struct i915_gem_engines *engines;
+	spin_lock_init(&ctx->stale.lock);
+	INIT_LIST_HEAD(&ctx->stale.engines);
 
-		engines = user_engines(ctx, pc->num_user_engines,
-				       pc->user_engines);
-		if (IS_ERR(engines)) {
-			context_close(ctx);
-			return ERR_CAST(engines);
-		}
+	if (pc->vm)
+		RCU_INIT_POINTER(ctx->vm, i915_vm_open(pc->vm));
 
-		mutex_lock(&ctx->engines_mutex);
+	mutex_init(&ctx->engines_mutex);
+	if (pc->num_user_engines >= 0) {
 		i915_gem_context_set_user_engines(ctx);
-		engines = rcu_replace_pointer(ctx->engines, engines, 1);
-		mutex_unlock(&ctx->engines_mutex);
-
-		free_engines(engines);
+		e = user_engines(ctx, pc->num_user_engines, pc->user_engines);
+	} else {
+		i915_gem_context_clear_user_engines(ctx);
+		e = default_engines(ctx);
+	}
+	if (IS_ERR(e)) {
+		err = PTR_ERR(e);
+		goto err_vm;
 	}
+	RCU_INIT_POINTER(ctx->engines, e);
+
+	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
+	mutex_init(&ctx->lut_mutex);
+
+	/* NB: Mark all slices as needing a remap so that when the context first
+	 * loads it will restore whatever remap state already exists. If there
+	 * is no remap info, it will be a NOP. */
+	ctx->remap_slice = ALL_L3_SLICES(i915);
+
+	ctx->user_flags = pc->user_flags;
+
+	for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
+		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
 
 	if (pc->single_timeline) {
-		ret = drm_syncobj_create(&ctx->syncobj,
+		err = drm_syncobj_create(&ctx->syncobj,
 					 DRM_SYNCOBJ_CREATE_SIGNALED,
 					 NULL);
-		if (ret) {
-			context_close(ctx);
-			return ERR_PTR(ret);
-		}
+		if (err)
+			goto err_engines;
 	}
 
 	trace_i915_context_create(ctx);
 
 	return ctx;
+
+err_engines:
+	free_engines(e);
+err_vm:
+	if (ctx->vm)
+		i915_vm_close(ctx->vm);
+	kfree(ctx);
+	return ERR_PTR(err);
 }
 
 static void init_contexts(struct i915_gem_contexts *gc)
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index e4aced7eabb72..5ee7e9bb6175d 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -30,15 +30,6 @@ mock_context(struct drm_i915_private *i915,
 
 	i915_gem_context_set_persistence(ctx);
 
-	mutex_init(&ctx->engines_mutex);
-	e = default_engines(ctx);
-	if (IS_ERR(e))
-		goto err_free;
-	RCU_INIT_POINTER(ctx->engines, e);
-
-	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
-	mutex_init(&ctx->lut_mutex);
-
 	if (name) {
 		struct i915_ppgtt *ppgtt;
 
@@ -46,25 +37,29 @@ mock_context(struct drm_i915_private *i915,
 
 		ppgtt = mock_ppgtt(i915, name);
 		if (!ppgtt)
-			goto err_put;
-
-		mutex_lock(&ctx->mutex);
-		__set_ppgtt(ctx, &ppgtt->vm);
-		mutex_unlock(&ctx->mutex);
+			goto err_free;
 
+		ctx->vm = i915_vm_open(&ppgtt->vm);
 		i915_vm_put(&ppgtt->vm);
 	}
 
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx);
+	if (IS_ERR(e))
+		goto err_vm;
+	RCU_INIT_POINTER(ctx->engines, e);
+
+	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
+	mutex_init(&ctx->lut_mutex);
+
 	return ctx;
 
+err_vm:
+	if (ctx->vm)
+		i915_vm_close(ctx->vm);
 err_free:
 	kfree(ctx);
 	return NULL;
-
-err_put:
-	i915_gem_context_set_closed(ctx);
-	i915_gem_context_put(ctx);
-	return NULL;
 }
 
 void mock_context_close(struct i915_gem_context *ctx)
-- 
2.31.1



* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gem: ioctl clean-ups
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
                   ` (21 preceding siblings ...)
  (?)
@ 2021-04-23 22:49 ` Patchwork
  -1 siblings, 0 replies; 226+ messages in thread
From: Patchwork @ 2021-04-23 22:49 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gem: ioctl clean-ups
URL   : https://patchwork.freedesktop.org/series/89443/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
468456983a83 drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE
-:176: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#176: 
deleted file mode 100644

total: 0 errors, 1 warnings, 0 checks, 159 lines checked
79a91e982ff7 drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP
31e3478abfe0 drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
-:25: WARNING:LINE_SPACING: Missing a blank line after declarations
#25: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:239:
+		unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
+		intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);

total: 0 errors, 1 warnings, 0 checks, 83 lines checked
53f030af52d4 drm/i915/gem: Return void from context_apply_all
4b9a8e315c1c drm/i915: Drop the CONTEXT_CLONE API
ebe52e477467 drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
65796fb10e50 drm/i915: Drop getparam support for I915_CONTEXT_PARAM_ENGINES
d3ad59ed0a22 drm/i915/gem: Disallow bonding of virtual engines
21cb51520e4b drm/i915/gem: Disallow creating contexts with too many engines
b6ef4a4c6f47 drm/i915/request: Remove the hook from await_execution
ab3620b9adb5 drm/i915: Stop manually RCU banging in reset_stats_ioctl
fa138a73374c drm/i915/gem: Add a separate validate_priority helper
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 56 lines checked
c50f4dd9ee3f drm/i915/gem: Add an intermediate proto_context struct
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 268 lines checked
70451a1734a2 drm/i915/gem: Return an error ptr from context_lookup
-:59: WARNING:LIKELY_MISUSE: nested (un)?likely() calls, IS_ERR already uses unlikely() internally
#59: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:742:
+	if (unlikely(IS_ERR(ctx)))

total: 0 errors, 1 warnings, 0 checks, 60 lines checked
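The LIKELY_MISUSE warning exists because the kernel's IS_ERR() already wraps its test in unlikely(), so `if (unlikely(IS_ERR(ctx)))` hints the branch predictor twice; the fix is plain `if (IS_ERR(ctx))`. A minimal userspace re-implementation of those helpers, mirroring the shape of the kernel's err.h for illustration (not the kernel source itself):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* The unlikely() hint is baked in here via __builtin_expect, which is
 * why callers should not wrap IS_ERR() in unlikely() a second time. */
#define IS_ERR_VALUE(x) \
	__builtin_expect((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO, 0)

/* Encode a negative errno value as a pointer in the top 4095 bytes
 * of the address space, which no valid allocation can occupy. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline int IS_ERR(const void *ptr)
{
	return IS_ERR_VALUE((unsigned long)ptr);
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}
```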
d0f063d4604e drm/i915/gt: Drop i915_address_space::file
6fce8cf246ee drm/i915/gem: Delay context creation
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

-:106: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#106: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:358:
+	unsigned num_engines;

-:287: ERROR:CODE_INDENT: code indent should use tabs where possible
#287: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:539:
+^I^I^I         struct i915_gem_proto_context *pc,$

-:287: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#287: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:539:
+static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv,
+			         struct i915_gem_proto_context *pc,

-:288: ERROR:CODE_INDENT: code indent should use tabs where possible
#288: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:540:
+^I^I^I         const struct drm_i915_gem_context_param *args)$

-:412: WARNING:ENOTSUPP: ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP
#412: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:664:
+		ret = -ENOTSUPP;

-:807: WARNING:ENOTSUPP: ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP
#807: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:2719:
+	if (ret == -ENOTSUPP) {

-:925: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#925: FILE: drivers/gpu/drm/i915/i915_drv.h:203:
+	struct mutex proto_context_lock;

total: 2 errors, 4 warnings, 2 checks, 901 lines checked
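The ENOTSUPP warnings above come from the fact that ENOTSUPP (524) is a kernel-internal value with no userspace definition, so if it leaks through an ioctl, libc cannot name it, while EOPNOTSUPP (95) is the SUSv4 code userspace understands. A small userspace sketch of the visible difference — `KERNEL_ENOTSUPP` is a hand-copied constant for illustration, since userspace headers deliberately do not export it:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Kernel-internal ENOTSUPP value, copied here only for illustration;
 * it is intentionally absent from userspace errno.h. */
#define KERNEL_ENOTSUPP 524

/* Return the userspace-visible description for an errno value.
 * Kernel-internal codes come back as "Unknown error NNN", which is
 * exactly why checkpatch steers drivers toward EOPNOTSUPP. */
static const char *errname(int err)
{
	return strerror(err);
}
```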
40c7b0f86bc0 drm/i915/gem: Don't allow changing the VM on running contexts
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 424 lines checked
9bd5fd34f557 drm/i915/gem: Don't allow changing the engine set on running contexts
-:8: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 313 lines checked
1694fd69838b drm/i915/selftests: Take a VM in kernel_context()
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 131 lines checked
e1c0a99c4bc6 i915/gem/selftests: Assign the VM at context creation in igt_shared_ctx_exec
-:8: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 17 lines checked
0c4562693de5 drm/i915/gem: Roll all of context creation together
-:176: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
#176: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:1291:
+	 * is no remap info, it will be a NOP. */

total: 0 errors, 1 warnings, 0 checks, 246 lines checked
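The BLOCK_COMMENT_STYLE warning flagged above is about kernel block comments ending with the closing marker on its own line rather than trailing the last text line. A hypothetical sketch of the corrected shape (`remap_l3()` is an illustrative stand-in, not the driver's function):

```c
#include <assert.h>

/*
 * Kernel style: multi-line block comments close with the marker on
 * a line of its own, as this comment does.
 */
static int remap_l3(int has_remap_info)
{
	/*
	 * If there is no remap info, this is a no-op.  Note the
	 * closing marker below sits on its own line, which is what
	 * checkpatch was asking for.
	 */
	return has_remap_info ? 1 : 0;
}
```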


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/gem: ioctl clean-ups
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
                   ` (22 preceding siblings ...)
  (?)
@ 2021-04-23 22:51 ` Patchwork
  -1 siblings, 0 replies; 226+ messages in thread
From: Patchwork @ 2021-04-23 22:51 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gem: ioctl clean-ups
URL   : https://patchwork.freedesktop.org/series/89443/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1270:17: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1270:17:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1270:17:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1488:14: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1488:14:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1488:14:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1811:25: warning: symbol 'lazy_create_context_locked' was not declared. Should it be static?
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2014:21: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2014:21:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2014:21:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2015:39: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2015:39:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2015:39:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:698:9: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:698:9:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:698:9:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:707:22: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:707:22:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:707:22:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:726:27: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:726:27:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:726:27:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:742:13: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.c:742:13:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.c:742:13:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.h:154:16: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.h:154:16:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.h:154:16:    struct i915_address_space [noderef] __rcu *
+./drivers/gpu/drm/i915/gem/i915_gem_context.h:163:14: error: incompatible types in comparison expression (different address spaces):
+./drivers/gpu/drm/i915/gem/i915_gem_context.h:163:14:    struct i915_address_space *
+./drivers/gpu/drm/i915/gem/i915_gem_context.h:163:14:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_context.h:163:14: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_context.h:163:14:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_context.h:163:14:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:746:13: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:746:13:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:746:13:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:772:49: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:772:49:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:772:49:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:33:16: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:33:16:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:33:16:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:704:33: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:704:33:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:704:33:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:838:33: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:838:33:    struct i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c:838:33:    struct i915_address_space [noderef] __rcu *
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1329:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1203:24: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/i915/gvt/mmio.c:295:23: warning: memcpy with byte count of 279040
+drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y
+drivers/gpu/drm/i915/selftests/i915_vma.c:42:24: error: incompatible types in comparison expression (different address spaces):
+drivers/gpu/drm/i915/selftests/i915_vma.c:42:24:    struct i915_address_space *
+drivers/gpu/drm/i915/selftests/i915_vma.c:42:24:    struct i915_address_space [noderef] __rcu *
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block




* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gem: ioctl clean-ups
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
                   ` (23 preceding siblings ...)
  (?)
@ 2021-04-23 23:16 ` Patchwork
  -1 siblings, 0 replies; 226+ messages in thread
From: Patchwork @ 2021-04-23 23:16 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx



== Series Details ==

Series: drm/i915/gem: ioctl clean-ups
URL   : https://patchwork.freedesktop.org/series/89443/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10005 -> Patchwork_19984
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/index.html

Known issues
------------

  Here are the changes found in Patchwork_19984 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
    - fi-snb-2600:        NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/fi-snb-2600/igt@amdgpu/amd_cs_nop@sync-fork-compute0.html

  * igt@gem_exec_suspend@basic-s0:
    - fi-tgl-u2:          [PASS][2] -> [FAIL][3] ([i915#1888])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@hangcheck:
    - fi-snb-2600:        [INCOMPLETE][4] ([i915#2782]) -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/fi-snb-2600/igt@i915_selftest@live@hangcheck.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/fi-snb-2600/igt@i915_selftest@live@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#3012]: https://gitlab.freedesktop.org/drm/intel/issues/3012
  [i915#3276]: https://gitlab.freedesktop.org/drm/intel/issues/3276
  [i915#3277]: https://gitlab.freedesktop.org/drm/intel/issues/3277
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3283]: https://gitlab.freedesktop.org/drm/intel/issues/3283
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533


Participating hosts (43 -> 40)
------------------------------

  Additional (1): fi-rkl-11500t 
  Missing    (4): fi-ilk-m540 fi-bsw-cyan fi-bdw-samus fi-hsw-4200u 


Build changes
-------------

  * IGT: IGT_6074 -> IGTPW_5761
  * Linux: CI_DRM_10005 -> Patchwork_19984

  CI-20190529: 20190529
  CI_DRM_10005: 7a27cb7ac19a95d801c391044cea5274677e7744 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_5761: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_5761/index.html
  IGT_6074: 3f43ae9fd22dc5a517786b984dc3aa717997664f @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_19984: 0c4562693de593aacfbe5dd6b27e69bd89403c15 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

0c4562693de5 drm/i915/gem: Roll all of context creation together
e1c0a99c4bc6 i915/gem/selftests: Assign the VM at context creation in igt_shared_ctx_exec
1694fd69838b drm/i915/selftests: Take a VM in kernel_context()
9bd5fd34f557 drm/i915/gem: Don't allow changing the engine set on running contexts
40c7b0f86bc0 drm/i915/gem: Don't allow changing the VM on running contexts
6fce8cf246ee drm/i915/gem: Delay context creation
d0f063d4604e drm/i915/gt: Drop i915_address_space::file
70451a1734a2 drm/i915/gem: Return an error ptr from context_lookup
c50f4dd9ee3f drm/i915/gem: Add an intermediate proto_context struct
fa138a73374c drm/i915/gem: Add a separate validate_priority helper
ab3620b9adb5 drm/i915: Stop manually RCU banging in reset_stats_ioctl
b6ef4a4c6f47 drm/i915/request: Remove the hook from await_execution
21cb51520e4b drm/i915/gem: Disallow creating contexts with too many engines
d3ad59ed0a22 drm/i915/gem: Disallow bonding of virtual engines
65796fb10e50 drm/i915: Drop getparam support for I915_CONTEXT_PARAM_ENGINES
ebe52e477467 drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
4b9a8e315c1c drm/i915: Drop the CONTEXT_CLONE API
53f030af52d4 drm/i915/gem: Return void from context_apply_all
31e3478abfe0 drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
79a91e982ff7 drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP
468456983a83 drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/index.html



* [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/gem: ioctl clean-ups
  2021-04-23 22:31 ` [Intel-gfx] " Jason Ekstrand
                   ` (24 preceding siblings ...)
  (?)
@ 2021-04-24  2:14 ` Patchwork
  -1 siblings, 0 replies; 226+ messages in thread
From: Patchwork @ 2021-04-24  2:14 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx



== Series Details ==

Series: drm/i915/gem: ioctl clean-ups
URL   : https://patchwork.freedesktop.org/series/89443/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10005_full -> Patchwork_19984_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_19984_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_19984_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_19984_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_balancer@bonded-slice:
    - shard-kbl:          [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-kbl1/igt@gem_exec_balancer@bonded-slice.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-kbl4/igt@gem_exec_balancer@bonded-slice.html
    - shard-tglb:         [PASS][3] -> [FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-tglb3/igt@gem_exec_balancer@bonded-slice.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb3/igt@gem_exec_balancer@bonded-slice.html

  * igt@gem_exec_schedule@u-submit-golden-slice@rcs0:
    - shard-skl:          [PASS][5] -> [INCOMPLETE][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl7/igt@gem_exec_schedule@u-submit-golden-slice@rcs0.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl2/igt@gem_exec_schedule@u-submit-golden-slice@rcs0.html

  
#### Warnings ####

  * igt@runner@aborted:
    - shard-skl:          ([FAIL][7], [FAIL][8], [FAIL][9]) ([i915#1436] / [i915#2369] / [i915#3002]) -> ([FAIL][10], [FAIL][11], [FAIL][12], [FAIL][13]) ([i915#1814] / [i915#2029] / [i915#3002])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl2/igt@runner@aborted.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl1/igt@runner@aborted.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl10/igt@runner@aborted.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl2/igt@runner@aborted.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl8/igt@runner@aborted.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl2/igt@runner@aborted.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl2/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_19984_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
    - shard-iclb:         NOTRUN -> [DMESG-WARN][14] ([i915#3002])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb6/igt@gem_create@create-massive.html
    - shard-snb:          NOTRUN -> [DMESG-WARN][15] ([i915#3002])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-snb7/igt@gem_create@create-massive.html
    - shard-kbl:          NOTRUN -> [DMESG-WARN][16] ([i915#3002])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-kbl1/igt@gem_create@create-massive.html
    - shard-tglb:         NOTRUN -> [DMESG-WARN][17] ([i915#3002])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb2/igt@gem_create@create-massive.html
    - shard-glk:          NOTRUN -> [DMESG-WARN][18] ([i915#3002])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk1/igt@gem_create@create-massive.html
    - shard-apl:          NOTRUN -> [DMESG-WARN][19] ([i915#3002])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl2/igt@gem_create@create-massive.html

  * igt@gem_ctx_persistence@legacy-engines-queued:
    - shard-snb:          NOTRUN -> [SKIP][20] ([fdo#109271] / [i915#1099]) +6 similar issues
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-snb5/igt@gem_ctx_persistence@legacy-engines-queued.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-skl:          NOTRUN -> [FAIL][21] ([i915#2846])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl1/igt@gem_exec_fair@basic-deadline.html
    - shard-glk:          [PASS][22] -> [FAIL][23] ([i915#2846])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk1/igt@gem_exec_fair@basic-deadline.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk2/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-tglb:         [PASS][24] -> [FAIL][25] ([i915#2842]) +2 similar issues
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-tglb5/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb2/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs0:
    - shard-iclb:         [PASS][26] -> [FAIL][27] ([i915#2842])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-iclb7/igt@gem_exec_fair@basic-pace@vcs0.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb5/igt@gem_exec_fair@basic-pace@vcs0.html
    - shard-glk:          [PASS][28] -> [FAIL][29] ([i915#2842]) +1 similar issue
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk5/igt@gem_exec_fair@basic-pace@vcs0.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk8/igt@gem_exec_fair@basic-pace@vcs0.html

  * igt@gem_exec_whisper@basic-queues-forked-all:
    - shard-glk:          [PASS][30] -> [DMESG-WARN][31] ([i915#118] / [i915#95]) +1 similar issue
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk6/igt@gem_exec_whisper@basic-queues-forked-all.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk1/igt@gem_exec_whisper@basic-queues-forked-all.html

  * igt@gem_huc_copy@huc-copy:
    - shard-tglb:         [PASS][32] -> [SKIP][33] ([i915#2190])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-tglb5/igt@gem_huc_copy@huc-copy.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb6/igt@gem_huc_copy@huc-copy.html
    - shard-apl:          NOTRUN -> [SKIP][34] ([fdo#109271] / [i915#2190])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl7/igt@gem_huc_copy@huc-copy.html

  * igt@gem_pwrite@basic-exhaustion:
    - shard-snb:          NOTRUN -> [WARN][35] ([i915#2658]) +1 similar issue
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-snb2/igt@gem_pwrite@basic-exhaustion.html
    - shard-apl:          NOTRUN -> [WARN][36] ([i915#2658])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl6/igt@gem_pwrite@basic-exhaustion.html

  * igt@gem_render_copy@y-tiled-mc-ccs-to-y-tiled-ccs:
    - shard-iclb:         NOTRUN -> [SKIP][37] ([i915#768]) +1 similar issue
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb1/igt@gem_render_copy@y-tiled-mc-ccs-to-y-tiled-ccs.html

  * igt@gem_userptr_blits@unsync-overlap:
    - shard-tglb:         NOTRUN -> [SKIP][38] ([i915#3297])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb6/igt@gem_userptr_blits@unsync-overlap.html
    - shard-iclb:         NOTRUN -> [SKIP][39] ([i915#3297])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb7/igt@gem_userptr_blits@unsync-overlap.html

  * igt@gen3_render_tiledy_blits:
    - shard-tglb:         NOTRUN -> [SKIP][40] ([fdo#109289])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb6/igt@gen3_render_tiledy_blits.html
    - shard-iclb:         NOTRUN -> [SKIP][41] ([fdo#109289])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb7/igt@gen3_render_tiledy_blits.html

  * igt@gen9_exec_parse@batch-invalid-length:
    - shard-snb:          NOTRUN -> [SKIP][42] ([fdo#109271]) +349 similar issues
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-snb7/igt@gen9_exec_parse@batch-invalid-length.html

  * igt@gen9_exec_parse@secure-batches:
    - shard-iclb:         NOTRUN -> [SKIP][43] ([fdo#112306])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb8/igt@gen9_exec_parse@secure-batches.html
    - shard-tglb:         NOTRUN -> [SKIP][44] ([fdo#112306])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb8/igt@gen9_exec_parse@secure-batches.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-skl:          NOTRUN -> [FAIL][45] ([i915#454]) +1 similar issue
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl6/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_suspend@forcewake:
    - shard-skl:          [PASS][46] -> [INCOMPLETE][47] ([i915#636])
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl2/igt@i915_suspend@forcewake.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl6/igt@i915_suspend@forcewake.html

  * igt@kms_async_flips@alternate-sync-async-flip:
    - shard-skl:          [PASS][48] -> [FAIL][49] ([i915#2521])
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl4/igt@kms_async_flips@alternate-sync-async-flip.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl4/igt@kms_async_flips@alternate-sync-async-flip.html
    - shard-tglb:         [PASS][50] -> [FAIL][51] ([i915#2521])
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-tglb6/igt@kms_async_flips@alternate-sync-async-flip.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb7/igt@kms_async_flips@alternate-sync-async-flip.html

  * igt@kms_big_fb@linear-8bpp-rotate-270:
    - shard-tglb:         NOTRUN -> [SKIP][52] ([fdo#111614]) +1 similar issue
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb2/igt@kms_big_fb@linear-8bpp-rotate-270.html
    - shard-iclb:         NOTRUN -> [SKIP][53] ([fdo#110725] / [fdo#111614]) +1 similar issue
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb6/igt@kms_big_fb@linear-8bpp-rotate-270.html

  * igt@kms_big_fb@yf-tiled-32bpp-rotate-90:
    - shard-tglb:         NOTRUN -> [SKIP][54] ([fdo#111615])
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb6/igt@kms_big_fb@yf-tiled-32bpp-rotate-90.html

  * igt@kms_ccs@pipe-c-crc-primary-basic:
    - shard-skl:          NOTRUN -> [SKIP][55] ([fdo#109271] / [fdo#111304])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl10/igt@kms_ccs@pipe-c-crc-primary-basic.html

  * igt@kms_chamelium@hdmi-edid-change-during-suspend:
    - shard-apl:          NOTRUN -> [SKIP][56] ([fdo#109271] / [fdo#111827]) +19 similar issues
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl1/igt@kms_chamelium@hdmi-edid-change-during-suspend.html

  * igt@kms_chamelium@vga-hpd:
    - shard-skl:          NOTRUN -> [SKIP][57] ([fdo#109271] / [fdo#111827]) +12 similar issues
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl1/igt@kms_chamelium@vga-hpd.html

  * igt@kms_color_chamelium@pipe-a-ctm-blue-to-red:
    - shard-snb:          NOTRUN -> [SKIP][58] ([fdo#109271] / [fdo#111827]) +22 similar issues
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-snb7/igt@kms_color_chamelium@pipe-a-ctm-blue-to-red.html

  * igt@kms_color_chamelium@pipe-a-ctm-limited-range:
    - shard-glk:          NOTRUN -> [SKIP][59] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk3/igt@kms_color_chamelium@pipe-a-ctm-limited-range.html
    - shard-iclb:         NOTRUN -> [SKIP][60] ([fdo#109284] / [fdo#111827]) +1 similar issue
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb4/igt@kms_color_chamelium@pipe-a-ctm-limited-range.html

  * igt@kms_color_chamelium@pipe-d-ctm-max:
    - shard-tglb:         NOTRUN -> [SKIP][61] ([fdo#109284] / [fdo#111827]) +2 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb6/igt@kms_color_chamelium@pipe-d-ctm-max.html
    - shard-kbl:          NOTRUN -> [SKIP][62] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-kbl1/igt@kms_color_chamelium@pipe-d-ctm-max.html
    - shard-iclb:         NOTRUN -> [SKIP][63] ([fdo#109278] / [fdo#109284] / [fdo#111827])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb8/igt@kms_color_chamelium@pipe-d-ctm-max.html

  * igt@kms_content_protection@srm:
    - shard-apl:          NOTRUN -> [TIMEOUT][64] ([i915#1319])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl8/igt@kms_content_protection@srm.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x256-random:
    - shard-skl:          [PASS][65] -> [FAIL][66] ([i915#54])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl7/igt@kms_cursor_crc@pipe-a-cursor-256x256-random.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl8/igt@kms_cursor_crc@pipe-a-cursor-256x256-random.html

  * igt@kms_cursor_crc@pipe-a-cursor-512x170-sliding:
    - shard-tglb:         NOTRUN -> [SKIP][67] ([fdo#109279] / [i915#3359])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb7/igt@kms_cursor_crc@pipe-a-cursor-512x170-sliding.html
    - shard-iclb:         NOTRUN -> [SKIP][68] ([fdo#109278] / [fdo#109279])
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb5/igt@kms_cursor_crc@pipe-a-cursor-512x170-sliding.html

  * igt@kms_cursor_crc@pipe-b-cursor-32x10-random:
    - shard-kbl:          NOTRUN -> [SKIP][69] ([fdo#109271]) +33 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-kbl7/igt@kms_cursor_crc@pipe-b-cursor-32x10-random.html
    - shard-tglb:         NOTRUN -> [SKIP][70] ([i915#3359]) +3 similar issues
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb2/igt@kms_cursor_crc@pipe-b-cursor-32x10-random.html

  * igt@kms_cursor_crc@pipe-b-cursor-32x32-onscreen:
    - shard-skl:          NOTRUN -> [SKIP][71] ([fdo#109271]) +110 similar issues
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl9/igt@kms_cursor_crc@pipe-b-cursor-32x32-onscreen.html

  * igt@kms_cursor_crc@pipe-c-cursor-suspend:
    - shard-kbl:          [PASS][72] -> [DMESG-WARN][73] ([i915#180])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-kbl1/igt@kms_cursor_crc@pipe-c-cursor-suspend.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-kbl2/igt@kms_cursor_crc@pipe-c-cursor-suspend.html

  * igt@kms_cursor_edge_walk@pipe-a-64x64-right-edge:
    - shard-skl:          [PASS][74] -> [DMESG-WARN][75] ([i915#1982]) +2 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl6/igt@kms_cursor_edge_walk@pipe-a-64x64-right-edge.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl6/igt@kms_cursor_edge_walk@pipe-a-64x64-right-edge.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-skl:          NOTRUN -> [FAIL][76] ([i915#2346] / [i915#533])
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_flip@2x-busy-flip:
    - shard-tglb:         NOTRUN -> [SKIP][77] ([fdo#111825]) +5 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb5/igt@kms_flip@2x-busy-flip.html

  * igt@kms_flip@2x-plain-flip-fb-recreate-interruptible:
    - shard-iclb:         NOTRUN -> [SKIP][78] ([fdo#109274]) +1 similar issue
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb8/igt@kms_flip@2x-plain-flip-fb-recreate-interruptible.html

  * igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1:
    - shard-glk:          [PASS][79] -> [FAIL][80] ([i915#79])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk4/igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk7/igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs:
    - shard-apl:          NOTRUN -> [SKIP][81] ([fdo#109271] / [i915#2672]) +1 similar issue
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl3/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-onoff:
    - shard-glk:          [PASS][82] -> [FAIL][83] ([i915#49])
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk8/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-onoff.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk8/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-onoff.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-move:
    - shard-iclb:         NOTRUN -> [SKIP][84] ([fdo#109280]) +3 similar issues
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-move.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-fullscreen:
    - shard-skl:          [PASS][85] -> [FAIL][86] ([i915#49]) +1 similar issue
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl1/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-fullscreen.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl7/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-fullscreen.html

  * igt@kms_hdr@bpc-switch:
    - shard-skl:          [PASS][87] -> [FAIL][88] ([i915#1188])
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl4/igt@kms_hdr@bpc-switch.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl10/igt@kms_hdr@bpc-switch.html

  * igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d:
    - shard-apl:          NOTRUN -> [SKIP][89] ([fdo#109271] / [i915#533]) +1 similar issue
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl2/igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb:
    - shard-apl:          NOTRUN -> [FAIL][90] ([fdo#108145] / [i915#265]) +3 similar issues
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl6/igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb:
    - shard-skl:          NOTRUN -> [FAIL][91] ([i915#265])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl10/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html
    - shard-apl:          NOTRUN -> [FAIL][92] ([i915#265])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl7/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html
    - shard-glk:          NOTRUN -> [FAIL][93] ([i915#265])
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk6/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html
    - shard-kbl:          NOTRUN -> [FAIL][94] ([i915#265])
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-kbl4/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html

  * igt@kms_plane_alpha_blend@pipe-d-constant-alpha-max:
    - shard-iclb:         NOTRUN -> [SKIP][95] ([fdo#109278]) +8 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb8/igt@kms_plane_alpha_blend@pipe-d-constant-alpha-max.html

  * igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping:
    - shard-apl:          NOTRUN -> [SKIP][96] ([fdo#109271] / [i915#2733])
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl1/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4:
    - shard-apl:          NOTRUN -> [SKIP][97] ([fdo#109271] / [i915#658]) +4 similar issues
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl8/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4.html

  * igt@kms_psr2_su@frontbuffer:
    - shard-skl:          NOTRUN -> [SKIP][98] ([fdo#109271] / [i915#658]) +2 similar issues
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl9/igt@kms_psr2_su@frontbuffer.html

  * igt@kms_psr@primary_page_flip:
    - shard-skl:          [PASS][99] -> [SKIP][100] ([fdo#109271]) +11 similar issues
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl1/igt@kms_psr@primary_page_flip.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl4/igt@kms_psr@primary_page_flip.html

  * igt@kms_psr@psr2_primary_render:
    - shard-iclb:         NOTRUN -> [SKIP][101] ([fdo#109441])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb4/igt@kms_psr@psr2_primary_render.html

  * igt@kms_psr@psr2_sprite_mmap_cpu:
    - shard-iclb:         [PASS][102] -> [SKIP][103] ([fdo#109441])
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_cpu.html
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb5/igt@kms_psr@psr2_sprite_mmap_cpu.html

  * igt@kms_setmode@basic:
    - shard-snb:          NOTRUN -> [FAIL][104] ([i915#31])
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-snb7/igt@kms_setmode@basic.html

  * igt@kms_sysfs_edid_timing:
    - shard-apl:          NOTRUN -> [FAIL][105] ([IGT#2])
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl1/igt@kms_sysfs_edid_timing.html
    - shard-skl:          NOTRUN -> [FAIL][106] ([IGT#2])
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl10/igt@kms_sysfs_edid_timing.html

  * igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend:
    - shard-skl:          [PASS][107] -> [INCOMPLETE][108] ([i915#198])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl8/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl1/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html

  * igt@kms_vblank@pipe-d-ts-continuation-idle:
    - shard-apl:          NOTRUN -> [SKIP][109] ([fdo#109271]) +197 similar issues
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl6/igt@kms_vblank@pipe-d-ts-continuation-idle.html

  * igt@prime_nv_api@i915_self_import:
    - shard-glk:          NOTRUN -> [SKIP][110] ([fdo#109271]) +30 similar issues
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk5/igt@prime_nv_api@i915_self_import.html
    - shard-tglb:         NOTRUN -> [SKIP][111] ([fdo#109291]) +1 similar issue
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb8/igt@prime_nv_api@i915_self_import.html
    - shard-iclb:         NOTRUN -> [SKIP][112] ([fdo#109291]) +1 similar issue
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb8/igt@prime_nv_api@i915_self_import.html

  * igt@sysfs_clients@create:
    - shard-apl:          NOTRUN -> [SKIP][113] ([fdo#109271] / [i915#2994]) +3 similar issues
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl6/igt@sysfs_clients@create.html

  * igt@sysfs_clients@fair-3:
    - shard-skl:          NOTRUN -> [SKIP][114] ([fdo#109271] / [i915#2994]) +1 similar issue
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl7/igt@sysfs_clients@fair-3.html

  
#### Possible fixes ####

  * igt@feature_discovery@psr2:
    - shard-iclb:         [SKIP][115] ([i915#658]) -> [PASS][116]
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-iclb7/igt@feature_discovery@psr2.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb2/igt@feature_discovery@psr2.html

  * igt@gem_exec_capture@pi@vecs0:
    - shard-skl:          [INCOMPLETE][117] ([i915#198] / [i915#2369] / [i915#2624]) -> [PASS][118]
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl2/igt@gem_exec_capture@pi@vecs0.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl4/igt@gem_exec_capture@pi@vecs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
    - shard-apl:          [SKIP][119] ([fdo#109271]) -> [PASS][120]
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-apl2/igt@gem_exec_fair@basic-none-share@rcs0.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl7/igt@gem_exec_fair@basic-none-share@rcs0.html

  * igt@gem_exec_fair@basic-pace@bcs0:
    - shard-tglb:         [FAIL][121] ([i915#2842]) -> [PASS][122] +1 similar issue
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-tglb7/igt@gem_exec_fair@basic-pace@bcs0.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-tglb7/igt@gem_exec_fair@basic-pace@bcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-glk:          [FAIL][123] ([i915#2842]) -> [PASS][124] +1 similar issue
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk9/igt@gem_exec_fair@basic-throttle@rcs0.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk2/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_whisper@basic-fds:
    - shard-glk:          [DMESG-WARN][125] ([i915#118] / [i915#95]) -> [PASS][126] +1 similar issue
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk1/igt@gem_exec_whisper@basic-fds.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk5/igt@gem_exec_whisper@basic-fds.html

  * igt@gem_mmap_gtt@cpuset-big-copy:
    - shard-iclb:         [FAIL][127] ([i915#307]) -> [PASS][128]
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-iclb6/igt@gem_mmap_gtt@cpuset-big-copy.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb3/igt@gem_mmap_gtt@cpuset-big-copy.html

  * igt@gem_softpin@noreloc-s3:
    - shard-apl:          [DMESG-WARN][129] ([i915#180]) -> [PASS][130]
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-apl3/igt@gem_softpin@noreloc-s3.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-apl1/igt@gem_softpin@noreloc-s3.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-skl:          [DMESG-WARN][131] ([i915#1436] / [i915#716]) -> [PASS][132]
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl10/igt@gen9_exec_parse@allowed-single.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl8/igt@gen9_exec_parse@allowed-single.html

  * igt@kms_atomic@plane-primary-legacy:
    - shard-snb:          [SKIP][133] ([fdo#109271]) -> [PASS][134]
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-snb2/igt@kms_atomic@plane-primary-legacy.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-snb5/igt@kms_atomic@plane-primary-legacy.html

  * igt@kms_big_fb@y-tiled-16bpp-rotate-0:
    - shard-skl:          [SKIP][135] ([fdo#109271]) -> [PASS][136] +16 similar issues
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl2/igt@kms_big_fb@y-tiled-16bpp-rotate-0.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl9/igt@kms_big_fb@y-tiled-16bpp-rotate-0.html

  * igt@kms_ccs@pipe-a-random-ccs-data:
    - shard-iclb:         [DMESG-WARN][137] ([i915#3219]) -> [PASS][138]
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-iclb1/igt@kms_ccs@pipe-a-random-ccs-data.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb2/igt@kms_ccs@pipe-a-random-ccs-data.html

  * igt@kms_cursor_edge_walk@pipe-a-64x64-right-edge:
    - shard-glk:          [DMESG-FAIL][139] ([i915#118] / [i915#70] / [i915#95]) -> [PASS][140]
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk7/igt@kms_cursor_edge_walk@pipe-a-64x64-right-edge.html
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk3/igt@kms_cursor_edge_walk@pipe-a-64x64-right-edge.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-pri-indfb-draw-mmap-wc:
    - shard-glk:          [FAIL][141] ([i915#49]) -> [PASS][142]
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-glk9/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-pri-indfb-draw-mmap-wc.html
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-glk3/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-pri-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@psr-shrfb-scaledprimary:
    - shard-iclb:         [SKIP][143] ([i915#668]) -> [PASS][144]
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-iclb1/igt@kms_frontbuffer_tracking@psr-shrfb-scaledprimary.html
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-iclb6/igt@kms_frontbuffer_tracking@psr-shrfb-scaledprimary.html

  * igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-c:
    - shard-skl:          [FAIL][145] ([i915#1036]) -> [PASS][146]
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10005/shard-skl2/igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-c.html
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/shard-skl10/igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-c.html

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19984/index.html

[-- Attachment #1.2: Type: text/html, Size: 33447 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
  (?)
@ 2021-04-24  3:21     ` kernel test robot
  -1 siblings, 0 replies; 226+ messages in thread
From: kernel test robot @ 2021-04-24  3:21 UTC (permalink / raw)
  To: Jason Ekstrand, intel-gfx, dri-devel; +Cc: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2501 bytes --]

Hi Jason,

Thank you for the patch! There is still something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip drm-exynos/exynos-drm-next next-20210423]
[cannot apply to tegra-drm/drm/tegra/for-next drm/drm-next v5.12-rc8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-a001-20210423 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511
        git checkout e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_context.c:2439:1: error: no previous prototype for 'lazy_create_context_locked' [-Werror=missing-prototypes]
    2439 | lazy_create_context_locked(struct drm_i915_file_private *file_priv,
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors


vim +/lazy_create_context_locked +2439 drivers/gpu/drm/i915/gem/i915_gem_context.c

  2437	
  2438	struct i915_gem_context *
> 2439	lazy_create_context_locked(struct drm_i915_file_private *file_priv,
  2440				   struct i915_gem_proto_context *pc, u32 id)
  2441	{
  2442		struct i915_gem_context *ctx;
  2443		void *old;
  2444	
  2445		ctx = i915_gem_create_context(file_priv->dev_priv, pc);
  2446		if (IS_ERR(ctx))
  2447			return ctx;
  2448	
  2449		gem_context_register(ctx, file_priv, id);
  2450	
  2451		old = xa_erase(&file_priv->proto_context_xa, id);
  2452		GEM_BUG_ON(old != pc);
  2453		proto_context_close(pc);
  2454	
  2455		/* One for the xarray and one for the caller */
  2456		return i915_gem_context_get(ctx);
  2457	}
  2458	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35211 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
  (?)
@ 2021-04-24  3:24     ` kernel test robot
  -1 siblings, 0 replies; 226+ messages in thread
From: kernel test robot @ 2021-04-24  3:24 UTC (permalink / raw)
  To: Jason Ekstrand, intel-gfx, dri-devel; +Cc: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2446 bytes --]

Hi Jason,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip drm-exynos/exynos-drm-next next-20210423]
[cannot apply to tegra-drm/drm/tegra/for-next drm/drm-next v5.12-rc8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-rhel-8.3 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jason-Ekstrand/drm-i915-gem-ioctl-clean-ups/20210424-063511
        git checkout e00622bd8a3f3eccbb22721c2f8857bdfb7d5d9d
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_context.c:2439:1: warning: no previous prototype for 'lazy_create_context_locked' [-Wmissing-prototypes]
    2439 | lazy_create_context_locked(struct drm_i915_file_private *file_priv,
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~


vim +/lazy_create_context_locked +2439 drivers/gpu/drm/i915/gem/i915_gem_context.c

  2437	
  2438	struct i915_gem_context *
> 2439	lazy_create_context_locked(struct drm_i915_file_private *file_priv,
  2440				   struct i915_gem_proto_context *pc, u32 id)
  2441	{
  2442		struct i915_gem_context *ctx;
  2443		void *old;
  2444	
  2445		ctx = i915_gem_create_context(file_priv->dev_priv, pc);
  2446		if (IS_ERR(ctx))
  2447			return ctx;
  2448	
  2449		gem_context_register(ctx, file_priv, id);
  2450	
  2451		old = xa_erase(&file_priv->proto_context_xa, id);
  2452		GEM_BUG_ON(old != pc);
  2453		proto_context_close(pc);
  2454	
  2455		/* One for the xarray and one for the caller */
  2456		return i915_gem_context_get(ctx);
  2457	}
  2458	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 41163 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* [PATCH 08/20] drm/i915/gem: Disallow bonding of virtual engines (v2)
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-26 23:43     ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-26 23:43 UTC (permalink / raw)
  To: dri-devel, intel-gfx; +Cc: Jason Ekstrand

Engine bonding adds a bunch of complexity which the media driver has never
actually used.  The media driver does technically bond a balanced engine
to another engine, but that balanced engine only has one engine in its
sibling set, which doesn't actually result in a virtual engine.

Unless some userspace badly wants it, there's no good reason to support
this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
leave the validation code in place in case we ever decide we want to do
something interesting with the bonding information.

v2 (Jason Ekstrand):
 - Don't delete quite as much code.  Some of it was necessary.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
 .../drm/i915/gt/intel_execlists_submission.c  |  83 -------
 .../drm/i915/gt/intel_execlists_submission.h  |   4 -
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
 4 files changed, 6 insertions(+), 328 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e8179918fa306..5f8d0faf783aa 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
 	}
 	virtual = set->engines->engines[idx]->engine;
 
+	if (intel_engine_is_virtual(virtual)) {
+		drm_dbg(&i915->drm,
+			"Bonding with virtual engines not allowed\n");
+		return -EINVAL;
+	}
+
 	err = check_user_mbz(&ext->flags);
 	if (err)
 		return err;
@@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
 				n, ci.engine_class, ci.engine_instance);
 			return -EINVAL;
 		}
-
-		/*
-		 * A non-virtual engine has no siblings to choose between; and
-		 * a submit fence will always be directed to the one engine.
-		 */
-		if (intel_engine_is_virtual(virtual)) {
-			err = intel_virtual_engine_attach_bond(virtual,
-							       master,
-							       bond);
-			if (err)
-				return err;
-		}
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index de124870af44d..a6204c60b59cb 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -181,18 +181,6 @@ struct virtual_engine {
 		int prio;
 	} nodes[I915_NUM_ENGINES];
 
-	/*
-	 * Keep track of bonded pairs -- restrictions upon on our selection
-	 * of physical engines any particular request may be submitted to.
-	 * If we receive a submit-fence from a master engine, we will only
-	 * use one of sibling_mask physical engines.
-	 */
-	struct ve_bond {
-		const struct intel_engine_cs *master;
-		intel_engine_mask_t sibling_mask;
-	} *bonds;
-	unsigned int num_bonds;
-
 	/* And finally, which physical engines this virtual engine maps onto. */
 	unsigned int num_siblings;
 	struct intel_engine_cs *siblings[];
@@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
 	intel_breadcrumbs_free(ve->base.breadcrumbs);
 	intel_engine_free_request_pool(&ve->base);
 
-	kfree(ve->bonds);
 	kfree(ve);
 }
 
@@ -3560,33 +3547,13 @@ static void virtual_submit_request(struct i915_request *rq)
 	spin_unlock_irqrestore(&ve->base.active.lock, flags);
 }
 
-static struct ve_bond *
-virtual_find_bond(struct virtual_engine *ve,
-		  const struct intel_engine_cs *master)
-{
-	int i;
-
-	for (i = 0; i < ve->num_bonds; i++) {
-		if (ve->bonds[i].master == master)
-			return &ve->bonds[i];
-	}
-
-	return NULL;
-}
-
 static void
 virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 {
-	struct virtual_engine *ve = to_virtual_engine(rq->engine);
 	intel_engine_mask_t allowed, exec;
-	struct ve_bond *bond;
 
 	allowed = ~to_request(signal)->engine->mask;
 
-	bond = virtual_find_bond(ve, to_request(signal)->engine);
-	if (bond)
-		allowed &= bond->sibling_mask;
-
 	/* Restrict the bonded request to run on only the available engines */
 	exec = READ_ONCE(rq->execution_mask);
 	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
@@ -3747,59 +3714,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
 	if (IS_ERR(dst))
 		return dst;
 
-	if (se->num_bonds) {
-		struct virtual_engine *de = to_virtual_engine(dst->engine);
-
-		de->bonds = kmemdup(se->bonds,
-				    sizeof(*se->bonds) * se->num_bonds,
-				    GFP_KERNEL);
-		if (!de->bonds) {
-			intel_context_put(dst);
-			return ERR_PTR(-ENOMEM);
-		}
-
-		de->num_bonds = se->num_bonds;
-	}
-
 	return dst;
 }
 
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
-				     const struct intel_engine_cs *master,
-				     const struct intel_engine_cs *sibling)
-{
-	struct virtual_engine *ve = to_virtual_engine(engine);
-	struct ve_bond *bond;
-	int n;
-
-	/* Sanity check the sibling is part of the virtual engine */
-	for (n = 0; n < ve->num_siblings; n++)
-		if (sibling == ve->siblings[n])
-			break;
-	if (n == ve->num_siblings)
-		return -EINVAL;
-
-	bond = virtual_find_bond(ve, master);
-	if (bond) {
-		bond->sibling_mask |= sibling->mask;
-		return 0;
-	}
-
-	bond = krealloc(ve->bonds,
-			sizeof(*bond) * (ve->num_bonds + 1),
-			GFP_KERNEL);
-	if (!bond)
-		return -ENOMEM;
-
-	bond[ve->num_bonds].master = master;
-	bond[ve->num_bonds].sibling_mask = sibling->mask;
-
-	ve->bonds = bond;
-	ve->num_bonds++;
-
-	return 0;
-}
-
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
index fd61dae820e9e..80cec37a56ba9 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
@@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
 struct intel_context *
 intel_execlists_clone_virtual(struct intel_engine_cs *src);
 
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
-				     const struct intel_engine_cs *master,
-				     const struct intel_engine_cs *sibling);
-
 bool
 intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 1081cd36a2bd3..f03446d587160 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
 	return 0;
 }
 
-static int bond_virtual_engine(struct intel_gt *gt,
-			       unsigned int class,
-			       struct intel_engine_cs **siblings,
-			       unsigned int nsibling,
-			       unsigned int flags)
-#define BOND_SCHEDULE BIT(0)
-{
-	struct intel_engine_cs *master;
-	struct i915_request *rq[16];
-	enum intel_engine_id id;
-	struct igt_spinner spin;
-	unsigned long n;
-	int err;
-
-	/*
-	 * A set of bonded requests is intended to be run concurrently
-	 * across a number of engines. We use one request per-engine
-	 * and a magic fence to schedule each of the bonded requests
-	 * at the same time. A consequence of our current scheduler is that
-	 * we only move requests to the HW ready queue when the request
-	 * becomes ready, that is when all of its prerequisite fences have
-	 * been signaled. As one of those fences is the master submit fence,
-	 * there is a delay on all secondary fences as the HW may be
-	 * currently busy. Equally, as all the requests are independent,
-	 * they may have other fences that delay individual request
-	 * submission to HW. Ergo, we do not guarantee that all requests are
-	 * immediately submitted to HW at the same time, just that if the
-	 * rules are abided by, they are ready at the same time as the
-	 * first is submitted. Userspace can embed semaphores in its batch
-	 * to ensure parallel execution of its phases as it requires.
-	 * Though naturally it gets requested that perhaps the scheduler should
-	 * take care of parallel execution, even across preemption events on
-	 * different HW. (The proper answer is of course "lalalala".)
-	 *
-	 * With the submit-fence, we have identified three possible phases
-	 * of synchronisation depending on the master fence: queued (not
-	 * ready), executing, and signaled. The first two are quite simple
-	 * and checked below. However, the signaled master fence handling is
-	 * contentious. Currently we do not distinguish between a signaled
-	 * fence and an expired fence, as once signaled it does not convey
-	 * any information about the previous execution. It may even be freed
-	 * and hence checking later it may not exist at all. Ergo we currently
-	 * do not apply the bonding constraint for an already signaled fence,
-	 * as our expectation is that it should not constrain the secondaries
-	 * and is outside of the scope of the bonded request API (i.e. all
-	 * userspace requests are meant to be running in parallel). As
-	 * it imposes no constraint, and is effectively a no-op, we do not
-	 * check below as normal execution flows are checked extensively above.
-	 *
-	 * XXX Is the degenerate handling of signaled submit fences the
-	 * expected behaviour for userpace?
-	 */
-
-	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
-
-	if (igt_spinner_init(&spin, gt))
-		return -ENOMEM;
-
-	err = 0;
-	rq[0] = ERR_PTR(-ENOMEM);
-	for_each_engine(master, gt, id) {
-		struct i915_sw_fence fence = {};
-		struct intel_context *ce;
-
-		if (master->class == class)
-			continue;
-
-		ce = intel_context_create(master);
-		if (IS_ERR(ce)) {
-			err = PTR_ERR(ce);
-			goto out;
-		}
-
-		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
-
-		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
-		intel_context_put(ce);
-		if (IS_ERR(rq[0])) {
-			err = PTR_ERR(rq[0]);
-			goto out;
-		}
-		i915_request_get(rq[0]);
-
-		if (flags & BOND_SCHEDULE) {
-			onstack_fence_init(&fence);
-			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
-							       &fence,
-							       GFP_KERNEL);
-		}
-
-		i915_request_add(rq[0]);
-		if (err < 0)
-			goto out;
-
-		if (!(flags & BOND_SCHEDULE) &&
-		    !igt_wait_for_spinner(&spin, rq[0])) {
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			struct intel_context *ve;
-
-			ve = intel_execlists_create_virtual(siblings, nsibling);
-			if (IS_ERR(ve)) {
-				err = PTR_ERR(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_virtual_engine_attach_bond(ve->engine,
-							       master,
-							       siblings[n]);
-			if (err) {
-				intel_context_put(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_context_pin(ve);
-			intel_context_put(ve);
-			if (err) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			rq[n + 1] = i915_request_create(ve);
-			intel_context_unpin(ve);
-			if (IS_ERR(rq[n + 1])) {
-				err = PTR_ERR(rq[n + 1]);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-			i915_request_get(rq[n + 1]);
-
-			err = i915_request_await_execution(rq[n + 1],
-							   &rq[0]->fence,
-							   ve->engine->bond_execute);
-			i915_request_add(rq[n + 1]);
-			if (err < 0) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-		}
-		onstack_fence_fini(&fence);
-		intel_engine_flush_submission(master);
-		igt_spinner_end(&spin);
-
-		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
-			pr_err("Master request did not execute (on %s)!\n",
-			       rq[0]->engine->name);
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			if (i915_request_wait(rq[n + 1], 0,
-					      MAX_SCHEDULE_TIMEOUT) < 0) {
-				err = -EIO;
-				goto out;
-			}
-
-			if (rq[n + 1]->engine != siblings[n]) {
-				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
-				       siblings[n]->name,
-				       rq[n + 1]->engine->name,
-				       rq[0]->engine->name);
-				err = -EINVAL;
-				goto out;
-			}
-		}
-
-		for (n = 0; !IS_ERR(rq[n]); n++)
-			i915_request_put(rq[n]);
-		rq[0] = ERR_PTR(-ENOMEM);
-	}
-
-out:
-	for (n = 0; !IS_ERR(rq[n]); n++)
-		i915_request_put(rq[n]);
-	if (igt_flush_test(gt->i915))
-		err = -EIO;
-
-	igt_spinner_fini(&spin);
-	return err;
-}
-
-static int live_virtual_bond(void *arg)
-{
-	static const struct phase {
-		const char *name;
-		unsigned int flags;
-	} phases[] = {
-		{ "", 0 },
-		{ "schedule", BOND_SCHEDULE },
-		{ },
-	};
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	unsigned int class;
-	int err;
-
-	if (intel_uc_uses_guc_submission(&gt->uc))
-		return 0;
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		const struct phase *p;
-		int nsibling;
-
-		nsibling = select_siblings(gt, class, siblings);
-		if (nsibling < 2)
-			continue;
-
-		for (p = phases; p->name; p++) {
-			err = bond_virtual_engine(gt,
-						  class, siblings, nsibling,
-						  p->flags);
-			if (err) {
-				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
-				       __func__, p->name, class, nsibling, err);
-				return err;
-			}
-		}
-	}
-
-	return 0;
-}
-
 static int reset_virtual_engine(struct intel_gt *gt,
 				struct intel_engine_cs **siblings,
 				unsigned int nsibling)
@@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_virtual_mask),
 		SUBTEST(live_virtual_preserved),
 		SUBTEST(live_virtual_slice),
-		SUBTEST(live_virtual_bond),
 		SUBTEST(live_virtual_reset),
 	};
 
-- 
2.31.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 226+ messages in thread

-	 * and a magic fence to schedule each of the bonded requests
-	 * at the same time. A consequence of our current scheduler is that
-	 * we only move requests to the HW ready queue when the request
-	 * becomes ready, that is when all of its prerequisite fences have
-	 * been signaled. As one of those fences is the master submit fence,
-	 * there is a delay on all secondary fences as the HW may be
-	 * currently busy. Equally, as all the requests are independent,
-	 * they may have other fences that delay individual request
-	 * submission to HW. Ergo, we do not guarantee that all requests are
-	 * immediately submitted to HW at the same time, just that if the
-	 * rules are abided by, they are ready at the same time as the
-	 * first is submitted. Userspace can embed semaphores in its batch
-	 * to ensure parallel execution of its phases as it requires.
-	 * Though naturally it gets requested that perhaps the scheduler should
-	 * take care of parallel execution, even across preemption events on
-	 * different HW. (The proper answer is of course "lalalala".)
-	 *
-	 * With the submit-fence, we have identified three possible phases
-	 * of synchronisation depending on the master fence: queued (not
-	 * ready), executing, and signaled. The first two are quite simple
-	 * and checked below. However, the signaled master fence handling is
-	 * contentious. Currently we do not distinguish between a signaled
-	 * fence and an expired fence, as once signaled it does not convey
-	 * any information about the previous execution. It may even be freed
-	 * and hence checking later it may not exist at all. Ergo we currently
-	 * do not apply the bonding constraint for an already signaled fence,
-	 * as our expectation is that it should not constrain the secondaries
-	 * and is outside of the scope of the bonded request API (i.e. all
-	 * userspace requests are meant to be running in parallel). As
-	 * it imposes no constraint, and is effectively a no-op, we do not
-	 * check below as normal execution flows are checked extensively above.
-	 *
-	 * XXX Is the degenerate handling of signaled submit fences the
-	 * expected behaviour for userpace?
-	 */
-
-	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
-
-	if (igt_spinner_init(&spin, gt))
-		return -ENOMEM;
-
-	err = 0;
-	rq[0] = ERR_PTR(-ENOMEM);
-	for_each_engine(master, gt, id) {
-		struct i915_sw_fence fence = {};
-		struct intel_context *ce;
-
-		if (master->class == class)
-			continue;
-
-		ce = intel_context_create(master);
-		if (IS_ERR(ce)) {
-			err = PTR_ERR(ce);
-			goto out;
-		}
-
-		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
-
-		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
-		intel_context_put(ce);
-		if (IS_ERR(rq[0])) {
-			err = PTR_ERR(rq[0]);
-			goto out;
-		}
-		i915_request_get(rq[0]);
-
-		if (flags & BOND_SCHEDULE) {
-			onstack_fence_init(&fence);
-			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
-							       &fence,
-							       GFP_KERNEL);
-		}
-
-		i915_request_add(rq[0]);
-		if (err < 0)
-			goto out;
-
-		if (!(flags & BOND_SCHEDULE) &&
-		    !igt_wait_for_spinner(&spin, rq[0])) {
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			struct intel_context *ve;
-
-			ve = intel_execlists_create_virtual(siblings, nsibling);
-			if (IS_ERR(ve)) {
-				err = PTR_ERR(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_virtual_engine_attach_bond(ve->engine,
-							       master,
-							       siblings[n]);
-			if (err) {
-				intel_context_put(ve);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			err = intel_context_pin(ve);
-			intel_context_put(ve);
-			if (err) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-
-			rq[n + 1] = i915_request_create(ve);
-			intel_context_unpin(ve);
-			if (IS_ERR(rq[n + 1])) {
-				err = PTR_ERR(rq[n + 1]);
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-			i915_request_get(rq[n + 1]);
-
-			err = i915_request_await_execution(rq[n + 1],
-							   &rq[0]->fence,
-							   ve->engine->bond_execute);
-			i915_request_add(rq[n + 1]);
-			if (err < 0) {
-				onstack_fence_fini(&fence);
-				goto out;
-			}
-		}
-		onstack_fence_fini(&fence);
-		intel_engine_flush_submission(master);
-		igt_spinner_end(&spin);
-
-		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
-			pr_err("Master request did not execute (on %s)!\n",
-			       rq[0]->engine->name);
-			err = -EIO;
-			goto out;
-		}
-
-		for (n = 0; n < nsibling; n++) {
-			if (i915_request_wait(rq[n + 1], 0,
-					      MAX_SCHEDULE_TIMEOUT) < 0) {
-				err = -EIO;
-				goto out;
-			}
-
-			if (rq[n + 1]->engine != siblings[n]) {
-				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
-				       siblings[n]->name,
-				       rq[n + 1]->engine->name,
-				       rq[0]->engine->name);
-				err = -EINVAL;
-				goto out;
-			}
-		}
-
-		for (n = 0; !IS_ERR(rq[n]); n++)
-			i915_request_put(rq[n]);
-		rq[0] = ERR_PTR(-ENOMEM);
-	}
-
-out:
-	for (n = 0; !IS_ERR(rq[n]); n++)
-		i915_request_put(rq[n]);
-	if (igt_flush_test(gt->i915))
-		err = -EIO;
-
-	igt_spinner_fini(&spin);
-	return err;
-}
-
-static int live_virtual_bond(void *arg)
-{
-	static const struct phase {
-		const char *name;
-		unsigned int flags;
-	} phases[] = {
-		{ "", 0 },
-		{ "schedule", BOND_SCHEDULE },
-		{ },
-	};
-	struct intel_gt *gt = arg;
-	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
-	unsigned int class;
-	int err;
-
-	if (intel_uc_uses_guc_submission(&gt->uc))
-		return 0;
-
-	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
-		const struct phase *p;
-		int nsibling;
-
-		nsibling = select_siblings(gt, class, siblings);
-		if (nsibling < 2)
-			continue;
-
-		for (p = phases; p->name; p++) {
-			err = bond_virtual_engine(gt,
-						  class, siblings, nsibling,
-						  p->flags);
-			if (err) {
-				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
-				       __func__, p->name, class, nsibling, err);
-				return err;
-			}
-		}
-	}
-
-	return 0;
-}
-
 static int reset_virtual_engine(struct intel_gt *gt,
 				struct intel_engine_cs **siblings,
 				unsigned int nsibling)
@@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_virtual_mask),
 		SUBTEST(live_virtual_preserved),
 		SUBTEST(live_virtual_slice),
-		SUBTEST(live_virtual_bond),
 		SUBTEST(live_virtual_reset),
 	};
 
-- 
2.31.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 226+ messages in thread

* Re: [PATCH 10/21] drm/i915/request: Remove the hook from await_execution
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-26 23:44     ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-26 23:44 UTC (permalink / raw)
  To: Intel GFX, Mailing list - DRI developers

Sadly, we can't have this patch as long as we support SUBMIT_FENCE.
Turns out this is used for something real. :-(

--Jason

On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> This was only ever used for bonded virtual engine execution.  Since
> that's no longer allowed, this is dead code.
>
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  3 +-
>  drivers/gpu/drm/i915/i915_request.c           | 42 ++++---------------
>  drivers/gpu/drm/i915/i915_request.h           |  4 +-
>  3 files changed, 9 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index efb2fa3522a42..7024adcd5cf15 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3473,8 +3473,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>         if (in_fence) {
>                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
>                         err = i915_request_await_execution(eb.request,
> -                                                          in_fence,
> -                                                          NULL);
> +                                                          in_fence);
>                 else
>                         err = i915_request_await_dma_fence(eb.request,
>                                                            in_fence);
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index bec9c3652188b..7e00218b8c105 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -49,7 +49,6 @@
>  struct execute_cb {
>         struct irq_work work;
>         struct i915_sw_fence *fence;
> -       void (*hook)(struct i915_request *rq, struct dma_fence *signal);
>         struct i915_request *signal;
>  };
>
> @@ -180,17 +179,6 @@ static void irq_execute_cb(struct irq_work *wrk)
>         kmem_cache_free(global.slab_execute_cbs, cb);
>  }
>
> -static void irq_execute_cb_hook(struct irq_work *wrk)
> -{
> -       struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
> -
> -       cb->hook(container_of(cb->fence, struct i915_request, submit),
> -                &cb->signal->fence);
> -       i915_request_put(cb->signal);
> -
> -       irq_execute_cb(wrk);
> -}
> -
>  static __always_inline void
>  __notify_execute_cb(struct i915_request *rq, bool (*fn)(struct irq_work *wrk))
>  {
> @@ -517,17 +505,12 @@ static bool __request_in_flight(const struct i915_request *signal)
>  static int
>  __await_execution(struct i915_request *rq,
>                   struct i915_request *signal,
> -                 void (*hook)(struct i915_request *rq,
> -                              struct dma_fence *signal),
>                   gfp_t gfp)
>  {
>         struct execute_cb *cb;
>
> -       if (i915_request_is_active(signal)) {
> -               if (hook)
> -                       hook(rq, &signal->fence);
> +       if (i915_request_is_active(signal))
>                 return 0;
> -       }
>
>         cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
>         if (!cb)
> @@ -537,12 +520,6 @@ __await_execution(struct i915_request *rq,
>         i915_sw_fence_await(cb->fence);
>         init_irq_work(&cb->work, irq_execute_cb);
>
> -       if (hook) {
> -               cb->hook = hook;
> -               cb->signal = i915_request_get(signal);
> -               cb->work.func = irq_execute_cb_hook;
> -       }
> -
>         /*
>          * Register the callback first, then see if the signaler is already
>          * active. This ensures that if we race with the
> @@ -1253,7 +1230,7 @@ emit_semaphore_wait(struct i915_request *to,
>                 goto await_fence;
>
>         /* Only submit our spinner after the signaler is running! */
> -       if (__await_execution(to, from, NULL, gfp))
> +       if (__await_execution(to, from, gfp))
>                 goto await_fence;
>
>         if (__emit_semaphore_wait(to, from, from->fence.seqno))
> @@ -1284,16 +1261,14 @@ static int intel_timeline_sync_set_start(struct intel_timeline *tl,
>
>  static int
>  __i915_request_await_execution(struct i915_request *to,
> -                              struct i915_request *from,
> -                              void (*hook)(struct i915_request *rq,
> -                                           struct dma_fence *signal))
> +                              struct i915_request *from)
>  {
>         int err;
>
>         GEM_BUG_ON(intel_context_is_barrier(from->context));
>
>         /* Submit both requests at the same time */
> -       err = __await_execution(to, from, hook, I915_FENCE_GFP);
> +       err = __await_execution(to, from, I915_FENCE_GFP);
>         if (err)
>                 return err;
>
> @@ -1406,9 +1381,7 @@ i915_request_await_external(struct i915_request *rq, struct dma_fence *fence)
>
>  int
>  i915_request_await_execution(struct i915_request *rq,
> -                            struct dma_fence *fence,
> -                            void (*hook)(struct i915_request *rq,
> -                                         struct dma_fence *signal))
> +                            struct dma_fence *fence)
>  {
>         struct dma_fence **child = &fence;
>         unsigned int nchild = 1;
> @@ -1441,8 +1414,7 @@ i915_request_await_execution(struct i915_request *rq,
>
>                 if (dma_fence_is_i915(fence))
>                         ret = __i915_request_await_execution(rq,
> -                                                            to_request(fence),
> -                                                            hook);
> +                                                            to_request(fence));
>                 else
>                         ret = i915_request_await_external(rq, fence);
>                 if (ret < 0)
> @@ -1468,7 +1440,7 @@ await_request_submit(struct i915_request *to, struct i915_request *from)
>                                                         &from->submit,
>                                                         I915_FENCE_GFP);
>         else
> -               return __i915_request_await_execution(to, from, NULL);
> +               return __i915_request_await_execution(to, from);
>  }
>
>  static int
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 270f6cd37650c..63b087a7f5707 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -352,9 +352,7 @@ int i915_request_await_object(struct i915_request *to,
>  int i915_request_await_dma_fence(struct i915_request *rq,
>                                  struct dma_fence *fence);
>  int i915_request_await_execution(struct i915_request *rq,
> -                                struct dma_fence *fence,
> -                                void (*hook)(struct i915_request *rq,
> -                                             struct dma_fence *signal));
> +                                struct dma_fence *fence);
>
>  void i915_request_add(struct i915_request *rq);
>
> --
> 2.31.1
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 01/21] drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27  9:32     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:32 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:11PM -0500, Jason Ekstrand wrote:
> This reverts commit 88be76cdafc7 ("drm/i915: Allow userspace to specify
> ringsize on construction").  This API was originally added for OpenCL
> but the compute-runtime PR has sat open for a year without action so we
> can still pull it out if we want.  I argue we should drop it for three
> reasons:
> 
>  1. If the compute-runtime PR has sat open for a year, this clearly
>     isn't that important.
> 
>  2. It's a very leaky API.  Ring size is an implementation detail of the
>     current execlist scheduler and really only makes sense there.  It
>     can't apply to the older ring-buffer scheduler on pre-execlist
>     hardware because that's shared across all contexts and it won't
>     apply to the GuC scheduler that's in the pipeline.
> 
>  3. Having userspace set a ring size in bytes is a bad solution to the
>     problem of having too small a ring.  There is no way that userspace
>     has the information to know how to properly set the ring size so
>     it's just going to detect the feature and always set it to the
>     maximum of 512K.  This is what the compute-runtime PR does.  The
>     scheduler in i915, on the other hand, does have the information to
>     make an informed choice.  It could detect if the ring size is a
>     problem and grow it itself.  Or, if that's too hard, we could just
>     increase the default size from 16K to 32K or even 64K instead of
>     relying on userspace to do it.
> 
> Let's drop this API for now and, if someone decides they really care
> about solving this problem, they can do it properly.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

Two things:
- I'm assuming you have an igt change to make sure we get EINVAL for both
  set and getparam now? Just to make sure.

- intel_context->ring is either a ring pointer when CONTEXT_ALLOC_BIT is
  set in ce->flags, or the size of the ring stored in the pointer if not.
  I'm seriously hoping you get rid of this complexity with your
  proto-context series, and also delete __intel_context_ring_size() in the
  end. That function has no business existing imo.

  If not, please make sure that's the case.

Aside from these, the patch looks good.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/Makefile                 |  1 -
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 85 +------------------
>  drivers/gpu/drm/i915/gt/intel_context_param.c | 63 --------------
>  drivers/gpu/drm/i915/gt/intel_context_param.h |  3 -
>  include/uapi/drm/i915_drm.h                   | 20 +----
>  5 files changed, 4 insertions(+), 168 deletions(-)
>  delete mode 100644 drivers/gpu/drm/i915/gt/intel_context_param.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index d0d936d9137bc..afa22338fa343 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -88,7 +88,6 @@ gt-y += \
>  	gt/gen8_ppgtt.o \
>  	gt/intel_breadcrumbs.o \
>  	gt/intel_context.o \
> -	gt/intel_context_param.o \
>  	gt/intel_context_sseu.o \
>  	gt/intel_engine_cs.o \
>  	gt/intel_engine_heartbeat.o \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index fd8ee52e17a47..e52b85b8f923d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1335,63 +1335,6 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
>  	return err;
>  }
>  
> -static int __apply_ringsize(struct intel_context *ce, void *sz)
> -{
> -	return intel_context_set_ring_size(ce, (unsigned long)sz);
> -}
> -
> -static int set_ringsize(struct i915_gem_context *ctx,
> -			struct drm_i915_gem_context_param *args)
> -{
> -	if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
> -		return -ENODEV;
> -
> -	if (args->size)
> -		return -EINVAL;
> -
> -	if (!IS_ALIGNED(args->value, I915_GTT_PAGE_SIZE))
> -		return -EINVAL;
> -
> -	if (args->value < I915_GTT_PAGE_SIZE)
> -		return -EINVAL;
> -
> -	if (args->value > 128 * I915_GTT_PAGE_SIZE)
> -		return -EINVAL;
> -
> -	return context_apply_all(ctx,
> -				 __apply_ringsize,
> -				 __intel_context_ring_size(args->value));
> -}
> -
> -static int __get_ringsize(struct intel_context *ce, void *arg)
> -{
> -	long sz;
> -
> -	sz = intel_context_get_ring_size(ce);
> -	GEM_BUG_ON(sz > INT_MAX);
> -
> -	return sz; /* stop on first engine */
> -}
> -
> -static int get_ringsize(struct i915_gem_context *ctx,
> -			struct drm_i915_gem_context_param *args)
> -{
> -	int sz;
> -
> -	if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
> -		return -ENODEV;
> -
> -	if (args->size)
> -		return -EINVAL;
> -
> -	sz = context_apply_all(ctx, __get_ringsize, NULL);
> -	if (sz < 0)
> -		return sz;
> -
> -	args->value = sz;
> -	return 0;
> -}
> -
>  int
>  i915_gem_user_to_context_sseu(struct intel_gt *gt,
>  			      const struct drm_i915_gem_context_param_sseu *user,
> @@ -2037,11 +1980,8 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
>  		ret = set_persistence(ctx, args);
>  		break;
>  
> -	case I915_CONTEXT_PARAM_RINGSIZE:
> -		ret = set_ringsize(ctx, args);
> -		break;
> -
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
> +	case I915_CONTEXT_PARAM_RINGSIZE:
>  	default:
>  		ret = -EINVAL;
>  		break;
> @@ -2069,18 +2009,6 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
>  	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
>  }
>  
> -static int copy_ring_size(struct intel_context *dst,
> -			  struct intel_context *src)
> -{
> -	long sz;
> -
> -	sz = intel_context_get_ring_size(src);
> -	if (sz < 0)
> -		return sz;
> -
> -	return intel_context_set_ring_size(dst, sz);
> -}
> -
>  static int clone_engines(struct i915_gem_context *dst,
>  			 struct i915_gem_context *src)
>  {
> @@ -2125,12 +2053,6 @@ static int clone_engines(struct i915_gem_context *dst,
>  		}
>  
>  		intel_context_set_gem(clone->engines[n], dst);
> -
> -		/* Copy across the preferred ringsize */
> -		if (copy_ring_size(clone->engines[n], e->engines[n])) {
> -			__free_engines(clone, n + 1);
> -			goto err_unlock;
> -		}
>  	}
>  	clone->num_engines = n;
>  	i915_sw_fence_complete(&e->fence);
> @@ -2490,11 +2412,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  		args->value = i915_gem_context_is_persistent(ctx);
>  		break;
>  
> -	case I915_CONTEXT_PARAM_RINGSIZE:
> -		ret = get_ringsize(ctx, args);
> -		break;
> -
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
> +	case I915_CONTEXT_PARAM_RINGSIZE:
>  	default:
>  		ret = -EINVAL;
>  		break;
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.c b/drivers/gpu/drm/i915/gt/intel_context_param.c
> deleted file mode 100644
> index 65dcd090245d6..0000000000000
> --- a/drivers/gpu/drm/i915/gt/intel_context_param.c
> +++ /dev/null
> @@ -1,63 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#include "i915_active.h"
> -#include "intel_context.h"
> -#include "intel_context_param.h"
> -#include "intel_ring.h"
> -
> -int intel_context_set_ring_size(struct intel_context *ce, long sz)
> -{
> -	int err;
> -
> -	if (intel_context_lock_pinned(ce))
> -		return -EINTR;
> -
> -	err = i915_active_wait(&ce->active);
> -	if (err < 0)
> -		goto unlock;
> -
> -	if (intel_context_is_pinned(ce)) {
> -		err = -EBUSY; /* In active use, come back later! */
> -		goto unlock;
> -	}
> -
> -	if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
> -		struct intel_ring *ring;
> -
> -		/* Replace the existing ringbuffer */
> -		ring = intel_engine_create_ring(ce->engine, sz);
> -		if (IS_ERR(ring)) {
> -			err = PTR_ERR(ring);
> -			goto unlock;
> -		}
> -
> -		intel_ring_put(ce->ring);
> -		ce->ring = ring;
> -
> -		/* Context image will be updated on next pin */
> -	} else {
> -		ce->ring = __intel_context_ring_size(sz);
> -	}
> -
> -unlock:
> -	intel_context_unlock_pinned(ce);
> -	return err;
> -}
> -
> -long intel_context_get_ring_size(struct intel_context *ce)
> -{
> -	long sz = (unsigned long)READ_ONCE(ce->ring);
> -
> -	if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
> -		if (intel_context_lock_pinned(ce))
> -			return -EINTR;
> -
> -		sz = ce->ring->size;
> -		intel_context_unlock_pinned(ce);
> -	}
> -
> -	return sz;
> -}
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> index 3ecacc675f414..dffedd983693d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> @@ -10,9 +10,6 @@
>  
>  #include "intel_context.h"
>  
> -int intel_context_set_ring_size(struct intel_context *ce, long sz);
> -long intel_context_get_ring_size(struct intel_context *ce);
> -
>  static inline int
>  intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
>  {
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 6a34243a7646a..6eefbc6dec01f 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1721,24 +1721,8 @@ struct drm_i915_gem_context_param {
>   */
>  #define I915_CONTEXT_PARAM_PERSISTENCE	0xb
>  
> -/*
> - * I915_CONTEXT_PARAM_RINGSIZE:
> - *
> - * Sets the size of the CS ringbuffer to use for logical ring contexts. This
> - * applies a limit of how many batches can be queued to HW before the caller
> - * is blocked due to lack of space for more commands.
> - *
> - * Only reliably possible to be set prior to first use, i.e. during
> - * construction. At any later point, the current execution must be flushed as
> - * the ring can only be changed while the context is idle. Note, the ringsize
> - * can be specified as a constructor property, see
> - * I915_CONTEXT_CREATE_EXT_SETPARAM, but can also be set later if required.
> - *
> - * Only applies to the current set of engine and lost when those engines
> - * are replaced by a new mapping (see I915_CONTEXT_PARAM_ENGINES).
> - *
> - * Must be between 4 - 512 KiB, in intervals of page size [4 KiB].
> - * Default is 16 KiB.
> +/* This API has been removed.  On the off chance someone somewhere has
> + * attempted to use it, never re-use this context param number.
>   */
>  #define I915_CONTEXT_PARAM_RINGSIZE	0xc
>  /* Must be kept compact -- no holes and well documented */
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 02/21] drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27  9:38     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:38 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:12PM -0500, Jason Ekstrand wrote:
> The idea behind this param is to support OpenCL drivers with relocations
> because OpenCL reserves 0x0 for NULL and, if we placed memory there, it
> would confuse CL kernels.  It was originally sent out as part of a patch
> series including libdrm [1] and Beignet [2] support.  However, the
> libdrm and Beignet patches never landed in their respective upstream
> projects so this API has never been used.  It's never been used in Mesa
> or any other driver, either.
> 
> Dropping this API allows us to delete a small bit of code.
> 
> [1]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067030.html
> [2]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067031.html
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

Same thing about an igt making sure we reject these. Maybe an entire
wash-up igt which validates all the new restrictions on get/setparam
(including that after execbuf it's even more strict).
-Daniel

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c      | 16 ++--------------
>  .../gpu/drm/i915/gem/i915_gem_context_types.h    |  1 -
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c   |  8 --------
>  include/uapi/drm/i915_drm.h                      |  4 ++++
>  4 files changed, 6 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e52b85b8f923d..35bcdeddfbf3f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1922,15 +1922,6 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
>  	int ret = 0;
>  
>  	switch (args->param) {
> -	case I915_CONTEXT_PARAM_NO_ZEROMAP:
> -		if (args->size)
> -			ret = -EINVAL;
> -		else if (args->value)
> -			set_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
> -		else
> -			clear_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
> -		break;
> -
>  	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
>  		if (args->size)
>  			ret = -EINVAL;
> @@ -1980,6 +1971,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
>  		ret = set_persistence(ctx, args);
>  		break;
>  
> +	case I915_CONTEXT_PARAM_NO_ZEROMAP:
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
>  	case I915_CONTEXT_PARAM_RINGSIZE:
>  	default:
> @@ -2360,11 +2352,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  		return -ENOENT;
>  
>  	switch (args->param) {
> -	case I915_CONTEXT_PARAM_NO_ZEROMAP:
> -		args->size = 0;
> -		args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
> -		break;
> -
>  	case I915_CONTEXT_PARAM_GTT_SIZE:
>  		args->size = 0;
>  		rcu_read_lock();
> @@ -2412,6 +2399,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  		args->value = i915_gem_context_is_persistent(ctx);
>  		break;
>  
> +	case I915_CONTEXT_PARAM_NO_ZEROMAP:
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
>  	case I915_CONTEXT_PARAM_RINGSIZE:
>  	default:
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 340473aa70de0..5ae71ec936f7c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -129,7 +129,6 @@ struct i915_gem_context {
>  	 * @user_flags: small set of booleans controlled by the user
>  	 */
>  	unsigned long user_flags;
> -#define UCONTEXT_NO_ZEROMAP		0
>  #define UCONTEXT_NO_ERROR_CAPTURE	1
>  #define UCONTEXT_BANNABLE		2
>  #define UCONTEXT_RECOVERABLE		3
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 297143511f99b..b812f313422a9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -290,7 +290,6 @@ struct i915_execbuffer {
>  	struct intel_context *reloc_context;
>  
>  	u64 invalid_flags; /** Set of execobj.flags that are invalid */
> -	u32 context_flags; /** Set of execobj.flags to insert from the ctx */
>  
>  	u64 batch_len; /** Length of batch within object */
>  	u32 batch_start_offset; /** Location within object of batch */
> @@ -541,9 +540,6 @@ eb_validate_vma(struct i915_execbuffer *eb,
>  			entry->flags |= EXEC_OBJECT_NEEDS_GTT | __EXEC_OBJECT_NEEDS_MAP;
>  	}
>  
> -	if (!(entry->flags & EXEC_OBJECT_PINNED))
> -		entry->flags |= eb->context_flags;
> -
>  	return 0;
>  }
>  
> @@ -750,10 +746,6 @@ static int eb_select_context(struct i915_execbuffer *eb)
>  	if (rcu_access_pointer(ctx->vm))
>  		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
>  
> -	eb->context_flags = 0;
> -	if (test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags))
> -		eb->context_flags |= __EXEC_OBJECT_NEEDS_BIAS;
> -
>  	return 0;
>  }
>  
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 6eefbc6dec01f..a0aaa8298f28d 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1637,6 +1637,10 @@ struct drm_i915_gem_context_param {
>  	__u32 size;
>  	__u64 param;
>  #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
> +/* I915_CONTEXT_PARAM_NO_ZEROMAP has been removed.  On the off chance
> + * someone somewhere has attempted to use it, never re-use this context
> + * param number.
> + */
>  #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
>  #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
>  #define I915_CONTEXT_PARAM_NO_ERROR_CAPTURE	0x4
> -- 
> 2.31.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27  9:42     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:42 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:13PM -0500, Jason Ekstrand wrote:
> Instead of handling it like a context param, unconditionally set it when
> intel_contexts are created.  This doesn't fix anything but does simplify
> the code a bit.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

So the idea here is that for years we've had a watchdog uapi floating
about. The aim was media, so that they could set very tight deadlines for
their transcode jobs, so that if you have a corrupt bitstream (especially
for decoding) you don't hang your desktop unnecessarily.

But it's been stuck in limbo since forever, plus I get how this gets a bit
in the way of the proto ctx work, so makes sense to remove this prep work
again.

Maybe include the above in the commit message for a notch more context.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
>  .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
>  drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
>  3 files changed, 6 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 35bcdeddfbf3f..1091cc04a242a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
>  	    intel_engine_has_timeslices(ce->engine))
>  		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
>  
> -	intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> +	if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> +	    ctx->i915->params.request_timeout_ms) {
> +		unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> +		intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> +	}
>  }
>  
>  static void __free_engines(struct i915_gem_engines *e, unsigned int count)
> @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
>  	context_apply_all(ctx, __apply_timeline, timeline);
>  }
>  
> -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
> -{
> -	return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
> -}
> -
> -static int
> -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
> -{
> -	int ret;
> -
> -	ret = context_apply_all(ctx, __apply_watchdog,
> -				(void *)(uintptr_t)timeout_us);
> -	if (!ret)
> -		ctx->watchdog.timeout_us = timeout_us;
> -
> -	return ret;
> -}
> -
> -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
> -{
> -	struct drm_i915_private *i915 = ctx->i915;
> -	int ret;
> -
> -	if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
> -	    !i915->params.request_timeout_ms)
> -		return;
> -
> -	/* Default expiry for user fences. */
> -	ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
> -	if (ret)
> -		drm_notice(&i915->drm,
> -			   "Failed to configure default fence expiry! (%d)",
> -			   ret);
> -}
> -
>  static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  {
> @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  		intel_timeline_put(timeline);
>  	}
>  
> -	__set_default_fence_expiry(ctx);
> -
>  	trace_i915_context_create(ctx);
>  
>  	return ctx;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 5ae71ec936f7c..676592e27e7d2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -153,10 +153,6 @@ struct i915_gem_context {
>  	 */
>  	atomic_t active_count;
>  
> -	struct {
> -		u64 timeout_us;
> -	} watchdog;
> -
>  	/**
>  	 * @hang_timestamp: The last time(s) this context caused a GPU hang
>  	 */
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> index dffedd983693d..0c69cb42d075c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> @@ -10,11 +10,10 @@
>  
>  #include "intel_context.h"
>  
> -static inline int
> +static inline void
>  intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
>  {
>  	ce->watchdog.timeout_us = timeout_us;
> -	return 0;
>  }
>  
>  #endif /* INTEL_CONTEXT_PARAM_H */
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
@ 2021-04-27  9:42     ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:42 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:13PM -0500, Jason Ekstrand wrote:
> Instead of handling it like a context param, unconditionally set it when
> intel_contexts are created.  This doesn't fix anything but does simplify
> the code a bit.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

So the idea here is that for years we've had a watchdog uapi floating
about. The aim was for media, so that they could set very tight deadlines
for their transcode jobs, so that if you have a corrupt bitstream
(especially for decoding) you don't hang your desktop unnecessarily long.

But it's been stuck in limbo since forever, plus I get how this gets a bit
in the way of the proto-ctx work, so it makes sense to remove this prep
work again.

Maybe include the above in the commit message for a notch more context.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
>  .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
>  drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
>  3 files changed, 6 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 35bcdeddfbf3f..1091cc04a242a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
>  	    intel_engine_has_timeslices(ce->engine))
>  		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
>  
> -	intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> +	if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> +	    ctx->i915->params.request_timeout_ms) {
> +		unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> +		intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> +	}
>  }
>  
>  static void __free_engines(struct i915_gem_engines *e, unsigned int count)
> @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
>  	context_apply_all(ctx, __apply_timeline, timeline);
>  }
>  
> -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
> -{
> -	return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
> -}
> -
> -static int
> -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
> -{
> -	int ret;
> -
> -	ret = context_apply_all(ctx, __apply_watchdog,
> -				(void *)(uintptr_t)timeout_us);
> -	if (!ret)
> -		ctx->watchdog.timeout_us = timeout_us;
> -
> -	return ret;
> -}
> -
> -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
> -{
> -	struct drm_i915_private *i915 = ctx->i915;
> -	int ret;
> -
> -	if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
> -	    !i915->params.request_timeout_ms)
> -		return;
> -
> -	/* Default expiry for user fences. */
> -	ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
> -	if (ret)
> -		drm_notice(&i915->drm,
> -			   "Failed to configure default fence expiry! (%d)",
> -			   ret);
> -}
> -
>  static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  {
> @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  		intel_timeline_put(timeline);
>  	}
>  
> -	__set_default_fence_expiry(ctx);
> -
>  	trace_i915_context_create(ctx);
>  
>  	return ctx;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 5ae71ec936f7c..676592e27e7d2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -153,10 +153,6 @@ struct i915_gem_context {
>  	 */
>  	atomic_t active_count;
>  
> -	struct {
> -		u64 timeout_us;
> -	} watchdog;
> -
>  	/**
>  	 * @hang_timestamp: The last time(s) this context caused a GPU hang
>  	 */
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> index dffedd983693d..0c69cb42d075c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> @@ -10,11 +10,10 @@
>  
>  #include "intel_context.h"
>  
> -static inline int
> +static inline void
>  intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
>  {
>  	ce->watchdog.timeout_us = timeout_us;
> -	return 0;
>  }
>  
>  #endif /* INTEL_CONTEXT_PARAM_H */
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 04/21] drm/i915/gem: Return void from context_apply_all
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27  9:42     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:42 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:14PM -0500, Jason Ekstrand wrote:
> None of the callbacks we use with it return an error code anymore; they
> all return 0 unconditionally.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

Nice.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 26 +++++++--------------
>  1 file changed, 8 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 1091cc04a242a..8a77855123cec 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -718,32 +718,25 @@ __context_engines_await(const struct i915_gem_context *ctx,
>  	return engines;
>  }
>  
> -static int
> +static void
>  context_apply_all(struct i915_gem_context *ctx,
> -		  int (*fn)(struct intel_context *ce, void *data),
> +		  void (*fn)(struct intel_context *ce, void *data),
>  		  void *data)
>  {
>  	struct i915_gem_engines_iter it;
>  	struct i915_gem_engines *e;
>  	struct intel_context *ce;
> -	int err = 0;
>  
>  	e = __context_engines_await(ctx, NULL);
> -	for_each_gem_engine(ce, e, it) {
> -		err = fn(ce, data);
> -		if (err)
> -			break;
> -	}
> +	for_each_gem_engine(ce, e, it)
> +		fn(ce, data);
>  	i915_sw_fence_complete(&e->fence);
> -
> -	return err;
>  }
>  
> -static int __apply_ppgtt(struct intel_context *ce, void *vm)
> +static void __apply_ppgtt(struct intel_context *ce, void *vm)
>  {
>  	i915_vm_put(ce->vm);
>  	ce->vm = i915_vm_get(vm);
> -	return 0;
>  }
>  
>  static struct i915_address_space *
> @@ -783,10 +776,9 @@ static void __set_timeline(struct intel_timeline **dst,
>  		intel_timeline_put(old);
>  }
>  
> -static int __apply_timeline(struct intel_context *ce, void *timeline)
> +static void __apply_timeline(struct intel_context *ce, void *timeline)
>  {
>  	__set_timeline(&ce->timeline, timeline);
> -	return 0;
>  }
>  
>  static void __assign_timeline(struct i915_gem_context *ctx,
> @@ -1842,19 +1834,17 @@ set_persistence(struct i915_gem_context *ctx,
>  	return __context_set_persistence(ctx, args->value);
>  }
>  
> -static int __apply_priority(struct intel_context *ce, void *arg)
> +static void __apply_priority(struct intel_context *ce, void *arg)
>  {
>  	struct i915_gem_context *ctx = arg;
>  
>  	if (!intel_engine_has_timeslices(ce->engine))
> -		return 0;
> +		return;
>  
>  	if (ctx->sched.priority >= I915_PRIORITY_NORMAL)
>  		intel_context_set_use_semaphores(ce);
>  	else
>  		intel_context_clear_use_semaphores(ce);
> -
> -	return 0;
>  }
>  
>  static int set_priority(struct i915_gem_context *ctx,
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 05/21] drm/i915: Drop the CONTEXT_CLONE API
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27  9:49     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:49 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:15PM -0500, Jason Ekstrand wrote:
> This API allows one context to grab bits out of another context upon
> creation.  It can be used as a short-cut for setparam(getparam()) for
> things like I915_CONTEXT_PARAM_VM.  However, it's never been used by any
> real userspace.  It's used by a few IGT tests and that's it.  Since it
> doesn't add any real value (most of the stuff you can CLONE you can copy
> in other ways), drop it.
> 
> There is one thing that this API allows you to clone which you cannot
> clone via getparam/setparam: timelines.  However, timelines are an
> implementation detail of i915 and not really something that needs to be
> exposed to userspace.  Also, sharing timelines between contexts isn't
> obviously useful and supporting it has the potential to complicate i915
> internally.  It also doesn't add any functionality that the client can't
> get in other ways.  If a client really wants a shared timeline, they can
> use a syncobj and set it as an in and out fence on every submit.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 199 +-------------------
>  include/uapi/drm/i915_drm.h                 |  16 +-
>  2 files changed, 6 insertions(+), 209 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 8a77855123cec..2c2fefa912805 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1958,207 +1958,14 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
>  	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
>  }
>  
> -static int clone_engines(struct i915_gem_context *dst,
> -			 struct i915_gem_context *src)
> +static int invalid_ext(struct i915_user_extension __user *ext, void *data)
>  {
> -	struct i915_gem_engines *clone, *e;
> -	bool user_engines;
> -	unsigned long n;
> -
> -	e = __context_engines_await(src, &user_engines);
> -	if (!e)
> -		return -ENOENT;
> -
> -	clone = alloc_engines(e->num_engines);
> -	if (!clone)
> -		goto err_unlock;
> -
> -	for (n = 0; n < e->num_engines; n++) {
> -		struct intel_engine_cs *engine;
> -
> -		if (!e->engines[n]) {
> -			clone->engines[n] = NULL;
> -			continue;
> -		}
> -		engine = e->engines[n]->engine;
> -
> -		/*
> -		 * Virtual engines are singletons; they can only exist
> -		 * inside a single context, because they embed their
> -		 * HW context... As each virtual context implies a single
> -		 * timeline (each engine can only dequeue a single request
> -		 * at any time), it would be surprising for two contexts
> -		 * to use the same engine. So let's create a copy of
> -		 * the virtual engine instead.
> -		 */
> -		if (intel_engine_is_virtual(engine))
> -			clone->engines[n] =
> -				intel_execlists_clone_virtual(engine);

You forgot to gc this function here ^^

> -		else
> -			clone->engines[n] = intel_context_create(engine);
> -		if (IS_ERR_OR_NULL(clone->engines[n])) {
> -			__free_engines(clone, n);
> -			goto err_unlock;
> -		}
> -
> -		intel_context_set_gem(clone->engines[n], dst);

Not peeked ahead, but I'm really hoping intel_context_set_gem gets removed
eventually too ...

> -	}
> -	clone->num_engines = n;
> -	i915_sw_fence_complete(&e->fence);
> -
> -	/* Serialised by constructor */
> -	engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
> -	if (user_engines)
> -		i915_gem_context_set_user_engines(dst);
> -	else
> -		i915_gem_context_clear_user_engines(dst);
> -	return 0;
> -
> -err_unlock:
> -	i915_sw_fence_complete(&e->fence);
> -	return -ENOMEM;
> -}
> -
> -static int clone_flags(struct i915_gem_context *dst,
> -		       struct i915_gem_context *src)
> -{
> -	dst->user_flags = src->user_flags;
> -	return 0;
> -}
> -
> -static int clone_schedattr(struct i915_gem_context *dst,
> -			   struct i915_gem_context *src)
> -{
> -	dst->sched = src->sched;
> -	return 0;
> -}
> -
> -static int clone_sseu(struct i915_gem_context *dst,
> -		      struct i915_gem_context *src)
> -{
> -	struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
> -	struct i915_gem_engines *clone;
> -	unsigned long n;
> -	int err;
> -
> -	/* no locking required; sole access under constructor*/
> -	clone = __context_engines_static(dst);
> -	if (e->num_engines != clone->num_engines) {
> -		err = -EINVAL;
> -		goto unlock;
> -	}
> -
> -	for (n = 0; n < e->num_engines; n++) {
> -		struct intel_context *ce = e->engines[n];
> -
> -		if (clone->engines[n]->engine->class != ce->engine->class) {
> -			/* Must have compatible engine maps! */
> -			err = -EINVAL;
> -			goto unlock;
> -		}
> -
> -		/* serialises with set_sseu */
> -		err = intel_context_lock_pinned(ce);
> -		if (err)
> -			goto unlock;
> -
> -		clone->engines[n]->sseu = ce->sseu;
> -		intel_context_unlock_pinned(ce);
> -	}
> -
> -	err = 0;
> -unlock:
> -	i915_gem_context_unlock_engines(src);
> -	return err;
> -}
> -
> -static int clone_timeline(struct i915_gem_context *dst,
> -			  struct i915_gem_context *src)
> -{
> -	if (src->timeline)
> -		__assign_timeline(dst, src->timeline);
> -
> -	return 0;
> -}
> -
> -static int clone_vm(struct i915_gem_context *dst,
> -		    struct i915_gem_context *src)
> -{
> -	struct i915_address_space *vm;
> -	int err = 0;
> -
> -	if (!rcu_access_pointer(src->vm))
> -		return 0;
> -
> -	rcu_read_lock();
> -	vm = context_get_vm_rcu(src);
> -	rcu_read_unlock();
> -
> -	if (!mutex_lock_interruptible(&dst->mutex)) {
> -		__assign_ppgtt(dst, vm);
> -		mutex_unlock(&dst->mutex);
> -	} else {
> -		err = -EINTR;
> -	}
> -
> -	i915_vm_put(vm);
> -	return err;
> -}
> -
> -static int create_clone(struct i915_user_extension __user *ext, void *data)
> -{
> -	static int (* const fn[])(struct i915_gem_context *dst,
> -				  struct i915_gem_context *src) = {
> -#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
> -		MAP(ENGINES, clone_engines),
> -		MAP(FLAGS, clone_flags),
> -		MAP(SCHEDATTR, clone_schedattr),
> -		MAP(SSEU, clone_sseu),
> -		MAP(TIMELINE, clone_timeline),
> -		MAP(VM, clone_vm),
> -#undef MAP
> -	};
> -	struct drm_i915_gem_context_create_ext_clone local;
> -	const struct create_ext *arg = data;
> -	struct i915_gem_context *dst = arg->ctx;
> -	struct i915_gem_context *src;
> -	int err, bit;
> -
> -	if (copy_from_user(&local, ext, sizeof(local)))
> -		return -EFAULT;
> -
> -	BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
> -		     I915_CONTEXT_CLONE_UNKNOWN);
> -
> -	if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
> -		return -EINVAL;
> -
> -	if (local.rsvd)
> -		return -EINVAL;
> -
> -	rcu_read_lock();
> -	src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
> -	rcu_read_unlock();
> -	if (!src)
> -		return -ENOENT;
> -
> -	GEM_BUG_ON(src == dst);
> -
> -	for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
> -		if (!(local.flags & BIT(bit)))
> -			continue;
> -
> -		err = fn[bit](dst, src);
> -		if (err)
> -			return err;
> -	}
> -
> -	return 0;
> +	return -EINVAL;
>  }
>  
>  static const i915_user_extension_fn create_extensions[] = {
>  	[I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
> -	[I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
> +	[I915_CONTEXT_CREATE_EXT_CLONE] = invalid_ext,
>  };
>  
>  static bool client_is_banned(struct drm_i915_file_private *file_priv)
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index a0aaa8298f28d..75a71b6756ed8 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1887,20 +1887,10 @@ struct drm_i915_gem_context_create_ext_setparam {
>  	struct drm_i915_gem_context_param param;
>  };
>  
> -struct drm_i915_gem_context_create_ext_clone {
> +/* This API has been removed.  On the off chance someone somewhere has
> + * attempted to use it, never re-use this extension number.
> + */
>  #define I915_CONTEXT_CREATE_EXT_CLONE 1

I think we need to put these somewhere else now; here it's just plain
lost. I think in the kerneldoc for
drm_i915_gem_context_create_ext_setparam would be best, with the #define
right above and in the kerneldoc an enumeration of all the values and what
they're for.

I think I'll need to sign up Matt B or you for doing some kerneldoc polish
on these so they're all collected together.
-Daniel

> -	struct i915_user_extension base;
> -	__u32 clone_id;
> -	__u32 flags;
> -#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
> -#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
> -#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
> -#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
> -#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
> -#define I915_CONTEXT_CLONE_VM		(1u << 5)
> -#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
> -	__u64 rsvd;
> -};
>  
>  struct drm_i915_gem_context_destroy {
>  	__u32 ctx_id;
> -- 
> 2.31.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27  9:55     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:55 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:16PM -0500, Jason Ekstrand wrote:
> This API is entirely unnecessary and I'd love to get rid of it.  If
> userspace wants a single timeline across multiple contexts, they can
> either use implicit synchronization or a syncobj, both of which existed
> at the time this feature landed.  The justification given at the time
> was that it would help GL drivers which are inherently single-timeline.
> However, neither of our GL drivers actually wanted the feature.  i965
> was already in maintenance mode at the time and iris uses syncobj for
> everything.
> 
> Unfortunately, as much as I'd love to get rid of it, it is used by the
> media driver so we can't do that.  We can, however, do the next-best
> thing which is to embed a syncobj in the context and do exactly what
> we'd expect from userspace internally.  This isn't an entirely identical
> implementation because it's no longer atomic if userspace races with
> itself by calling execbuffer2 twice simultaneously from different
> threads.  It won't crash in that case; it just doesn't guarantee any
> ordering between those two submits.
> 
> Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
> advantages beyond mere annoyance.  One is that intel_timeline is no
> longer an api-visible object and can remain entirely an implementation
> detail.  This may be advantageous as we make scheduler changes going
> forward.  Second is that, together with deleting the CLONE_CONTEXT API,
> we should now have a 1:1 mapping between intel_context and
> intel_timeline which may help us reduce locking.
> 
> v2 (Jason Ekstrand):
>  - Update the comment on i915_gem_context::syncobj to mention that it's
>    an emulation and the possible race if userspace calls execbuffer2
>    twice on the same context concurrently.
>  - Wrap the checks for eb.gem_context->syncobj in unlikely()
>  - Drop the dma_fence reference
>  - Improved commit message
> 
> v3 (Jason Ekstrand):
>  - Move the dma_fence_put() to before the error exit
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I'm assuming that igt coverage is good enough. Otoh, if CI didn't catch
that racing execbufs are now unsynced, maybe it wasn't good enough, but
whatever :-)
-Daniel


> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +++++--------------
>  .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +++++-
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 16 ++++++
>  3 files changed, 40 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 2c2fefa912805..a72c9b256723b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -67,6 +67,8 @@
>  #include <linux/log2.h>
>  #include <linux/nospec.h>
>  
> +#include <drm/drm_syncobj.h>
> +
>  #include "gt/gen6_ppgtt.h"
>  #include "gt/intel_context.h"
>  #include "gt/intel_context_param.h"
> @@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context *ce,
>  		ce->vm = vm;
>  	}
>  
> -	GEM_BUG_ON(ce->timeline);
> -	if (ctx->timeline)
> -		ce->timeline = intel_timeline_get(ctx->timeline);
> -
>  	if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
>  	    intel_engine_has_timeslices(ce->engine))
>  		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> @@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
>  	mutex_destroy(&ctx->engines_mutex);
>  	mutex_destroy(&ctx->lut_mutex);
>  
> -	if (ctx->timeline)
> -		intel_timeline_put(ctx->timeline);
> -
>  	put_pid(ctx->pid);
>  	mutex_destroy(&ctx->mutex);
>  
> @@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
>  	if (vm)
>  		i915_vm_close(vm);
>  
> +	if (ctx->syncobj)
> +		drm_syncobj_put(ctx->syncobj);
> +
>  	ctx->file_priv = ERR_PTR(-EBADF);
>  
>  	/*
> @@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
>  		i915_vm_close(vm);
>  }
>  
> -static void __set_timeline(struct intel_timeline **dst,
> -			   struct intel_timeline *src)
> -{
> -	struct intel_timeline *old = *dst;
> -
> -	*dst = src ? intel_timeline_get(src) : NULL;
> -
> -	if (old)
> -		intel_timeline_put(old);
> -}
> -
> -static void __apply_timeline(struct intel_context *ce, void *timeline)
> -{
> -	__set_timeline(&ce->timeline, timeline);
> -}
> -
> -static void __assign_timeline(struct i915_gem_context *ctx,
> -			      struct intel_timeline *timeline)
> -{
> -	__set_timeline(&ctx->timeline, timeline);
> -	context_apply_all(ctx, __apply_timeline, timeline);
> -}
> -
>  static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  {
>  	struct i915_gem_context *ctx;
> +	int ret;
>  
>  	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
>  	    !HAS_EXECLISTS(i915))
> @@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  	}
>  
>  	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> -		struct intel_timeline *timeline;
> -
> -		timeline = intel_timeline_create(&i915->gt);
> -		if (IS_ERR(timeline)) {
> +		ret = drm_syncobj_create(&ctx->syncobj,
> +					 DRM_SYNCOBJ_CREATE_SIGNALED,
> +					 NULL);
> +		if (ret) {
>  			context_close(ctx);
> -			return ERR_CAST(timeline);
> +			return ERR_PTR(ret);
>  		}
> -
> -		__assign_timeline(ctx, timeline);
> -		intel_timeline_put(timeline);
>  	}
>  
>  	trace_i915_context_create(ctx);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 676592e27e7d2..df76767f0c41b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -83,7 +83,19 @@ struct i915_gem_context {
>  	struct i915_gem_engines __rcu *engines;
>  	struct mutex engines_mutex; /* guards writes to engines */
>  
> -	struct intel_timeline *timeline;
> +	/**
> +	 * @syncobj: Shared timeline syncobj
> +	 *
> +	 * When the SHARED_TIMELINE flag is set on context creation, we
> +	 * emulate a single timeline across all engines using this syncobj.
> +	 * For every execbuffer2 call, this syncobj is used as both an in-
> +	 * and out-fence.  Unlike the real intel_timeline, this doesn't
> +	 * provide perfect atomic in-order guarantees if the client races
> +	 * with itself by calling execbuffer2 twice concurrently.  However,
> +	 * if userspace races with itself, that's not likely to yield well-
> +	 * defined results anyway so we choose to not care.
> +	 */
> +	struct drm_syncobj *syncobj;
>  
>  	/**
>  	 * @vm: unique address space (GTT)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index b812f313422a9..d640bba6ad9ab 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  		goto err_vma;
>  	}
>  
> +	if (unlikely(eb.gem_context->syncobj)) {
> +		struct dma_fence *fence;
> +
> +		fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
> +		err = i915_request_await_dma_fence(eb.request, fence);
> +		dma_fence_put(fence);
> +		if (err)
> +			goto err_ext;
> +	}
> +
>  	if (in_fence) {
>  		if (args->flags & I915_EXEC_FENCE_SUBMIT)
>  			err = i915_request_await_execution(eb.request,
> @@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  			fput(out_fence->file);
>  		}
>  	}
> +
> +	if (unlikely(eb.gem_context->syncobj)) {
> +		drm_syncobj_replace_fence(eb.gem_context->syncobj,
> +					  &eb.request->fence);
> +	}
> +
>  	i915_request_put(eb.request);
>  
>  err_vma:
> -- 
> 2.31.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 07/21] drm/i915: Drop getparam support for I915_CONTEXT_PARAM_ENGINES
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27  9:58     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27  9:58 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:17PM -0500, Jason Ekstrand wrote:
> This has never been used by any userspace except IGT and provides no
> real functionality beyond parroting back parameters userspace passed in
> as part of context creation or via setparam.  If the context is in
> legacy mode (where you use I915_EXEC_RENDER and friends), it returns
> success with zero data so it's not useful for discovering what engines
> are in the context.  It's also not a replacement for the recently
> removed I915_CONTEXT_CLONE_ENGINES because it doesn't return any of the
> balancing or bonding information.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 77 +--------------------
>  1 file changed, 1 insertion(+), 76 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index a72c9b256723b..e8179918fa306 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1725,78 +1725,6 @@ set_engines(struct i915_gem_context *ctx,
>  	return 0;
>  }
>  
> -static int
> -get_engines(struct i915_gem_context *ctx,
> -	    struct drm_i915_gem_context_param *args)
> -{
> -	struct i915_context_param_engines __user *user;
> -	struct i915_gem_engines *e;
> -	size_t n, count, size;
> -	bool user_engines;
> -	int err = 0;
> -
> -	e = __context_engines_await(ctx, &user_engines);
> -	if (!e)
> -		return -ENOENT;
> -
> -	if (!user_engines) {
> -		i915_sw_fence_complete(&e->fence);
> -		args->size = 0;
> -		return 0;
> -	}
> -
> -	count = e->num_engines;
> -
> -	/* Be paranoid in case we have an impedance mismatch */
> -	if (!check_struct_size(user, engines, count, &size)) {
> -		err = -EINVAL;
> -		goto err_free;
> -	}
> -	if (overflows_type(size, args->size)) {
> -		err = -EINVAL;
> -		goto err_free;
> -	}
> -
> -	if (!args->size) {
> -		args->size = size;
> -		goto err_free;
> -	}
> -
> -	if (args->size < size) {
> -		err = -EINVAL;
> -		goto err_free;
> -	}
> -
> -	user = u64_to_user_ptr(args->value);
> -	if (put_user(0, &user->extensions)) {
> -		err = -EFAULT;
> -		goto err_free;
> -	}
> -
> -	for (n = 0; n < count; n++) {
> -		struct i915_engine_class_instance ci = {
> -			.engine_class = I915_ENGINE_CLASS_INVALID,
> -			.engine_instance = I915_ENGINE_CLASS_INVALID_NONE,
> -		};
> -
> -		if (e->engines[n]) {
> -			ci.engine_class = e->engines[n]->engine->uabi_class;
> -			ci.engine_instance = e->engines[n]->engine->uabi_instance;
> -		}
> -
> -		if (copy_to_user(&user->engines[n], &ci, sizeof(ci))) {
> -			err = -EFAULT;
> -			goto err_free;
> -		}
> -	}
> -
> -	args->size = size;
> -
> -err_free:
> -	i915_sw_fence_complete(&e->fence);
> -	return err;
> -}
> -
>  static int
>  set_persistence(struct i915_gem_context *ctx,
>  		const struct drm_i915_gem_context_param *args)
> @@ -2127,10 +2055,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  		ret = get_ppgtt(file_priv, ctx, args);
>  		break;
>  
> -	case I915_CONTEXT_PARAM_ENGINES:
> -		ret = get_engines(ctx, args);
> -		break;
> -
>  	case I915_CONTEXT_PARAM_PERSISTENCE:
>  		args->size = 0;
>  		args->value = i915_gem_context_is_persistent(ctx);
> @@ -2138,6 +2062,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  
>  	case I915_CONTEXT_PARAM_NO_ZEROMAP:
>  	case I915_CONTEXT_PARAM_BAN_PERIOD:
> +	case I915_CONTEXT_PARAM_ENGINES:
>  	case I915_CONTEXT_PARAM_RINGSIZE:

I like how this list keeps growing. Same thing as usual about "pls check
igt coverage".

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  	default:
>  		ret = -EINVAL;
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27 13:51     ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-27 13:51 UTC (permalink / raw)
  To: Intel GFX, Maling list - DRI developers

On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> This adds a bunch of complexity which the media driver has never
> actually used.  The media driver does technically bond a balanced engine
> to another engine but the balanced engine only has one engine in the
> sibling set.  This doesn't actually result in a virtual engine.
>
> Unless some userspace badly wants it, there's no good reason to support
> this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> leave the validation code in place in case we ever decide we want to do
> something interesting with the bonding information.
>
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
>  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
>  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
>  6 files changed, 7 insertions(+), 353 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e8179918fa306..5f8d0faf783aa 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>         }
>         virtual = set->engines->engines[idx]->engine;
>
> +       if (intel_engine_is_virtual(virtual)) {
> +               drm_dbg(&i915->drm,
> +                       "Bonding with virtual engines not allowed\n");
> +               return -EINVAL;
> +       }
> +
>         err = check_user_mbz(&ext->flags);
>         if (err)
>                 return err;
> @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>                                 n, ci.engine_class, ci.engine_instance);
>                         return -EINVAL;
>                 }
> -
> -               /*
> -                * A non-virtual engine has no siblings to choose between; and
> -                * a submit fence will always be directed to the one engine.
> -                */
> -               if (intel_engine_is_virtual(virtual)) {
> -                       err = intel_virtual_engine_attach_bond(virtual,
> -                                                              master,
> -                                                              bond);
> -                       if (err)
> -                               return err;
> -               }
>         }
>
>         return 0;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d640bba6ad9ab..efb2fa3522a42 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
>                         err = i915_request_await_execution(eb.request,
>                                                            in_fence,
> -                                                          eb.engine->bond_execute);
> +                                                          NULL);
>                 else
>                         err = i915_request_await_dma_fence(eb.request,
>                                                            in_fence);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 883bafc449024..68cfe5080325c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -446,13 +446,6 @@ struct intel_engine_cs {
>          */
>         void            (*submit_request)(struct i915_request *rq);
>
> -       /*
> -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> -        * request down to the bonded pairs.
> -        */
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-27 13:51     ` Jason Ekstrand
  0 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-27 13:51 UTC (permalink / raw)
  To: Intel GFX, Maling list - DRI developers

On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> This adds a bunch of complexity which the media driver has never
> actually used.  The media driver does technically bond a balanced engine
> to another engine but the balanced engine only has one engine in the
> sibling set.  This doesn't actually result in a virtual engine.
>
> Unless some userspace badly wants it, there's no good reason to support
> this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> leave the validation code in place in case we ever decide we want to do
> something interesting with the bonding information.
>
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
>  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
>  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
>  6 files changed, 7 insertions(+), 353 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e8179918fa306..5f8d0faf783aa 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>         }
>         virtual = set->engines->engines[idx]->engine;
>
> +       if (intel_engine_is_virtual(virtual)) {
> +               drm_dbg(&i915->drm,
> +                       "Bonding with virtual engines not allowed\n");
> +               return -EINVAL;
> +       }
> +
>         err = check_user_mbz(&ext->flags);
>         if (err)
>                 return err;
> @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>                                 n, ci.engine_class, ci.engine_instance);
>                         return -EINVAL;
>                 }
> -
> -               /*
> -                * A non-virtual engine has no siblings to choose between; and
> -                * a submit fence will always be directed to the one engine.
> -                */
> -               if (intel_engine_is_virtual(virtual)) {
> -                       err = intel_virtual_engine_attach_bond(virtual,
> -                                                              master,
> -                                                              bond);
> -                       if (err)
> -                               return err;
> -               }
>         }
>
>         return 0;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d640bba6ad9ab..efb2fa3522a42 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
>                         err = i915_request_await_execution(eb.request,
>                                                            in_fence,
> -                                                          eb.engine->bond_execute);
> +                                                          NULL);
>                 else
>                         err = i915_request_await_dma_fence(eb.request,
>                                                            in_fence);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 883bafc449024..68cfe5080325c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -446,13 +446,6 @@ struct intel_engine_cs {
>          */
>         void            (*submit_request)(struct i915_request *rq);
>
> -       /*
> -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> -        * request down to the bonded pairs.
> -        */
> -       void            (*bond_execute)(struct i915_request *rq,
> -                                       struct dma_fence *signal);
> -
>         /*
>          * Call when the priority on a request has changed and it and its
>          * dependencies may need rescheduling. Note the request itself may
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index de124870af44d..b6e2b59f133b7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -181,18 +181,6 @@ struct virtual_engine {
>                 int prio;
>         } nodes[I915_NUM_ENGINES];
>
> -       /*
> -        * Keep track of bonded pairs -- restrictions upon on our selection
> -        * of physical engines any particular request may be submitted to.
> -        * If we receive a submit-fence from a master engine, we will only
> -        * use one of sibling_mask physical engines.
> -        */
> -       struct ve_bond {
> -               const struct intel_engine_cs *master;
> -               intel_engine_mask_t sibling_mask;
> -       } *bonds;
> -       unsigned int num_bonds;
> -
>         /* And finally, which physical engines this virtual engine maps onto. */
>         unsigned int num_siblings;
>         struct intel_engine_cs *siblings[];
> @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
>         intel_breadcrumbs_free(ve->base.breadcrumbs);
>         intel_engine_free_request_pool(&ve->base);
>
> -       kfree(ve->bonds);
>         kfree(ve);
>  }
>
> @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
>         spin_unlock_irqrestore(&ve->base.active.lock, flags);
>  }
>
> -static struct ve_bond *
> -virtual_find_bond(struct virtual_engine *ve,
> -                 const struct intel_engine_cs *master)
> -{
> -       int i;
> -
> -       for (i = 0; i < ve->num_bonds; i++) {
> -               if (ve->bonds[i].master == master)
> -                       return &ve->bonds[i];
> -       }
> -
> -       return NULL;
> -}
> -
> -static void
> -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> -{
> -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> -       intel_engine_mask_t allowed, exec;
> -       struct ve_bond *bond;
> -
> -       allowed = ~to_request(signal)->engine->mask;
> -
> -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> -       if (bond)
> -               allowed &= bond->sibling_mask;
> -
> -       /* Restrict the bonded request to run on only the available engines */
> -       exec = READ_ONCE(rq->execution_mask);
> -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> -               ;
> -
> -       /* Prevent the master from being re-run on the bonded engines */
> -       to_request(signal)->execution_mask &= ~allowed;

I sent a v2 of this patch because it turns out I deleted a bit too
much code.  This function in particular has to stay, unfortunately.
When a batch is submitted with a SUBMIT_FENCE, this is used to push
the work onto a different engine than the one it's supposed to
run in parallel with.  This means we can't dead-code this function or
the bond_execute function pointer and related stuff.

--Jason


> -}
> -
>  struct intel_context *
>  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>                                unsigned int count)
> @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>
>         ve->base.schedule = i915_schedule;
>         ve->base.submit_request = virtual_submit_request;
> -       ve->base.bond_execute = virtual_bond_execute;
>
>         INIT_LIST_HEAD(virtual_queue(ve));
>         ve->base.execlists.queue_priority_hint = INT_MIN;
> @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
>         if (IS_ERR(dst))
>                 return dst;
>
> -       if (se->num_bonds) {
> -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> -
> -               de->bonds = kmemdup(se->bonds,
> -                                   sizeof(*se->bonds) * se->num_bonds,
> -                                   GFP_KERNEL);
> -               if (!de->bonds) {
> -                       intel_context_put(dst);
> -                       return ERR_PTR(-ENOMEM);
> -               }
> -
> -               de->num_bonds = se->num_bonds;
> -       }
> -
>         return dst;
>  }
>
> -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> -                                    const struct intel_engine_cs *master,
> -                                    const struct intel_engine_cs *sibling)
> -{
> -       struct virtual_engine *ve = to_virtual_engine(engine);
> -       struct ve_bond *bond;
> -       int n;
> -
> -       /* Sanity check the sibling is part of the virtual engine */
> -       for (n = 0; n < ve->num_siblings; n++)
> -               if (sibling == ve->siblings[n])
> -                       break;
> -       if (n == ve->num_siblings)
> -               return -EINVAL;
> -
> -       bond = virtual_find_bond(ve, master);
> -       if (bond) {
> -               bond->sibling_mask |= sibling->mask;
> -               return 0;
> -       }
> -
> -       bond = krealloc(ve->bonds,
> -                       sizeof(*bond) * (ve->num_bonds + 1),
> -                       GFP_KERNEL);
> -       if (!bond)
> -               return -ENOMEM;
> -
> -       bond[ve->num_bonds].master = master;
> -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> -
> -       ve->bonds = bond;
> -       ve->num_bonds++;
> -
> -       return 0;
> -}
> -
>  void intel_execlists_show_requests(struct intel_engine_cs *engine,
>                                    struct drm_printer *m,
>                                    void (*show_request)(struct drm_printer *m,
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> index fd61dae820e9e..80cec37a56ba9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>  struct intel_context *
>  intel_execlists_clone_virtual(struct intel_engine_cs *src);
>
> -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> -                                    const struct intel_engine_cs *master,
> -                                    const struct intel_engine_cs *sibling);
> -
>  bool
>  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> index 1081cd36a2bd3..f03446d587160 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
>         return 0;
>  }
>
> -static int bond_virtual_engine(struct intel_gt *gt,
> -                              unsigned int class,
> -                              struct intel_engine_cs **siblings,
> -                              unsigned int nsibling,
> -                              unsigned int flags)
> -#define BOND_SCHEDULE BIT(0)
> -{
> -       struct intel_engine_cs *master;
> -       struct i915_request *rq[16];
> -       enum intel_engine_id id;
> -       struct igt_spinner spin;
> -       unsigned long n;
> -       int err;
> -
> -       /*
> -        * A set of bonded requests is intended to be run concurrently
> -        * across a number of engines. We use one request per-engine
> -        * and a magic fence to schedule each of the bonded requests
> -        * at the same time. A consequence of our current scheduler is that
> -        * we only move requests to the HW ready queue when the request
> -        * becomes ready, that is when all of its prerequisite fences have
> -        * been signaled. As one of those fences is the master submit fence,
> -        * there is a delay on all secondary fences as the HW may be
> -        * currently busy. Equally, as all the requests are independent,
> -        * they may have other fences that delay individual request
> -        * submission to HW. Ergo, we do not guarantee that all requests are
> -        * immediately submitted to HW at the same time, just that if the
> -        * rules are abided by, they are ready at the same time as the
> -        * first is submitted. Userspace can embed semaphores in its batch
> -        * to ensure parallel execution of its phases as it requires.
> -        * Though naturally it gets requested that perhaps the scheduler should
> -        * take care of parallel execution, even across preemption events on
> -        * different HW. (The proper answer is of course "lalalala".)
> -        *
> -        * With the submit-fence, we have identified three possible phases
> -        * of synchronisation depending on the master fence: queued (not
> -        * ready), executing, and signaled. The first two are quite simple
> -        * and checked below. However, the signaled master fence handling is
> -        * contentious. Currently we do not distinguish between a signaled
> -        * fence and an expired fence, as once signaled it does not convey
> -        * any information about the previous execution. It may even be freed
> -        * and hence checking later it may not exist at all. Ergo we currently
> -        * do not apply the bonding constraint for an already signaled fence,
> -        * as our expectation is that it should not constrain the secondaries
> -        * and is outside of the scope of the bonded request API (i.e. all
> -        * userspace requests are meant to be running in parallel). As
> -        * it imposes no constraint, and is effectively a no-op, we do not
> -        * check below as normal execution flows are checked extensively above.
> -        *
> -        * XXX Is the degenerate handling of signaled submit fences the
> -        * expected behaviour for userpace?
> -        */
> -
> -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> -
> -       if (igt_spinner_init(&spin, gt))
> -               return -ENOMEM;
> -
> -       err = 0;
> -       rq[0] = ERR_PTR(-ENOMEM);
> -       for_each_engine(master, gt, id) {
> -               struct i915_sw_fence fence = {};
> -               struct intel_context *ce;
> -
> -               if (master->class == class)
> -                       continue;
> -
> -               ce = intel_context_create(master);
> -               if (IS_ERR(ce)) {
> -                       err = PTR_ERR(ce);
> -                       goto out;
> -               }
> -
> -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> -
> -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> -               intel_context_put(ce);
> -               if (IS_ERR(rq[0])) {
> -                       err = PTR_ERR(rq[0]);
> -                       goto out;
> -               }
> -               i915_request_get(rq[0]);
> -
> -               if (flags & BOND_SCHEDULE) {
> -                       onstack_fence_init(&fence);
> -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> -                                                              &fence,
> -                                                              GFP_KERNEL);
> -               }
> -
> -               i915_request_add(rq[0]);
> -               if (err < 0)
> -                       goto out;
> -
> -               if (!(flags & BOND_SCHEDULE) &&
> -                   !igt_wait_for_spinner(&spin, rq[0])) {
> -                       err = -EIO;
> -                       goto out;
> -               }
> -
> -               for (n = 0; n < nsibling; n++) {
> -                       struct intel_context *ve;
> -
> -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> -                       if (IS_ERR(ve)) {
> -                               err = PTR_ERR(ve);
> -                               onstack_fence_fini(&fence);
> -                               goto out;
> -                       }
> -
> -                       err = intel_virtual_engine_attach_bond(ve->engine,
> -                                                              master,
> -                                                              siblings[n]);
> -                       if (err) {
> -                               intel_context_put(ve);
> -                               onstack_fence_fini(&fence);
> -                               goto out;
> -                       }
> -
> -                       err = intel_context_pin(ve);
> -                       intel_context_put(ve);
> -                       if (err) {
> -                               onstack_fence_fini(&fence);
> -                               goto out;
> -                       }
> -
> -                       rq[n + 1] = i915_request_create(ve);
> -                       intel_context_unpin(ve);
> -                       if (IS_ERR(rq[n + 1])) {
> -                               err = PTR_ERR(rq[n + 1]);
> -                               onstack_fence_fini(&fence);
> -                               goto out;
> -                       }
> -                       i915_request_get(rq[n + 1]);
> -
> -                       err = i915_request_await_execution(rq[n + 1],
> -                                                          &rq[0]->fence,
> -                                                          ve->engine->bond_execute);
> -                       i915_request_add(rq[n + 1]);
> -                       if (err < 0) {
> -                               onstack_fence_fini(&fence);
> -                               goto out;
> -                       }
> -               }
> -               onstack_fence_fini(&fence);
> -               intel_engine_flush_submission(master);
> -               igt_spinner_end(&spin);
> -
> -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> -                       pr_err("Master request did not execute (on %s)!\n",
> -                              rq[0]->engine->name);
> -                       err = -EIO;
> -                       goto out;
> -               }
> -
> -               for (n = 0; n < nsibling; n++) {
> -                       if (i915_request_wait(rq[n + 1], 0,
> -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> -                               err = -EIO;
> -                               goto out;
> -                       }
> -
> -                       if (rq[n + 1]->engine != siblings[n]) {
> -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> -                                      siblings[n]->name,
> -                                      rq[n + 1]->engine->name,
> -                                      rq[0]->engine->name);
> -                               err = -EINVAL;
> -                               goto out;
> -                       }
> -               }
> -
> -               for (n = 0; !IS_ERR(rq[n]); n++)
> -                       i915_request_put(rq[n]);
> -               rq[0] = ERR_PTR(-ENOMEM);
> -       }
> -
> -out:
> -       for (n = 0; !IS_ERR(rq[n]); n++)
> -               i915_request_put(rq[n]);
> -       if (igt_flush_test(gt->i915))
> -               err = -EIO;
> -
> -       igt_spinner_fini(&spin);
> -       return err;
> -}
> -
> -static int live_virtual_bond(void *arg)
> -{
> -       static const struct phase {
> -               const char *name;
> -               unsigned int flags;
> -       } phases[] = {
> -               { "", 0 },
> -               { "schedule", BOND_SCHEDULE },
> -               { },
> -       };
> -       struct intel_gt *gt = arg;
> -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> -       unsigned int class;
> -       int err;
> -
> -       if (intel_uc_uses_guc_submission(&gt->uc))
> -               return 0;
> -
> -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> -               const struct phase *p;
> -               int nsibling;
> -
> -               nsibling = select_siblings(gt, class, siblings);
> -               if (nsibling < 2)
> -                       continue;
> -
> -               for (p = phases; p->name; p++) {
> -                       err = bond_virtual_engine(gt,
> -                                                 class, siblings, nsibling,
> -                                                 p->flags);
> -                       if (err) {
> -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> -                                      __func__, p->name, class, nsibling, err);
> -                               return err;
> -                       }
> -               }
> -       }
> -
> -       return 0;
> -}
> -
>  static int reset_virtual_engine(struct intel_gt *gt,
>                                 struct intel_engine_cs **siblings,
>                                 unsigned int nsibling)
> @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
>                 SUBTEST(live_virtual_mask),
>                 SUBTEST(live_virtual_preserved),
>                 SUBTEST(live_virtual_slice),
> -               SUBTEST(live_virtual_bond),
>                 SUBTEST(live_virtual_reset),
>         };
>
> --
> 2.31.1
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [Intel-gfx] [PATCH 08/20] drm/i915/gem: Disallow bonding of virtual engines (v2)
  2021-04-26 23:43     ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-27 13:58       ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-27 13:58 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Mon, Apr 26, 2021 at 06:43:30PM -0500, Jason Ekstrand wrote:
> This adds a bunch of complexity which the media driver has never
> actually used.  The media driver does technically bond a balanced engine
> to another engine but the balanced engine only has one engine in the
> sibling set.  This doesn't actually result in a virtual engine.

Have you triple-checked this by running the media stack with bonding? Also,
this needs acks from the media side; please Cc Carl & Tony.

I think you should also explain in a bit more detail why exactly the bonded
submit thing is a no-op and what the implications are, since it took me a
while to get that. Plus you missed the entire SUBMIT_FENCE entertainment,
so obviously this isn't very obvious :-)
 
> Unless some userspace badly wants it, there's no good reason to support
> this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> leave the validation code in place in case we ever decide we want to do
> something interesting with the bonding information.
> 
> v2 (Jason Ekstrand):
>  - Don't delete quite as much code.  Some of it was necessary.

Please explain the details here, after all this is rather tricky ...

> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

So this just stops the uapi and immediate things. But since I've looked
around in how this works I think it'd be worth it to throw a backend
cleanup task on top. Not the entire thing, but just the most egregious
detail:

One thing the submit fence does, aside from holding up the subsequent
batches until the first one is scheduled, is limit the set of engines to
the right pair - which we know once the engine is selected for the first
batch. That's done with some lockless trickery in the await fence callback
(iirc, would need to double-check) with cmpxchg. If we can delete that in
a follow-up, assuming it's really not pulling in an entire string of
things, I think that would be rather nice clarification on what's possible
or not possible wrt execlist backend scheduling.

I'd like to do this now because unlike all the rcu stuff it's a lot harder
to find it again and realize it's all dead code now. With the rcu/locking
stuff I'm much less worried about leaving complexity behind that we don't
realize isn't needed anymore.

Also we really need to make sure we can get away with this before we
commit to anything I think ...

Code itself looks reasonable, but I'll wait for r-b stamping until the
commit message is more polished.
-Daniel

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
>  .../drm/i915/gt/intel_execlists_submission.c  |  83 -------
>  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
>  4 files changed, 6 insertions(+), 328 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e8179918fa306..5f8d0faf783aa 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>  	}
>  	virtual = set->engines->engines[idx]->engine;
>  
> +	if (intel_engine_is_virtual(virtual)) {
> +		drm_dbg(&i915->drm,
> +			"Bonding with virtual engines not allowed\n");
> +		return -EINVAL;
> +	}
> +
>  	err = check_user_mbz(&ext->flags);
>  	if (err)
>  		return err;
> @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>  				n, ci.engine_class, ci.engine_instance);
>  			return -EINVAL;
>  		}
> -
> -		/*
> -		 * A non-virtual engine has no siblings to choose between; and
> -		 * a submit fence will always be directed to the one engine.
> -		 */
> -		if (intel_engine_is_virtual(virtual)) {
> -			err = intel_virtual_engine_attach_bond(virtual,
> -							       master,
> -							       bond);
> -			if (err)
> -				return err;
> -		}
>  	}
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index de124870af44d..a6204c60b59cb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -181,18 +181,6 @@ struct virtual_engine {
>  		int prio;
>  	} nodes[I915_NUM_ENGINES];
>  
> -	/*
> -	 * Keep track of bonded pairs -- restrictions upon on our selection
> -	 * of physical engines any particular request may be submitted to.
> -	 * If we receive a submit-fence from a master engine, we will only
> -	 * use one of sibling_mask physical engines.
> -	 */
> -	struct ve_bond {
> -		const struct intel_engine_cs *master;
> -		intel_engine_mask_t sibling_mask;
> -	} *bonds;
> -	unsigned int num_bonds;
> -
>  	/* And finally, which physical engines this virtual engine maps onto. */
>  	unsigned int num_siblings;
>  	struct intel_engine_cs *siblings[];
> @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
>  	intel_breadcrumbs_free(ve->base.breadcrumbs);
>  	intel_engine_free_request_pool(&ve->base);
>  
> -	kfree(ve->bonds);
>  	kfree(ve);
>  }
>  
> @@ -3560,33 +3547,13 @@ static void virtual_submit_request(struct i915_request *rq)
>  	spin_unlock_irqrestore(&ve->base.active.lock, flags);
>  }
>  
> -static struct ve_bond *
> -virtual_find_bond(struct virtual_engine *ve,
> -		  const struct intel_engine_cs *master)
> -{
> -	int i;
> -
> -	for (i = 0; i < ve->num_bonds; i++) {
> -		if (ve->bonds[i].master == master)
> -			return &ve->bonds[i];
> -	}
> -
> -	return NULL;
> -}
> -
>  static void
>  virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
>  {
> -	struct virtual_engine *ve = to_virtual_engine(rq->engine);
>  	intel_engine_mask_t allowed, exec;
> -	struct ve_bond *bond;
>  
>  	allowed = ~to_request(signal)->engine->mask;
>  
> -	bond = virtual_find_bond(ve, to_request(signal)->engine);
> -	if (bond)
> -		allowed &= bond->sibling_mask;
> -
>  	/* Restrict the bonded request to run on only the available engines */
>  	exec = READ_ONCE(rq->execution_mask);
>  	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> @@ -3747,59 +3714,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
>  	if (IS_ERR(dst))
>  		return dst;
>  
> -	if (se->num_bonds) {
> -		struct virtual_engine *de = to_virtual_engine(dst->engine);
> -
> -		de->bonds = kmemdup(se->bonds,
> -				    sizeof(*se->bonds) * se->num_bonds,
> -				    GFP_KERNEL);
> -		if (!de->bonds) {
> -			intel_context_put(dst);
> -			return ERR_PTR(-ENOMEM);
> -		}
> -
> -		de->num_bonds = se->num_bonds;
> -	}
> -
>  	return dst;
>  }
>  
> -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> -				     const struct intel_engine_cs *master,
> -				     const struct intel_engine_cs *sibling)
> -{
> -	struct virtual_engine *ve = to_virtual_engine(engine);
> -	struct ve_bond *bond;
> -	int n;
> -
> -	/* Sanity check the sibling is part of the virtual engine */
> -	for (n = 0; n < ve->num_siblings; n++)
> -		if (sibling == ve->siblings[n])
> -			break;
> -	if (n == ve->num_siblings)
> -		return -EINVAL;
> -
> -	bond = virtual_find_bond(ve, master);
> -	if (bond) {
> -		bond->sibling_mask |= sibling->mask;
> -		return 0;
> -	}
> -
> -	bond = krealloc(ve->bonds,
> -			sizeof(*bond) * (ve->num_bonds + 1),
> -			GFP_KERNEL);
> -	if (!bond)
> -		return -ENOMEM;
> -
> -	bond[ve->num_bonds].master = master;
> -	bond[ve->num_bonds].sibling_mask = sibling->mask;
> -
> -	ve->bonds = bond;
> -	ve->num_bonds++;
> -
> -	return 0;
> -}
> -
>  void intel_execlists_show_requests(struct intel_engine_cs *engine,
>  				   struct drm_printer *m,
>  				   void (*show_request)(struct drm_printer *m,
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> index fd61dae820e9e..80cec37a56ba9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>  struct intel_context *
>  intel_execlists_clone_virtual(struct intel_engine_cs *src);
>  
> -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> -				     const struct intel_engine_cs *master,
> -				     const struct intel_engine_cs *sibling);
> -
>  bool
>  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
>  
> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> index 1081cd36a2bd3..f03446d587160 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
>  	return 0;
>  }
>  
> -static int bond_virtual_engine(struct intel_gt *gt,
> -			       unsigned int class,
> -			       struct intel_engine_cs **siblings,
> -			       unsigned int nsibling,
> -			       unsigned int flags)
> -#define BOND_SCHEDULE BIT(0)
> -{
> -	struct intel_engine_cs *master;
> -	struct i915_request *rq[16];
> -	enum intel_engine_id id;
> -	struct igt_spinner spin;
> -	unsigned long n;
> -	int err;
> -
> -	/*
> -	 * A set of bonded requests is intended to be run concurrently
> -	 * across a number of engines. We use one request per-engine
> -	 * and a magic fence to schedule each of the bonded requests
> -	 * at the same time. A consequence of our current scheduler is that
> -	 * we only move requests to the HW ready queue when the request
> -	 * becomes ready, that is when all of its prerequisite fences have
> -	 * been signaled. As one of those fences is the master submit fence,
> -	 * there is a delay on all secondary fences as the HW may be
> -	 * currently busy. Equally, as all the requests are independent,
> -	 * they may have other fences that delay individual request
> -	 * submission to HW. Ergo, we do not guarantee that all requests are
> -	 * immediately submitted to HW at the same time, just that if the
> -	 * rules are abided by, they are ready at the same time as the
> -	 * first is submitted. Userspace can embed semaphores in its batch
> -	 * to ensure parallel execution of its phases as it requires.
> -	 * Though naturally it gets requested that perhaps the scheduler should
> -	 * take care of parallel execution, even across preemption events on
> -	 * different HW. (The proper answer is of course "lalalala".)
> -	 *
> -	 * With the submit-fence, we have identified three possible phases
> -	 * of synchronisation depending on the master fence: queued (not
> -	 * ready), executing, and signaled. The first two are quite simple
> -	 * and checked below. However, the signaled master fence handling is
> -	 * contentious. Currently we do not distinguish between a signaled
> -	 * fence and an expired fence, as once signaled it does not convey
> -	 * any information about the previous execution. It may even be freed
> -	 * and hence checking later it may not exist at all. Ergo we currently
> -	 * do not apply the bonding constraint for an already signaled fence,
> -	 * as our expectation is that it should not constrain the secondaries
> -	 * and is outside of the scope of the bonded request API (i.e. all
> -	 * userspace requests are meant to be running in parallel). As
> -	 * it imposes no constraint, and is effectively a no-op, we do not
> -	 * check below as normal execution flows are checked extensively above.
> -	 *
> -	 * XXX Is the degenerate handling of signaled submit fences the
> -	 * expected behaviour for userpace?
> -	 */
> -
> -	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> -
> -	if (igt_spinner_init(&spin, gt))
> -		return -ENOMEM;
> -
> -	err = 0;
> -	rq[0] = ERR_PTR(-ENOMEM);
> -	for_each_engine(master, gt, id) {
> -		struct i915_sw_fence fence = {};
> -		struct intel_context *ce;
> -
> -		if (master->class == class)
> -			continue;
> -
> -		ce = intel_context_create(master);
> -		if (IS_ERR(ce)) {
> -			err = PTR_ERR(ce);
> -			goto out;
> -		}
> -
> -		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> -
> -		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> -		intel_context_put(ce);
> -		if (IS_ERR(rq[0])) {
> -			err = PTR_ERR(rq[0]);
> -			goto out;
> -		}
> -		i915_request_get(rq[0]);
> -
> -		if (flags & BOND_SCHEDULE) {
> -			onstack_fence_init(&fence);
> -			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> -							       &fence,
> -							       GFP_KERNEL);
> -		}
> -
> -		i915_request_add(rq[0]);
> -		if (err < 0)
> -			goto out;
> -
> -		if (!(flags & BOND_SCHEDULE) &&
> -		    !igt_wait_for_spinner(&spin, rq[0])) {
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			struct intel_context *ve;
> -
> -			ve = intel_execlists_create_virtual(siblings, nsibling);
> -			if (IS_ERR(ve)) {
> -				err = PTR_ERR(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_virtual_engine_attach_bond(ve->engine,
> -							       master,
> -							       siblings[n]);
> -			if (err) {
> -				intel_context_put(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_context_pin(ve);
> -			intel_context_put(ve);
> -			if (err) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			rq[n + 1] = i915_request_create(ve);
> -			intel_context_unpin(ve);
> -			if (IS_ERR(rq[n + 1])) {
> -				err = PTR_ERR(rq[n + 1]);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -			i915_request_get(rq[n + 1]);
> -
> -			err = i915_request_await_execution(rq[n + 1],
> -							   &rq[0]->fence,
> -							   ve->engine->bond_execute);
> -			i915_request_add(rq[n + 1]);
> -			if (err < 0) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -		}
> -		onstack_fence_fini(&fence);
> -		intel_engine_flush_submission(master);
> -		igt_spinner_end(&spin);
> -
> -		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> -			pr_err("Master request did not execute (on %s)!\n",
> -			       rq[0]->engine->name);
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			if (i915_request_wait(rq[n + 1], 0,
> -					      MAX_SCHEDULE_TIMEOUT) < 0) {
> -				err = -EIO;
> -				goto out;
> -			}
> -
> -			if (rq[n + 1]->engine != siblings[n]) {
> -				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> -				       siblings[n]->name,
> -				       rq[n + 1]->engine->name,
> -				       rq[0]->engine->name);
> -				err = -EINVAL;
> -				goto out;
> -			}
> -		}
> -
> -		for (n = 0; !IS_ERR(rq[n]); n++)
> -			i915_request_put(rq[n]);
> -		rq[0] = ERR_PTR(-ENOMEM);
> -	}
> -
> -out:
> -	for (n = 0; !IS_ERR(rq[n]); n++)
> -		i915_request_put(rq[n]);
> -	if (igt_flush_test(gt->i915))
> -		err = -EIO;
> -
> -	igt_spinner_fini(&spin);
> -	return err;
> -}
> -
> -static int live_virtual_bond(void *arg)
> -{
> -	static const struct phase {
> -		const char *name;
> -		unsigned int flags;
> -	} phases[] = {
> -		{ "", 0 },
> -		{ "schedule", BOND_SCHEDULE },
> -		{ },
> -	};
> -	struct intel_gt *gt = arg;
> -	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> -	unsigned int class;
> -	int err;
> -
> -	if (intel_uc_uses_guc_submission(&gt->uc))
> -		return 0;
> -
> -	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> -		const struct phase *p;
> -		int nsibling;
> -
> -		nsibling = select_siblings(gt, class, siblings);
> -		if (nsibling < 2)
> -			continue;
> -
> -		for (p = phases; p->name; p++) {
> -			err = bond_virtual_engine(gt,
> -						  class, siblings, nsibling,
> -						  p->flags);
> -			if (err) {
> -				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> -				       __func__, p->name, class, nsibling, err);
> -				return err;
> -			}
> -		}
> -	}
> -
> -	return 0;
> -}
> -
>  static int reset_virtual_engine(struct intel_gt *gt,
>  				struct intel_engine_cs **siblings,
>  				unsigned int nsibling)
> @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
>  		SUBTEST(live_virtual_mask),
>  		SUBTEST(live_virtual_preserved),
>  		SUBTEST(live_virtual_slice),
> -		SUBTEST(live_virtual_bond),
>  		SUBTEST(live_virtual_reset),
>  	};
>  
> -- 
> 2.31.1

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

> -	 *
> -	 * With the submit-fence, we have identified three possible phases
> -	 * of synchronisation depending on the master fence: queued (not
> -	 * ready), executing, and signaled. The first two are quite simple
> -	 * and checked below. However, the signaled master fence handling is
> -	 * contentious. Currently we do not distinguish between a signaled
> -	 * fence and an expired fence, as once signaled it does not convey
> -	 * any information about the previous execution. It may even be freed
> -	 * and hence checking later it may not exist at all. Ergo we currently
> -	 * do not apply the bonding constraint for an already signaled fence,
> -	 * as our expectation is that it should not constrain the secondaries
> -	 * and is outside of the scope of the bonded request API (i.e. all
> -	 * userspace requests are meant to be running in parallel). As
> -	 * it imposes no constraint, and is effectively a no-op, we do not
> -	 * check below as normal execution flows are checked extensively above.
> -	 *
> -	 * XXX Is the degenerate handling of signaled submit fences the
> -	 * expected behaviour for userpace?
> -	 */
> -
> -	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> -
> -	if (igt_spinner_init(&spin, gt))
> -		return -ENOMEM;
> -
> -	err = 0;
> -	rq[0] = ERR_PTR(-ENOMEM);
> -	for_each_engine(master, gt, id) {
> -		struct i915_sw_fence fence = {};
> -		struct intel_context *ce;
> -
> -		if (master->class == class)
> -			continue;
> -
> -		ce = intel_context_create(master);
> -		if (IS_ERR(ce)) {
> -			err = PTR_ERR(ce);
> -			goto out;
> -		}
> -
> -		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> -
> -		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> -		intel_context_put(ce);
> -		if (IS_ERR(rq[0])) {
> -			err = PTR_ERR(rq[0]);
> -			goto out;
> -		}
> -		i915_request_get(rq[0]);
> -
> -		if (flags & BOND_SCHEDULE) {
> -			onstack_fence_init(&fence);
> -			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> -							       &fence,
> -							       GFP_KERNEL);
> -		}
> -
> -		i915_request_add(rq[0]);
> -		if (err < 0)
> -			goto out;
> -
> -		if (!(flags & BOND_SCHEDULE) &&
> -		    !igt_wait_for_spinner(&spin, rq[0])) {
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			struct intel_context *ve;
> -
> -			ve = intel_execlists_create_virtual(siblings, nsibling);
> -			if (IS_ERR(ve)) {
> -				err = PTR_ERR(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_virtual_engine_attach_bond(ve->engine,
> -							       master,
> -							       siblings[n]);
> -			if (err) {
> -				intel_context_put(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_context_pin(ve);
> -			intel_context_put(ve);
> -			if (err) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			rq[n + 1] = i915_request_create(ve);
> -			intel_context_unpin(ve);
> -			if (IS_ERR(rq[n + 1])) {
> -				err = PTR_ERR(rq[n + 1]);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -			i915_request_get(rq[n + 1]);
> -
> -			err = i915_request_await_execution(rq[n + 1],
> -							   &rq[0]->fence,
> -							   ve->engine->bond_execute);
> -			i915_request_add(rq[n + 1]);
> -			if (err < 0) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -		}
> -		onstack_fence_fini(&fence);
> -		intel_engine_flush_submission(master);
> -		igt_spinner_end(&spin);
> -
> -		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> -			pr_err("Master request did not execute (on %s)!\n",
> -			       rq[0]->engine->name);
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			if (i915_request_wait(rq[n + 1], 0,
> -					      MAX_SCHEDULE_TIMEOUT) < 0) {
> -				err = -EIO;
> -				goto out;
> -			}
> -
> -			if (rq[n + 1]->engine != siblings[n]) {
> -				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> -				       siblings[n]->name,
> -				       rq[n + 1]->engine->name,
> -				       rq[0]->engine->name);
> -				err = -EINVAL;
> -				goto out;
> -			}
> -		}
> -
> -		for (n = 0; !IS_ERR(rq[n]); n++)
> -			i915_request_put(rq[n]);
> -		rq[0] = ERR_PTR(-ENOMEM);
> -	}
> -
> -out:
> -	for (n = 0; !IS_ERR(rq[n]); n++)
> -		i915_request_put(rq[n]);
> -	if (igt_flush_test(gt->i915))
> -		err = -EIO;
> -
> -	igt_spinner_fini(&spin);
> -	return err;
> -}
> -
> -static int live_virtual_bond(void *arg)
> -{
> -	static const struct phase {
> -		const char *name;
> -		unsigned int flags;
> -	} phases[] = {
> -		{ "", 0 },
> -		{ "schedule", BOND_SCHEDULE },
> -		{ },
> -	};
> -	struct intel_gt *gt = arg;
> -	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> -	unsigned int class;
> -	int err;
> -
> -	if (intel_uc_uses_guc_submission(&gt->uc))
> -		return 0;
> -
> -	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> -		const struct phase *p;
> -		int nsibling;
> -
> -		nsibling = select_siblings(gt, class, siblings);
> -		if (nsibling < 2)
> -			continue;
> -
> -		for (p = phases; p->name; p++) {
> -			err = bond_virtual_engine(gt,
> -						  class, siblings, nsibling,
> -						  p->flags);
> -			if (err) {
> -				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> -				       __func__, p->name, class, nsibling, err);
> -				return err;
> -			}
> -		}
> -	}
> -
> -	return 0;
> -}
> -
>  static int reset_virtual_engine(struct intel_gt *gt,
>  				struct intel_engine_cs **siblings,
>  				unsigned int nsibling)
> @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
>  		SUBTEST(live_virtual_mask),
>  		SUBTEST(live_virtual_preserved),
>  		SUBTEST(live_virtual_slice),
> -		SUBTEST(live_virtual_bond),
>  		SUBTEST(live_virtual_reset),
>  	};
>  
> -- 
> 2.31.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 01/21] drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE
  2021-04-27  9:32     ` [Intel-gfx] " Daniel Vetter
@ 2021-04-28  3:33       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28  3:33 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Tue, Apr 27, 2021 at 4:32 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Apr 23, 2021 at 05:31:11PM -0500, Jason Ekstrand wrote:
> > This reverts commit 88be76cdafc7 ("drm/i915: Allow userspace to specify
> > ringsize on construction").  This API was originally added for OpenCL
> > but the compute-runtime PR has sat open for a year without action so we
> > can still pull it out if we want.  I argue we should drop it for three
> > reasons:
> >
> >  1. If the compute-runtime PR has sat open for a year, this clearly
> >     isn't that important.
> >
> >  2. It's a very leaky API.  Ring size is an implementation detail of the
> >     current execlist scheduler and really only makes sense there.  It
> >     can't apply to the older ring-buffer scheduler on pre-execlist
> >     hardware because that's shared across all contexts and it won't
> >     apply to the GuC scheduler that's in the pipeline.
> >
> >  3. Having userspace set a ring size in bytes is a bad solution to the
> >     problem of having too small a ring.  There is no way that userspace
> >     has the information to know how to properly set the ring size so
> >     it's just going to detect the feature and always set it to the
> >     maximum of 512K.  This is what the compute-runtime PR does.  The
> >     scheduler in i915, on the other hand, does have the information to
> >     make an informed choice.  It could detect if the ring size is a
> >     problem and grow it itself.  Or, if that's too hard, we could just
> >     increase the default size from 16K to 32K or even 64K instead of
> >     relying on userspace to do it.
> >
> > Let's drop this API for now and, if someone decides they really care
> > about solving this problem, they can do it properly.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>
> Two things:
> - I'm assuming you have an igt change to make sure we get EINVAL for both
>   set and getparam now? Just to make sure.

I've written up some quick tests.  I'll send them out in the next
version of the IGT series or as a separate series if that one gets
reviewed without comment (unlikely).

> - intel_context->ring is either a ring pointer when CONTEXT_ALLOC_BIT is
>   set in ce->flags, or the size of the ring stored in the pointer if not.
>   I'm seriously hoping you get rid of this complexity with your
>   proto-context series, and also delete __intel_context_ring_size() in the
>   end. That function has no business existing imo.

I hadn't done that yet, no.  But I typed up a patch today which I'll
send out with the next version of this series which does this.

--Jason

>   If not, please make sure that's the case.
>
> Aside from these, the patch looks good.
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> > ---
> >  drivers/gpu/drm/i915/Makefile                 |  1 -
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 85 +------------------
> >  drivers/gpu/drm/i915/gt/intel_context_param.c | 63 --------------
> >  drivers/gpu/drm/i915/gt/intel_context_param.h |  3 -
> >  include/uapi/drm/i915_drm.h                   | 20 +----
> >  5 files changed, 4 insertions(+), 168 deletions(-)
> >  delete mode 100644 drivers/gpu/drm/i915/gt/intel_context_param.c
> >
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index d0d936d9137bc..afa22338fa343 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -88,7 +88,6 @@ gt-y += \
> >       gt/gen8_ppgtt.o \
> >       gt/intel_breadcrumbs.o \
> >       gt/intel_context.o \
> > -     gt/intel_context_param.o \
> >       gt/intel_context_sseu.o \
> >       gt/intel_engine_cs.o \
> >       gt/intel_engine_heartbeat.o \
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index fd8ee52e17a47..e52b85b8f923d 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1335,63 +1335,6 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
> >       return err;
> >  }
> >
> > -static int __apply_ringsize(struct intel_context *ce, void *sz)
> > -{
> > -     return intel_context_set_ring_size(ce, (unsigned long)sz);
> > -}
> > -
> > -static int set_ringsize(struct i915_gem_context *ctx,
> > -                     struct drm_i915_gem_context_param *args)
> > -{
> > -     if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
> > -             return -ENODEV;
> > -
> > -     if (args->size)
> > -             return -EINVAL;
> > -
> > -     if (!IS_ALIGNED(args->value, I915_GTT_PAGE_SIZE))
> > -             return -EINVAL;
> > -
> > -     if (args->value < I915_GTT_PAGE_SIZE)
> > -             return -EINVAL;
> > -
> > -     if (args->value > 128 * I915_GTT_PAGE_SIZE)
> > -             return -EINVAL;
> > -
> > -     return context_apply_all(ctx,
> > -                              __apply_ringsize,
> > -                              __intel_context_ring_size(args->value));
> > -}
> > -
> > -static int __get_ringsize(struct intel_context *ce, void *arg)
> > -{
> > -     long sz;
> > -
> > -     sz = intel_context_get_ring_size(ce);
> > -     GEM_BUG_ON(sz > INT_MAX);
> > -
> > -     return sz; /* stop on first engine */
> > -}
> > -
> > -static int get_ringsize(struct i915_gem_context *ctx,
> > -                     struct drm_i915_gem_context_param *args)
> > -{
> > -     int sz;
> > -
> > -     if (!HAS_LOGICAL_RING_CONTEXTS(ctx->i915))
> > -             return -ENODEV;
> > -
> > -     if (args->size)
> > -             return -EINVAL;
> > -
> > -     sz = context_apply_all(ctx, __get_ringsize, NULL);
> > -     if (sz < 0)
> > -             return sz;
> > -
> > -     args->value = sz;
> > -     return 0;
> > -}
> > -
> >  int
> >  i915_gem_user_to_context_sseu(struct intel_gt *gt,
> >                             const struct drm_i915_gem_context_param_sseu *user,
> > @@ -2037,11 +1980,8 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
> >               ret = set_persistence(ctx, args);
> >               break;
> >
> > -     case I915_CONTEXT_PARAM_RINGSIZE:
> > -             ret = set_ringsize(ctx, args);
> > -             break;
> > -
> >       case I915_CONTEXT_PARAM_BAN_PERIOD:
> > +     case I915_CONTEXT_PARAM_RINGSIZE:
> >       default:
> >               ret = -EINVAL;
> >               break;
> > @@ -2069,18 +2009,6 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
> >       return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
> >  }
> >
> > -static int copy_ring_size(struct intel_context *dst,
> > -                       struct intel_context *src)
> > -{
> > -     long sz;
> > -
> > -     sz = intel_context_get_ring_size(src);
> > -     if (sz < 0)
> > -             return sz;
> > -
> > -     return intel_context_set_ring_size(dst, sz);
> > -}
> > -
> >  static int clone_engines(struct i915_gem_context *dst,
> >                        struct i915_gem_context *src)
> >  {
> > @@ -2125,12 +2053,6 @@ static int clone_engines(struct i915_gem_context *dst,
> >               }
> >
> >               intel_context_set_gem(clone->engines[n], dst);
> > -
> > -             /* Copy across the preferred ringsize */
> > -             if (copy_ring_size(clone->engines[n], e->engines[n])) {
> > -                     __free_engines(clone, n + 1);
> > -                     goto err_unlock;
> > -             }
> >       }
> >       clone->num_engines = n;
> >       i915_sw_fence_complete(&e->fence);
> > @@ -2490,11 +2412,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
> >               args->value = i915_gem_context_is_persistent(ctx);
> >               break;
> >
> > -     case I915_CONTEXT_PARAM_RINGSIZE:
> > -             ret = get_ringsize(ctx, args);
> > -             break;
> > -
> >       case I915_CONTEXT_PARAM_BAN_PERIOD:
> > +     case I915_CONTEXT_PARAM_RINGSIZE:
> >       default:
> >               ret = -EINVAL;
> >               break;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.c b/drivers/gpu/drm/i915/gt/intel_context_param.c
> > deleted file mode 100644
> > index 65dcd090245d6..0000000000000
> > --- a/drivers/gpu/drm/i915/gt/intel_context_param.c
> > +++ /dev/null
> > @@ -1,63 +0,0 @@
> > -// SPDX-License-Identifier: MIT
> > -/*
> > - * Copyright © 2019 Intel Corporation
> > - */
> > -
> > -#include "i915_active.h"
> > -#include "intel_context.h"
> > -#include "intel_context_param.h"
> > -#include "intel_ring.h"
> > -
> > -int intel_context_set_ring_size(struct intel_context *ce, long sz)
> > -{
> > -     int err;
> > -
> > -     if (intel_context_lock_pinned(ce))
> > -             return -EINTR;
> > -
> > -     err = i915_active_wait(&ce->active);
> > -     if (err < 0)
> > -             goto unlock;
> > -
> > -     if (intel_context_is_pinned(ce)) {
> > -             err = -EBUSY; /* In active use, come back later! */
> > -             goto unlock;
> > -     }
> > -
> > -     if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
> > -             struct intel_ring *ring;
> > -
> > -             /* Replace the existing ringbuffer */
> > -             ring = intel_engine_create_ring(ce->engine, sz);
> > -             if (IS_ERR(ring)) {
> > -                     err = PTR_ERR(ring);
> > -                     goto unlock;
> > -             }
> > -
> > -             intel_ring_put(ce->ring);
> > -             ce->ring = ring;
> > -
> > -             /* Context image will be updated on next pin */
> > -     } else {
> > -             ce->ring = __intel_context_ring_size(sz);
> > -     }
> > -
> > -unlock:
> > -     intel_context_unlock_pinned(ce);
> > -     return err;
> > -}
> > -
> > -long intel_context_get_ring_size(struct intel_context *ce)
> > -{
> > -     long sz = (unsigned long)READ_ONCE(ce->ring);
> > -
> > -     if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
> > -             if (intel_context_lock_pinned(ce))
> > -                     return -EINTR;
> > -
> > -             sz = ce->ring->size;
> > -             intel_context_unlock_pinned(ce);
> > -     }
> > -
> > -     return sz;
> > -}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> > index 3ecacc675f414..dffedd983693d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> > @@ -10,9 +10,6 @@
> >
> >  #include "intel_context.h"
> >
> > -int intel_context_set_ring_size(struct intel_context *ce, long sz);
> > -long intel_context_get_ring_size(struct intel_context *ce);
> > -
> >  static inline int
> >  intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
> >  {
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 6a34243a7646a..6eefbc6dec01f 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -1721,24 +1721,8 @@ struct drm_i915_gem_context_param {
> >   */
> >  #define I915_CONTEXT_PARAM_PERSISTENCE       0xb
> >
> > -/*
> > - * I915_CONTEXT_PARAM_RINGSIZE:
> > - *
> > - * Sets the size of the CS ringbuffer to use for logical ring contexts. This
> > - * applies a limit of how many batches can be queued to HW before the caller
> > - * is blocked due to lack of space for more commands.
> > - *
> > - * Only reliably possible to be set prior to first use, i.e. during
> > - * construction. At any later point, the current execution must be flushed as
> > - * the ring can only be changed while the context is idle. Note, the ringsize
> > - * can be specified as a constructor property, see
> > - * I915_CONTEXT_CREATE_EXT_SETPARAM, but can also be set later if required.
> > - *
> > - * Only applies to the current set of engine and lost when those engines
> > - * are replaced by a new mapping (see I915_CONTEXT_PARAM_ENGINES).
> > - *
> > - * Must be between 4 - 512 KiB, in intervals of page size [4 KiB].
> > - * Default is 16 KiB.
> > +/* This API has been removed.  On the off chance someone somewhere has
> > + * attempted to use it, never re-use this context param number.
> >   */
> >  #define I915_CONTEXT_PARAM_RINGSIZE  0xc
> >  /* Must be kept compact -- no holes and well documented */
> > --
> > 2.31.1
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-27 13:51     ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 10:13       ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-28 10:13 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> >
> > This adds a bunch of complexity which the media driver has never
> > actually used.  The media driver does technically bond a balanced engine
> > to another engine but the balanced engine only has one engine in the
> > sibling set.  This doesn't actually result in a virtual engine.
> >
> > Unless some userspace badly wants it, there's no good reason to support
> > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > leave the validation code in place in case we ever decide we want to do
> > something interesting with the bonding information.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> >  6 files changed, 7 insertions(+), 353 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index e8179918fa306..5f8d0faf783aa 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> >         }
> >         virtual = set->engines->engines[idx]->engine;
> >
> > +       if (intel_engine_is_virtual(virtual)) {
> > +               drm_dbg(&i915->drm,
> > +                       "Bonding with virtual engines not allowed\n");
> > +               return -EINVAL;
> > +       }
> > +
> >         err = check_user_mbz(&ext->flags);
> >         if (err)
> >                 return err;
> > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> >                                 n, ci.engine_class, ci.engine_instance);
> >                         return -EINVAL;
> >                 }
> > -
> > -               /*
> > -                * A non-virtual engine has no siblings to choose between; and
> > -                * a submit fence will always be directed to the one engine.
> > -                */
> > -               if (intel_engine_is_virtual(virtual)) {
> > -                       err = intel_virtual_engine_attach_bond(virtual,
> > -                                                              master,
> > -                                                              bond);
> > -                       if (err)
> > -                               return err;
> > -               }
> >         }
> >
> >         return 0;
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index d640bba6ad9ab..efb2fa3522a42 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> >                         err = i915_request_await_execution(eb.request,
> >                                                            in_fence,
> > -                                                          eb.engine->bond_execute);
> > +                                                          NULL);
> >                 else
> >                         err = i915_request_await_dma_fence(eb.request,
> >                                                            in_fence);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index 883bafc449024..68cfe5080325c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> >          */
> >         void            (*submit_request)(struct i915_request *rq);
> >
> > -       /*
> > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > -        * request down to the bonded pairs.
> > -        */
> > -       void            (*bond_execute)(struct i915_request *rq,
> > -                                       struct dma_fence *signal);
> > -
> >         /*
> >          * Call when the priority on a request has changed and it and its
> >          * dependencies may need rescheduling. Note the request itself may
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index de124870af44d..b6e2b59f133b7 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -181,18 +181,6 @@ struct virtual_engine {
> >                 int prio;
> >         } nodes[I915_NUM_ENGINES];
> >
> > -       /*
> > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > -        * of physical engines any particular request may be submitted to.
> > -        * If we receive a submit-fence from a master engine, we will only
> > -        * use one of sibling_mask physical engines.
> > -        */
> > -       struct ve_bond {
> > -               const struct intel_engine_cs *master;
> > -               intel_engine_mask_t sibling_mask;
> > -       } *bonds;
> > -       unsigned int num_bonds;
> > -
> >         /* And finally, which physical engines this virtual engine maps onto. */
> >         unsigned int num_siblings;
> >         struct intel_engine_cs *siblings[];
> > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> >         intel_engine_free_request_pool(&ve->base);
> >
> > -       kfree(ve->bonds);
> >         kfree(ve);
> >  }
> >
> > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> >  }
> >
> > -static struct ve_bond *
> > -virtual_find_bond(struct virtual_engine *ve,
> > -                 const struct intel_engine_cs *master)
> > -{
> > -       int i;
> > -
> > -       for (i = 0; i < ve->num_bonds; i++) {
> > -               if (ve->bonds[i].master == master)
> > -                       return &ve->bonds[i];
> > -       }
> > -
> > -       return NULL;
> > -}
> > -
> > -static void
> > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > -{
> > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > -       intel_engine_mask_t allowed, exec;
> > -       struct ve_bond *bond;
> > -
> > -       allowed = ~to_request(signal)->engine->mask;
> > -
> > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > -       if (bond)
> > -               allowed &= bond->sibling_mask;
> > -
> > -       /* Restrict the bonded request to run on only the available engines */
> > -       exec = READ_ONCE(rq->execution_mask);
> > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > -               ;
> > -
> > -       /* Prevent the master from being re-run on the bonded engines */
> > -       to_request(signal)->execution_mask &= ~allowed;
> 
> I sent a v2 of this patch because it turns out I deleted a bit too
> much code.  This function in particular has to stay, unfortunately.
> When a batch is submitted with a SUBMIT_FENCE, this is used to push
> the work onto a different engine than the one it's supposed to run
> in parallel with.  This means we can't dead-code this function or
> the bond_execute function pointer and related stuff.

Uh that's disappointing, since if I understand your point correctly, the
sibling engines should all be singletons, not load balancing virtual ones.
So there really should not be any need to pick the right one at execution
time.

At least my understanding is that we're only limiting the engine set
further, so if both signaller and signalled request can only run on
singletons (which must be distinct, or the bonded parameter validation is
busted) there's really nothing to do here.

Also this is the locking code that freaks me out about the current bonded
execlist code ...

Dazzled and confused.
-Daniel

> 
> --Jason
> 
> 
> > -}
> > -
> >  struct intel_context *
> >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >                                unsigned int count)
> > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >
> >         ve->base.schedule = i915_schedule;
> >         ve->base.submit_request = virtual_submit_request;
> > -       ve->base.bond_execute = virtual_bond_execute;
> >
> >         INIT_LIST_HEAD(virtual_queue(ve));
> >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> >         if (IS_ERR(dst))
> >                 return dst;
> >
> > -       if (se->num_bonds) {
> > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > -
> > -               de->bonds = kmemdup(se->bonds,
> > -                                   sizeof(*se->bonds) * se->num_bonds,
> > -                                   GFP_KERNEL);
> > -               if (!de->bonds) {
> > -                       intel_context_put(dst);
> > -                       return ERR_PTR(-ENOMEM);
> > -               }
> > -
> > -               de->num_bonds = se->num_bonds;
> > -       }
> > -
> >         return dst;
> >  }
> >
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -                                    const struct intel_engine_cs *master,
> > -                                    const struct intel_engine_cs *sibling)
> > -{
> > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > -       struct ve_bond *bond;
> > -       int n;
> > -
> > -       /* Sanity check the sibling is part of the virtual engine */
> > -       for (n = 0; n < ve->num_siblings; n++)
> > -               if (sibling == ve->siblings[n])
> > -                       break;
> > -       if (n == ve->num_siblings)
> > -               return -EINVAL;
> > -
> > -       bond = virtual_find_bond(ve, master);
> > -       if (bond) {
> > -               bond->sibling_mask |= sibling->mask;
> > -               return 0;
> > -       }
> > -
> > -       bond = krealloc(ve->bonds,
> > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > -                       GFP_KERNEL);
> > -       if (!bond)
> > -               return -ENOMEM;
> > -
> > -       bond[ve->num_bonds].master = master;
> > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > -
> > -       ve->bonds = bond;
> > -       ve->num_bonds++;
> > -
> > -       return 0;
> > -}
> > -
> >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> >                                    struct drm_printer *m,
> >                                    void (*show_request)(struct drm_printer *m,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > index fd61dae820e9e..80cec37a56ba9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >  struct intel_context *
> >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> >
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -                                    const struct intel_engine_cs *master,
> > -                                    const struct intel_engine_cs *sibling);
> > -
> >  bool
> >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> >
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > index 1081cd36a2bd3..f03446d587160 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> >         return 0;
> >  }
> >
> > -static int bond_virtual_engine(struct intel_gt *gt,
> > -                              unsigned int class,
> > -                              struct intel_engine_cs **siblings,
> > -                              unsigned int nsibling,
> > -                              unsigned int flags)
> > -#define BOND_SCHEDULE BIT(0)
> > -{
> > -       struct intel_engine_cs *master;
> > -       struct i915_request *rq[16];
> > -       enum intel_engine_id id;
> > -       struct igt_spinner spin;
> > -       unsigned long n;
> > -       int err;
> > -
> > -       /*
> > -        * A set of bonded requests is intended to be run concurrently
> > -        * across a number of engines. We use one request per-engine
> > -        * and a magic fence to schedule each of the bonded requests
> > -        * at the same time. A consequence of our current scheduler is that
> > -        * we only move requests to the HW ready queue when the request
> > -        * becomes ready, that is when all of its prerequisite fences have
> > -        * been signaled. As one of those fences is the master submit fence,
> > -        * there is a delay on all secondary fences as the HW may be
> > -        * currently busy. Equally, as all the requests are independent,
> > -        * they may have other fences that delay individual request
> > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > -        * immediately submitted to HW at the same time, just that if the
> > -        * rules are abided by, they are ready at the same time as the
> > -        * first is submitted. Userspace can embed semaphores in its batch
> > -        * to ensure parallel execution of its phases as it requires.
> > -        * Though naturally it gets requested that perhaps the scheduler should
> > -        * take care of parallel execution, even across preemption events on
> > -        * different HW. (The proper answer is of course "lalalala".)
> > -        *
> > -        * With the submit-fence, we have identified three possible phases
> > -        * of synchronisation depending on the master fence: queued (not
> > -        * ready), executing, and signaled. The first two are quite simple
> > -        * and checked below. However, the signaled master fence handling is
> > -        * contentious. Currently we do not distinguish between a signaled
> > -        * fence and an expired fence, as once signaled it does not convey
> > -        * any information about the previous execution. It may even be freed
> > -        * and hence checking later it may not exist at all. Ergo we currently
> > -        * do not apply the bonding constraint for an already signaled fence,
> > -        * as our expectation is that it should not constrain the secondaries
> > -        * and is outside of the scope of the bonded request API (i.e. all
> > -        * userspace requests are meant to be running in parallel). As
> > -        * it imposes no constraint, and is effectively a no-op, we do not
> > -        * check below as normal execution flows are checked extensively above.
> > -        *
> > -        * XXX Is the degenerate handling of signaled submit fences the
> > -        * expected behaviour for userpace?
> > -        */
> > -
> > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > -
> > -       if (igt_spinner_init(&spin, gt))
> > -               return -ENOMEM;
> > -
> > -       err = 0;
> > -       rq[0] = ERR_PTR(-ENOMEM);
> > -       for_each_engine(master, gt, id) {
> > -               struct i915_sw_fence fence = {};
> > -               struct intel_context *ce;
> > -
> > -               if (master->class == class)
> > -                       continue;
> > -
> > -               ce = intel_context_create(master);
> > -               if (IS_ERR(ce)) {
> > -                       err = PTR_ERR(ce);
> > -                       goto out;
> > -               }
> > -
> > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > -
> > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > -               intel_context_put(ce);
> > -               if (IS_ERR(rq[0])) {
> > -                       err = PTR_ERR(rq[0]);
> > -                       goto out;
> > -               }
> > -               i915_request_get(rq[0]);
> > -
> > -               if (flags & BOND_SCHEDULE) {
> > -                       onstack_fence_init(&fence);
> > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > -                                                              &fence,
> > -                                                              GFP_KERNEL);
> > -               }
> > -
> > -               i915_request_add(rq[0]);
> > -               if (err < 0)
> > -                       goto out;
> > -
> > -               if (!(flags & BOND_SCHEDULE) &&
> > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > -                       err = -EIO;
> > -                       goto out;
> > -               }
> > -
> > -               for (n = 0; n < nsibling; n++) {
> > -                       struct intel_context *ve;
> > -
> > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > -                       if (IS_ERR(ve)) {
> > -                               err = PTR_ERR(ve);
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -
> > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > -                                                              master,
> > -                                                              siblings[n]);
> > -                       if (err) {
> > -                               intel_context_put(ve);
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -
> > -                       err = intel_context_pin(ve);
> > -                       intel_context_put(ve);
> > -                       if (err) {
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -
> > -                       rq[n + 1] = i915_request_create(ve);
> > -                       intel_context_unpin(ve);
> > -                       if (IS_ERR(rq[n + 1])) {
> > -                               err = PTR_ERR(rq[n + 1]);
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -                       i915_request_get(rq[n + 1]);
> > -
> > -                       err = i915_request_await_execution(rq[n + 1],
> > -                                                          &rq[0]->fence,
> > -                                                          ve->engine->bond_execute);
> > -                       i915_request_add(rq[n + 1]);
> > -                       if (err < 0) {
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -               }
> > -               onstack_fence_fini(&fence);
> > -               intel_engine_flush_submission(master);
> > -               igt_spinner_end(&spin);
> > -
> > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > -                       pr_err("Master request did not execute (on %s)!\n",
> > -                              rq[0]->engine->name);
> > -                       err = -EIO;
> > -                       goto out;
> > -               }
> > -
> > -               for (n = 0; n < nsibling; n++) {
> > -                       if (i915_request_wait(rq[n + 1], 0,
> > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > -                               err = -EIO;
> > -                               goto out;
> > -                       }
> > -
> > -                       if (rq[n + 1]->engine != siblings[n]) {
> > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > -                                      siblings[n]->name,
> > -                                      rq[n + 1]->engine->name,
> > -                                      rq[0]->engine->name);
> > -                               err = -EINVAL;
> > -                               goto out;
> > -                       }
> > -               }
> > -
> > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > -                       i915_request_put(rq[n]);
> > -               rq[0] = ERR_PTR(-ENOMEM);
> > -       }
> > -
> > -out:
> > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > -               i915_request_put(rq[n]);
> > -       if (igt_flush_test(gt->i915))
> > -               err = -EIO;
> > -
> > -       igt_spinner_fini(&spin);
> > -       return err;
> > -}
> > -
> > -static int live_virtual_bond(void *arg)
> > -{
> > -       static const struct phase {
> > -               const char *name;
> > -               unsigned int flags;
> > -       } phases[] = {
> > -               { "", 0 },
> > -               { "schedule", BOND_SCHEDULE },
> > -               { },
> > -       };
> > -       struct intel_gt *gt = arg;
> > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > -       unsigned int class;
> > -       int err;
> > -
> > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > -               return 0;
> > -
> > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > -               const struct phase *p;
> > -               int nsibling;
> > -
> > -               nsibling = select_siblings(gt, class, siblings);
> > -               if (nsibling < 2)
> > -                       continue;
> > -
> > -               for (p = phases; p->name; p++) {
> > -                       err = bond_virtual_engine(gt,
> > -                                                 class, siblings, nsibling,
> > -                                                 p->flags);
> > -                       if (err) {
> > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > -                                      __func__, p->name, class, nsibling, err);
> > -                               return err;
> > -                       }
> > -               }
> > -       }
> > -
> > -       return 0;
> > -}
> > -
> >  static int reset_virtual_engine(struct intel_gt *gt,
> >                                 struct intel_engine_cs **siblings,
> >                                 unsigned int nsibling)
> > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> >                 SUBTEST(live_virtual_mask),
> >                 SUBTEST(live_virtual_preserved),
> >                 SUBTEST(live_virtual_slice),
> > -               SUBTEST(live_virtual_bond),
> >                 SUBTEST(live_virtual_reset),
> >         };
> >
> > --
> > 2.31.1
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

> > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > -               ;
> > -
> > -       /* Prevent the master from being re-run on the bonded engines */
> > -       to_request(signal)->execution_mask &= ~allowed;
> 
> I sent a v2 of this patch because it turns out I deleted a bit too
> much code.  This function, in particular, has to stay, unfortunately.
> When a batch is submitted with a SUBMIT_FENCE, this is used to push
> the work onto a different engine than the one it's supposed to run
> in parallel with.  This means we can't dead-code this function or
> the bond_execute function pointer and related stuff.

Uh, that's disappointing, since if I understand your point correctly, the
sibling engines should all be singletons, not load-balancing virtual ones.
So there really should not be any need to pick the right one at execution
time.

At least my understanding is that we're only limiting the engine set
further, so if both the signaller and the signalled request can only run on
singletons (which must be distinct, or the bonded parameter validation is
busted), there's really nothing to do here.

Also this is the locking code that freaks me out about the current bonded
execlist code ...

Dazzled and confused.
-Daniel

> 
> --Jason
> 
> 
> > -}
> > -
> >  struct intel_context *
> >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >                                unsigned int count)
> > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >
> >         ve->base.schedule = i915_schedule;
> >         ve->base.submit_request = virtual_submit_request;
> > -       ve->base.bond_execute = virtual_bond_execute;
> >
> >         INIT_LIST_HEAD(virtual_queue(ve));
> >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> >         if (IS_ERR(dst))
> >                 return dst;
> >
> > -       if (se->num_bonds) {
> > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > -
> > -               de->bonds = kmemdup(se->bonds,
> > -                                   sizeof(*se->bonds) * se->num_bonds,
> > -                                   GFP_KERNEL);
> > -               if (!de->bonds) {
> > -                       intel_context_put(dst);
> > -                       return ERR_PTR(-ENOMEM);
> > -               }
> > -
> > -               de->num_bonds = se->num_bonds;
> > -       }
> > -
> >         return dst;
> >  }
> >
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -                                    const struct intel_engine_cs *master,
> > -                                    const struct intel_engine_cs *sibling)
> > -{
> > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > -       struct ve_bond *bond;
> > -       int n;
> > -
> > -       /* Sanity check the sibling is part of the virtual engine */
> > -       for (n = 0; n < ve->num_siblings; n++)
> > -               if (sibling == ve->siblings[n])
> > -                       break;
> > -       if (n == ve->num_siblings)
> > -               return -EINVAL;
> > -
> > -       bond = virtual_find_bond(ve, master);
> > -       if (bond) {
> > -               bond->sibling_mask |= sibling->mask;
> > -               return 0;
> > -       }
> > -
> > -       bond = krealloc(ve->bonds,
> > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > -                       GFP_KERNEL);
> > -       if (!bond)
> > -               return -ENOMEM;
> > -
> > -       bond[ve->num_bonds].master = master;
> > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > -
> > -       ve->bonds = bond;
> > -       ve->num_bonds++;
> > -
> > -       return 0;
> > -}
> > -
> >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> >                                    struct drm_printer *m,
> >                                    void (*show_request)(struct drm_printer *m,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > index fd61dae820e9e..80cec37a56ba9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >  struct intel_context *
> >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> >
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -                                    const struct intel_engine_cs *master,
> > -                                    const struct intel_engine_cs *sibling);
> > -
> >  bool
> >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> >
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > index 1081cd36a2bd3..f03446d587160 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> >         return 0;
> >  }
> >
> > -static int bond_virtual_engine(struct intel_gt *gt,
> > -                              unsigned int class,
> > -                              struct intel_engine_cs **siblings,
> > -                              unsigned int nsibling,
> > -                              unsigned int flags)
> > -#define BOND_SCHEDULE BIT(0)
> > -{
> > -       struct intel_engine_cs *master;
> > -       struct i915_request *rq[16];
> > -       enum intel_engine_id id;
> > -       struct igt_spinner spin;
> > -       unsigned long n;
> > -       int err;
> > -
> > -       /*
> > -        * A set of bonded requests is intended to be run concurrently
> > -        * across a number of engines. We use one request per-engine
> > -        * and a magic fence to schedule each of the bonded requests
> > -        * at the same time. A consequence of our current scheduler is that
> > -        * we only move requests to the HW ready queue when the request
> > -        * becomes ready, that is when all of its prerequisite fences have
> > -        * been signaled. As one of those fences is the master submit fence,
> > -        * there is a delay on all secondary fences as the HW may be
> > -        * currently busy. Equally, as all the requests are independent,
> > -        * they may have other fences that delay individual request
> > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > -        * immediately submitted to HW at the same time, just that if the
> > -        * rules are abided by, they are ready at the same time as the
> > -        * first is submitted. Userspace can embed semaphores in its batch
> > -        * to ensure parallel execution of its phases as it requires.
> > -        * Though naturally it gets requested that perhaps the scheduler should
> > -        * take care of parallel execution, even across preemption events on
> > -        * different HW. (The proper answer is of course "lalalala".)
> > -        *
> > -        * With the submit-fence, we have identified three possible phases
> > -        * of synchronisation depending on the master fence: queued (not
> > -        * ready), executing, and signaled. The first two are quite simple
> > -        * and checked below. However, the signaled master fence handling is
> > -        * contentious. Currently we do not distinguish between a signaled
> > -        * fence and an expired fence, as once signaled it does not convey
> > -        * any information about the previous execution. It may even be freed
> > -        * and hence checking later it may not exist at all. Ergo we currently
> > -        * do not apply the bonding constraint for an already signaled fence,
> > -        * as our expectation is that it should not constrain the secondaries
> > -        * and is outside of the scope of the bonded request API (i.e. all
> > -        * userspace requests are meant to be running in parallel). As
> > -        * it imposes no constraint, and is effectively a no-op, we do not
> > -        * check below as normal execution flows are checked extensively above.
> > -        *
> > -        * XXX Is the degenerate handling of signaled submit fences the
> > -        * expected behaviour for userpace?
> > -        */
> > -
> > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > -
> > -       if (igt_spinner_init(&spin, gt))
> > -               return -ENOMEM;
> > -
> > -       err = 0;
> > -       rq[0] = ERR_PTR(-ENOMEM);
> > -       for_each_engine(master, gt, id) {
> > -               struct i915_sw_fence fence = {};
> > -               struct intel_context *ce;
> > -
> > -               if (master->class == class)
> > -                       continue;
> > -
> > -               ce = intel_context_create(master);
> > -               if (IS_ERR(ce)) {
> > -                       err = PTR_ERR(ce);
> > -                       goto out;
> > -               }
> > -
> > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > -
> > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > -               intel_context_put(ce);
> > -               if (IS_ERR(rq[0])) {
> > -                       err = PTR_ERR(rq[0]);
> > -                       goto out;
> > -               }
> > -               i915_request_get(rq[0]);
> > -
> > -               if (flags & BOND_SCHEDULE) {
> > -                       onstack_fence_init(&fence);
> > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > -                                                              &fence,
> > -                                                              GFP_KERNEL);
> > -               }
> > -
> > -               i915_request_add(rq[0]);
> > -               if (err < 0)
> > -                       goto out;
> > -
> > -               if (!(flags & BOND_SCHEDULE) &&
> > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > -                       err = -EIO;
> > -                       goto out;
> > -               }
> > -
> > -               for (n = 0; n < nsibling; n++) {
> > -                       struct intel_context *ve;
> > -
> > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > -                       if (IS_ERR(ve)) {
> > -                               err = PTR_ERR(ve);
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -
> > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > -                                                              master,
> > -                                                              siblings[n]);
> > -                       if (err) {
> > -                               intel_context_put(ve);
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -
> > -                       err = intel_context_pin(ve);
> > -                       intel_context_put(ve);
> > -                       if (err) {
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -
> > -                       rq[n + 1] = i915_request_create(ve);
> > -                       intel_context_unpin(ve);
> > -                       if (IS_ERR(rq[n + 1])) {
> > -                               err = PTR_ERR(rq[n + 1]);
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -                       i915_request_get(rq[n + 1]);
> > -
> > -                       err = i915_request_await_execution(rq[n + 1],
> > -                                                          &rq[0]->fence,
> > -                                                          ve->engine->bond_execute);
> > -                       i915_request_add(rq[n + 1]);
> > -                       if (err < 0) {
> > -                               onstack_fence_fini(&fence);
> > -                               goto out;
> > -                       }
> > -               }
> > -               onstack_fence_fini(&fence);
> > -               intel_engine_flush_submission(master);
> > -               igt_spinner_end(&spin);
> > -
> > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > -                       pr_err("Master request did not execute (on %s)!\n",
> > -                              rq[0]->engine->name);
> > -                       err = -EIO;
> > -                       goto out;
> > -               }
> > -
> > -               for (n = 0; n < nsibling; n++) {
> > -                       if (i915_request_wait(rq[n + 1], 0,
> > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > -                               err = -EIO;
> > -                               goto out;
> > -                       }
> > -
> > -                       if (rq[n + 1]->engine != siblings[n]) {
> > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > -                                      siblings[n]->name,
> > -                                      rq[n + 1]->engine->name,
> > -                                      rq[0]->engine->name);
> > -                               err = -EINVAL;
> > -                               goto out;
> > -                       }
> > -               }
> > -
> > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > -                       i915_request_put(rq[n]);
> > -               rq[0] = ERR_PTR(-ENOMEM);
> > -       }
> > -
> > -out:
> > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > -               i915_request_put(rq[n]);
> > -       if (igt_flush_test(gt->i915))
> > -               err = -EIO;
> > -
> > -       igt_spinner_fini(&spin);
> > -       return err;
> > -}
> > -
> > -static int live_virtual_bond(void *arg)
> > -{
> > -       static const struct phase {
> > -               const char *name;
> > -               unsigned int flags;
> > -       } phases[] = {
> > -               { "", 0 },
> > -               { "schedule", BOND_SCHEDULE },
> > -               { },
> > -       };
> > -       struct intel_gt *gt = arg;
> > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > -       unsigned int class;
> > -       int err;
> > -
> > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > -               return 0;
> > -
> > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > -               const struct phase *p;
> > -               int nsibling;
> > -
> > -               nsibling = select_siblings(gt, class, siblings);
> > -               if (nsibling < 2)
> > -                       continue;
> > -
> > -               for (p = phases; p->name; p++) {
> > -                       err = bond_virtual_engine(gt,
> > -                                                 class, siblings, nsibling,
> > -                                                 p->flags);
> > -                       if (err) {
> > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > -                                      __func__, p->name, class, nsibling, err);
> > -                               return err;
> > -                       }
> > -               }
> > -       }
> > -
> > -       return 0;
> > -}
> > -
> >  static int reset_virtual_engine(struct intel_gt *gt,
> >                                 struct intel_engine_cs **siblings,
> >                                 unsigned int nsibling)
> > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> >                 SUBTEST(live_virtual_mask),
> >                 SUBTEST(live_virtual_preserved),
> >                 SUBTEST(live_virtual_slice),
> > -               SUBTEST(live_virtual_bond),
> >                 SUBTEST(live_virtual_reset),
> >         };
> >
> > --
> > 2.31.1
> >
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 10:16     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-28 10:16 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
> There's no sense in allowing userspace to create more engines than it
> can possibly access via execbuf.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 5f8d0faf783aa..ecb3bf5369857 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
>  		return -EINVAL;
>  	}
>  
> -	/*
> -	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
> -	 * first 64 engines defined here.
> -	 */
>  	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);

Maybe add a comment like /* RING_MASK has no shift, so can be used
directly here */ since I had to check that :-)

Same story about igt testcases being needed, just to be sure.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> +	if (num_engines > I915_EXEC_RING_MASK + 1)
> +		return -EINVAL;
> +
>  	set.engines = alloc_engines(num_engines);
>  	if (!set.engines)
>  		return -ENOMEM;
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch



* Re: [PATCH 11/21] drm/i915: Stop manually RCU banging in reset_stats_ioctl
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 10:27     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-28 10:27 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:21PM -0500, Jason Ekstrand wrote:
> As far as I can tell, the only real reason for this is to avoid taking a
> reference to the i915_gem_context.  The cost of those two atomics
> probably pales in comparison to the cost of the ioctl itself so we're
> really not buying ourselves anything here.  We're about to make context
> lookup a tiny bit more complicated, so let's get rid of the one hand-
> rolled case.

I think the historical reason here is that i965_brw checks this before
every execbuf call, at least for ARB_robustness contexts with the right
flag. But we've fixed that hotpath problem by adding non-recoverable
contexts. The kernel will now tell you automatically, for proper userspace
at least (I checked iris and anv, assuming I got it correct), and the
reset_stats ioctl isn't a hot path worth micro-optimizing anymore.

With that bit of more context added to the commit message:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++---------
>  drivers/gpu/drm/i915/i915_drv.h             |  8 +-------
>  2 files changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index ecb3bf5369857..941fbf78267b4 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
>  	struct drm_i915_private *i915 = to_i915(dev);
>  	struct drm_i915_reset_stats *args = data;
>  	struct i915_gem_context *ctx;
> -	int ret;
>  
>  	if (args->flags || args->pad)
>  		return -EINVAL;
>  
> -	ret = -ENOENT;
> -	rcu_read_lock();
> -	ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id);
> +	ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
>  	if (!ctx)
> -		goto out;
> +		return -ENOENT;
>  
>  	/*
>  	 * We opt for unserialised reads here. This may result in tearing
> @@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
>  	args->batch_active = atomic_read(&ctx->guilty_count);
>  	args->batch_pending = atomic_read(&ctx->active_count);
>  
> -	ret = 0;
> -out:
> -	rcu_read_unlock();
> -	return ret;
> +	i915_gem_context_put(ctx);
> +	return 0;
>  }
>  
>  /* GEM context-engines iterator: for_each_gem_engine() */
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0b44333eb7033..8571c5c1509a7 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
>  
>  struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
>  
> -static inline struct i915_gem_context *
> -__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id)
> -{
> -	return xa_load(&file_priv->context_xa, id);
> -}
> -
>  static inline struct i915_gem_context *
>  i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
>  {
>  	struct i915_gem_context *ctx;
>  
>  	rcu_read_lock();
> -	ctx = __i915_gem_context_lookup_rcu(file_priv, id);
> +	ctx = xa_load(&file_priv->context_xa, id);
>  	if (ctx && !kref_get_unless_zero(&ctx->ref))
>  		ctx = NULL;
>  	rcu_read_unlock();
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch



* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-28 10:16     ` Daniel Vetter
@ 2021-04-28 10:42       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-28 10:42 UTC (permalink / raw)
  To: Daniel Vetter, Jason Ekstrand; +Cc: intel-gfx, dri-devel


On 28/04/2021 11:16, Daniel Vetter wrote:
> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
>> There's no sense in allowing userspace to create more engines than it
>> can possibly access via execbuf.
>>
>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> index 5f8d0faf783aa..ecb3bf5369857 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
>>   		return -EINVAL;
>>   	}
>>   
>> -	/*
>> -	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
>> -	 * first 64 engines defined here.
>> -	 */
>>   	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> 
> Maybe add a comment like /* RING_MASK has no shift, so it can be used
> directly here */ since I had to check that :-)
> 
> Same story about igt testcases needed, just to be sure.
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I am not sure about the churn vs benefit ratio here. There are also 
patches which extend the engine selection field in execbuf2 over the 
unused constants bits (with an explicit flag). So churn upstream and 
churn in internal (if interesting) for not much benefit.

Regards,

Tvrtko

>> +	if (num_engines > I915_EXEC_RING_MASK + 1)
>> +		return -EINVAL;
>> +
>>   	set.engines = alloc_engines(num_engines);
>>   	if (!set.engines)
>>   		return -ENOMEM;
>> -- 
>> 2.31.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-28 10:42       ` Tvrtko Ursulin
@ 2021-04-28 14:02         ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-28 14:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: dri-devel, intel-gfx, Jason Ekstrand

On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
> 
> On 28/04/2021 11:16, Daniel Vetter wrote:
> > On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
> > > There's no sense in allowing userspace to create more engines than it
> > > can possibly access via execbuf.
> > > 
> > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
> > >   1 file changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index 5f8d0faf783aa..ecb3bf5369857 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
> > >   		return -EINVAL;
> > >   	}
> > > -	/*
> > > -	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
> > > -	 * first 64 engines defined here.
> > > -	 */
> > >   	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> > 
> > Maybe add a comment like /* RING_MASK has no shift, so it can be used
> > directly here */ since I had to check that :-)
> > 
> > Same story about igt testcases needed, just to be sure.
> > 
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> I am not sure about the churn vs benefit ratio here. There are also patches
> which extend the engine selection field in execbuf2 over the unused
> constants bits (with an explicit flag). So churn upstream and churn in
> internal (if interesting) for not much benefit.

This isn't churn.

This is "lock down uapi properly".
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
@ 2021-04-28 14:02         ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-28 14:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: dri-devel, intel-gfx

On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
> 
> On 28/04/2021 11:16, Daniel Vetter wrote:
> > On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
> > > There's no sense in allowing userspace to create more engines than it
> > > can possibly access via execbuf.
> > > 
> > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
> > >   1 file changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index 5f8d0faf783aa..ecb3bf5369857 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
> > >   		return -EINVAL;
> > >   	}
> > > -	/*
> > > -	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
> > > -	 * first 64 engines defined here.
> > > -	 */
> > >   	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> > 
> > Maybe add a comment like /* RING_MASK has not shift, so can be used
> > directly here */ since I had to check that :-)
> > 
> > Same story about igt testcases needed, just to be sure.
> > 
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> I am not sure about the churn vs benefit ratio here. There are also patches
> which extend the engine selection field in execbuf2 over the unused
> constants bits (with an explicit flag). So churn upstream and churn in
> internal (if interesting) for not much benefit.

This isn't churn.

This is "lock done uapi properly".
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-28 14:02         ` Daniel Vetter
@ 2021-04-28 14:26           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-28 14:26 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, dri-devel, Jason Ekstrand


On 28/04/2021 15:02, Daniel Vetter wrote:
> On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
>>
>> On 28/04/2021 11:16, Daniel Vetter wrote:
>>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
>>>> There's no sense in allowing userspace to create more engines than it
>>>> can possibly access via execbuf.
>>>>
>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>>> ---
>>>>    drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
>>>>    1 file changed, 3 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>> index 5f8d0faf783aa..ecb3bf5369857 100644
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
>>>>    		return -EINVAL;
>>>>    	}
>>>> -	/*
>>>> -	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
>>>> -	 * first 64 engines defined here.
>>>> -	 */
>>>>    	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
>>>
>>> Maybe add a comment like /* RING_MASK has no shift, so it can be used
>>> directly here */ since I had to check that :-)
>>>
>>> Same story about igt testcases needed, just to be sure.
>>>
>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>
>> I am not sure about the churn vs benefit ratio here. There are also patches
>> which extend the engine selection field in execbuf2 over the unused
>> constants bits (with an explicit flag). So churn upstream and churn in
>> internal (if interesting) for not much benefit.
> 
> This isn't churn.
> 
> This is "lock down uapi properly".

IMO it is a "meh" patch. It doesn't fix any problem, and it will create 
work for other people, man-hours that no one will ever properly account 
for.

The number of contexts in the engine map should not really be tied to 
execbuf2, as is demonstrated by the incoming work to address more than 
63 engines, either as an extension to execbuf2 or a future execbuf3.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
@ 2021-04-28 14:26           ` Tvrtko Ursulin
  0 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-28 14:26 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, dri-devel


On 28/04/2021 15:02, Daniel Vetter wrote:
> On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
>>
>> On 28/04/2021 11:16, Daniel Vetter wrote:
>>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
>>>> There's no sense in allowing userspace to create more engines than it
>>>> can possibly access via execbuf.
>>>>
>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>>> ---
>>>>    drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
>>>>    1 file changed, 3 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>> index 5f8d0faf783aa..ecb3bf5369857 100644
>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
>>>>    		return -EINVAL;
>>>>    	}
>>>> -	/*
>>>> -	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
>>>> -	 * first 64 engines defined here.
>>>> -	 */
>>>>    	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
>>>
>>> Maybe add a comment like /* RING_MASK has not shift, so can be used
>>> directly here */ since I had to check that :-)
>>>
>>> Same story about igt testcases needed, just to be sure.
>>>
>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>
>> I am not sure about the churn vs benefit ratio here. There are also patches
>> which extend the engine selection field in execbuf2 over the unused
>> constants bits (with an explicit flag). So churn upstream and churn in
>> internal (if interesting) for not much benefit.
> 
> This isn't churn.
> 
> This is "lock done uapi properly".

IMO it is a "meh" patch. Doesn't fix any problems and will create work 
for other people and man hours spent which no one will ever properly 
account against.

Number of contexts in the engine map should not really be tied to 
execbuf2. As is demonstrated by the incoming work to address more than 
63 engines, either as an extension to execbuf2 or future execbuf3.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 12/21] drm/i915/gem: Add a separate validate_priority helper
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 14:37     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-28 14:37 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:22PM -0500, Jason Ekstrand wrote:

Maybe explain that you pull this out because, with the proto-context,
there will be two paths to set this: one for the proto-context and the
other for a context that is already finalized and executing batches?

With that: Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 42 +++++++++++++--------
>  1 file changed, 27 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 941fbf78267b4..e5efd22c89ba2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -169,6 +169,28 @@ lookup_user_engine(struct i915_gem_context *ctx,
>  	return i915_gem_context_get_engine(ctx, idx);
>  }
>  
> +static int validate_priority(struct drm_i915_private *i915,
> +			     const struct drm_i915_gem_context_param *args)
> +{
> +	s64 priority = args->value;
> +
> +	if (args->size)
> +		return -EINVAL;
> +
> +	if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
> +		return -ENODEV;
> +
> +	if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
> +	    priority < I915_CONTEXT_MIN_USER_PRIORITY)
> +		return -EINVAL;
> +
> +	if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
> +	    !capable(CAP_SYS_NICE))
> +		return -EPERM;
> +
> +	return 0;
> +}
> +
>  static struct i915_address_space *
>  context_get_vm_rcu(struct i915_gem_context *ctx)
>  {
> @@ -1744,23 +1766,13 @@ static void __apply_priority(struct intel_context *ce, void *arg)
>  static int set_priority(struct i915_gem_context *ctx,
>  			const struct drm_i915_gem_context_param *args)
>  {
> -	s64 priority = args->value;
> -
> -	if (args->size)
> -		return -EINVAL;
> -
> -	if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
> -		return -ENODEV;
> -
> -	if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
> -	    priority < I915_CONTEXT_MIN_USER_PRIORITY)
> -		return -EINVAL;
> +	int err;
>  
> -	if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
> -	    !capable(CAP_SYS_NICE))
> -		return -EPERM;
> +	err = validate_priority(ctx->i915, args);
> +	if (err)
> +		return err;
>  
> -	ctx->sched.priority = priority;
> +	ctx->sched.priority = args->value;
>  	context_apply_all(ctx, __apply_priority, ctx);
>  
>  	return 0;
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 15:49     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-28 15:49 UTC (permalink / raw)
  To: Jason Ekstrand, intel-gfx, dri-devel


On 23/04/2021 23:31, Jason Ekstrand wrote:
> This API is entirely unnecessary and I'd love to get rid of it.  If
> userspace wants a single timeline across multiple contexts, they can
> either use implicit synchronization or a syncobj, both of which existed
> at the time this feature landed.  The justification given at the time
> was that it would help GL drivers which are inherently single-timeline.
> However, neither of our GL drivers actually wanted the feature.  i965
> was already in maintenance mode at the time and iris uses syncobj for
> everything.
> 
> Unfortunately, as much as I'd love to get rid of it, it is used by the
> media driver so we can't do that.  We can, however, do the next-best
> thing which is to embed a syncobj in the context and do exactly what
> we'd expect from userspace internally.  This isn't an entirely identical
> implementation because it's no longer atomic if userspace races with
> itself by calling execbuffer2 twice simultaneously from different
> threads.  It won't crash in that case; it just doesn't guarantee any
> ordering between those two submits.

1)

Please also mention the difference in context/timeline name when 
observed via the sync file API.

2)

I don't remember what we have concluded in terms of observable effects 
in sync_file_merge?

Regards,

Tvrtko

> Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
> advantages beyond mere annoyance.  One is that intel_timeline is no
> longer an api-visible object and can remain entirely an implementation
> detail.  This may be advantageous as we make scheduler changes going
> forward.  Second is that, together with deleting the CLONE_CONTEXT API,
> we should now have a 1:1 mapping between intel_context and
> intel_timeline which may help us reduce locking.
> 
> v2 (Jason Ekstrand):
>   - Update the comment on i915_gem_context::syncobj to mention that it's
>     an emulation and the possible race if userspace calls execbuffer2
>     twice on the same context concurrently.
>   - Wrap the checks for eb.gem_context->syncobj in unlikely()
>   - Drop the dma_fence reference
>   - Improved commit message
> 
> v3 (Jason Ekstrand):
>   - Move the dma_fence_put() to before the error exit
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +++++--------------
>   .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +++++-
>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 16 ++++++
>   3 files changed, 40 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 2c2fefa912805..a72c9b256723b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -67,6 +67,8 @@
>   #include <linux/log2.h>
>   #include <linux/nospec.h>
>   
> +#include <drm/drm_syncobj.h>
> +
>   #include "gt/gen6_ppgtt.h"
>   #include "gt/intel_context.h"
>   #include "gt/intel_context_param.h"
> @@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context *ce,
>   		ce->vm = vm;
>   	}
>   
> -	GEM_BUG_ON(ce->timeline);
> -	if (ctx->timeline)
> -		ce->timeline = intel_timeline_get(ctx->timeline);
> -
>   	if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
>   	    intel_engine_has_timeslices(ce->engine))
>   		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> @@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
>   	mutex_destroy(&ctx->engines_mutex);
>   	mutex_destroy(&ctx->lut_mutex);
>   
> -	if (ctx->timeline)
> -		intel_timeline_put(ctx->timeline);
> -
>   	put_pid(ctx->pid);
>   	mutex_destroy(&ctx->mutex);
>   
> @@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
>   	if (vm)
>   		i915_vm_close(vm);
>   
> +	if (ctx->syncobj)
> +		drm_syncobj_put(ctx->syncobj);
> +
>   	ctx->file_priv = ERR_PTR(-EBADF);
>   
>   	/*
> @@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
>   		i915_vm_close(vm);
>   }
>   
> -static void __set_timeline(struct intel_timeline **dst,
> -			   struct intel_timeline *src)
> -{
> -	struct intel_timeline *old = *dst;
> -
> -	*dst = src ? intel_timeline_get(src) : NULL;
> -
> -	if (old)
> -		intel_timeline_put(old);
> -}
> -
> -static void __apply_timeline(struct intel_context *ce, void *timeline)
> -{
> -	__set_timeline(&ce->timeline, timeline);
> -}
> -
> -static void __assign_timeline(struct i915_gem_context *ctx,
> -			      struct intel_timeline *timeline)
> -{
> -	__set_timeline(&ctx->timeline, timeline);
> -	context_apply_all(ctx, __apply_timeline, timeline);
> -}
> -
>   static struct i915_gem_context *
>   i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>   {
>   	struct i915_gem_context *ctx;
> +	int ret;
>   
>   	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
>   	    !HAS_EXECLISTS(i915))
> @@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>   	}
>   
>   	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> -		struct intel_timeline *timeline;
> -
> -		timeline = intel_timeline_create(&i915->gt);
> -		if (IS_ERR(timeline)) {
> +		ret = drm_syncobj_create(&ctx->syncobj,
> +					 DRM_SYNCOBJ_CREATE_SIGNALED,
> +					 NULL);
> +		if (ret) {
>   			context_close(ctx);
> -			return ERR_CAST(timeline);
> +			return ERR_PTR(ret);
>   		}
> -
> -		__assign_timeline(ctx, timeline);
> -		intel_timeline_put(timeline);
>   	}
>   
>   	trace_i915_context_create(ctx);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 676592e27e7d2..df76767f0c41b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -83,7 +83,19 @@ struct i915_gem_context {
>   	struct i915_gem_engines __rcu *engines;
>   	struct mutex engines_mutex; /* guards writes to engines */
>   
> -	struct intel_timeline *timeline;
> +	/**
> +	 * @syncobj: Shared timeline syncobj
> +	 *
> +	 * When the SHARED_TIMELINE flag is set on context creation, we
> +	 * emulate a single timeline across all engines using this syncobj.
> +	 * For every execbuffer2 call, this syncobj is used as both an in-
> +	 * and out-fence.  Unlike the real intel_timeline, this doesn't
> +	 * provide perfect atomic in-order guarantees if the client races
> +	 * with itself by calling execbuffer2 twice concurrently.  However,
> +	 * if userspace races with itself, that's not likely to yield well-
> +	 * defined results anyway so we choose to not care.
> +	 */
> +	struct drm_syncobj *syncobj;
>   
>   	/**
>   	 * @vm: unique address space (GTT)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index b812f313422a9..d640bba6ad9ab 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   		goto err_vma;
>   	}
>   
> +	if (unlikely(eb.gem_context->syncobj)) {
> +		struct dma_fence *fence;
> +
> +		fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
> +		err = i915_request_await_dma_fence(eb.request, fence);
> +		dma_fence_put(fence);
> +		if (err)
> +			goto err_ext;
> +	}
> +
>   	if (in_fence) {
>   		if (args->flags & I915_EXEC_FENCE_SUBMIT)
>   			err = i915_request_await_execution(eb.request,
> @@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   			fput(out_fence->file);
>   		}
>   	}
> +
> +	if (unlikely(eb.gem_context->syncobj)) {
> +		drm_syncobj_replace_fence(eb.gem_context->syncobj,
> +					  &eb.request->fence);
> +	}
> +
>   	i915_request_put(eb.request);
>   
>   err_vma:
> 

^ permalink raw reply	[flat|nested] 226+ messages in thread

>   {
>   	struct i915_gem_context *ctx;
> +	int ret;
>   
>   	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
>   	    !HAS_EXECLISTS(i915))
> @@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>   	}
>   
>   	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> -		struct intel_timeline *timeline;
> -
> -		timeline = intel_timeline_create(&i915->gt);
> -		if (IS_ERR(timeline)) {
> +		ret = drm_syncobj_create(&ctx->syncobj,
> +					 DRM_SYNCOBJ_CREATE_SIGNALED,
> +					 NULL);
> +		if (ret) {
>   			context_close(ctx);
> -			return ERR_CAST(timeline);
> +			return ERR_PTR(ret);
>   		}
> -
> -		__assign_timeline(ctx, timeline);
> -		intel_timeline_put(timeline);
>   	}
>   
>   	trace_i915_context_create(ctx);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 676592e27e7d2..df76767f0c41b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -83,7 +83,19 @@ struct i915_gem_context {
>   	struct i915_gem_engines __rcu *engines;
>   	struct mutex engines_mutex; /* guards writes to engines */
>   
> -	struct intel_timeline *timeline;
> +	/**
> +	 * @syncobj: Shared timeline syncobj
> +	 *
> +	 * When the SHARED_TIMELINE flag is set on context creation, we
> +	 * emulate a single timeline across all engines using this syncobj.
> +	 * For every execbuffer2 call, this syncobj is used as both an in-
> +	 * and out-fence.  Unlike the real intel_timeline, this doesn't
> +	 * provide perfect atomic in-order guarantees if the client races
> +	 * with itself by calling execbuffer2 twice concurrently.  However,
> +	 * if userspace races with itself, that's not likely to yield well-
> +	 * defined results anyway so we choose to not care.
> +	 */
> +	struct drm_syncobj *syncobj;
>   
>   	/**
>   	 * @vm: unique address space (GTT)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index b812f313422a9..d640bba6ad9ab 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   		goto err_vma;
>   	}
>   
> +	if (unlikely(eb.gem_context->syncobj)) {
> +		struct dma_fence *fence;
> +
> +		fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
> +		err = i915_request_await_dma_fence(eb.request, fence);
> +		dma_fence_put(fence);
> +		if (err)
> +			goto err_ext;
> +	}
> +
>   	if (in_fence) {
>   		if (args->flags & I915_EXEC_FENCE_SUBMIT)
>   			err = i915_request_await_execution(eb.request,
> @@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   			fput(out_fence->file);
>   		}
>   	}
> +
> +	if (unlikely(eb.gem_context->syncobj)) {
> +		drm_syncobj_replace_fence(eb.gem_context->syncobj,
> +					  &eb.request->fence);
> +	}
> +
>   	i915_request_put(eb.request);
>   
>   err_vma:
> 
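The await/replace pair in the execbuffer hunk above is the whole emulation. As a sanity check, here is a hypothetical Python model of that logic (the names are illustrative stand-ins, not the i915 or drm API); the gap between the get and the replace is exactly the window in which two concurrent execbuffer2 calls can race:

```python
# Hypothetical model of the syncobj-based SINGLE_TIMELINE emulation.
# Fence, Context and submit() are illustrative, not kernel API.

class Fence:
    def __init__(self, seqno):
        self.seqno = seqno


class Context:
    """The context's syncobj holds the fence of the last submission."""
    def __init__(self):
        self.syncobj_fence = None


def submit(ctx, out_fence):
    """One execbuffer2 call against a SINGLE_TIMELINE context.

    Await the previous fence, then publish our own.  The two steps are
    not atomic: two concurrent submits may both observe the same
    in-fence, so no ordering is guaranteed between them.
    """
    in_fence = ctx.syncobj_fence      # drm_syncobj_fence_get()
    # ... i915_request_await_dma_fence(request, in_fence) ...
    ctx.syncobj_fence = out_fence     # drm_syncobj_replace_fence()
    return in_fence


ctx = Context()
f1, f2 = Fence(1), Fence(2)
submit(ctx, f1)                 # first submit: no in-fence yet
in_fence = submit(ctx, f2)      # second submit waits on the first
```

Serial submits therefore chain exactly like the old shared intel_timeline; only submits that race with each other lose the ordering guarantee, as the commit message says.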
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 15:51     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-28 15:51 UTC (permalink / raw)
  To: Jason Ekstrand, intel-gfx, dri-devel


On 23/04/2021 23:31, Jason Ekstrand wrote:
> This adds a bunch of complexity which the media driver has never
> actually used.  The media driver does technically bond a balanced engine
> to another engine but the balanced engine only has one engine in the
> sibling set.  This doesn't actually result in a virtual engine.

For historical reference, this is not because the uapi was over-engineered 
but because certain SKUs never materialized.

Regards,

Tvrtko

> Unless some userspace badly wants it, there's no good reason to support
> this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> leave the validation code in place in case we ever decide we want to do
> something interesting with the bonding information.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
>   drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
>   .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
>   .../drm/i915/gt/intel_execlists_submission.h  |   4 -
>   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
>   6 files changed, 7 insertions(+), 353 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e8179918fa306..5f8d0faf783aa 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>   	}
>   	virtual = set->engines->engines[idx]->engine;
>   
> +	if (intel_engine_is_virtual(virtual)) {
> +		drm_dbg(&i915->drm,
> +			"Bonding with virtual engines not allowed\n");
> +		return -EINVAL;
> +	}
> +
>   	err = check_user_mbz(&ext->flags);
>   	if (err)
>   		return err;
> @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
>   				n, ci.engine_class, ci.engine_instance);
>   			return -EINVAL;
>   		}
> -
> -		/*
> -		 * A non-virtual engine has no siblings to choose between; and
> -		 * a submit fence will always be directed to the one engine.
> -		 */
> -		if (intel_engine_is_virtual(virtual)) {
> -			err = intel_virtual_engine_attach_bond(virtual,
> -							       master,
> -							       bond);
> -			if (err)
> -				return err;
> -		}
>   	}
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d640bba6ad9ab..efb2fa3522a42 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   		if (args->flags & I915_EXEC_FENCE_SUBMIT)
>   			err = i915_request_await_execution(eb.request,
>   							   in_fence,
> -							   eb.engine->bond_execute);
> +							   NULL);
>   		else
>   			err = i915_request_await_dma_fence(eb.request,
>   							   in_fence);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 883bafc449024..68cfe5080325c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -446,13 +446,6 @@ struct intel_engine_cs {
>   	 */
>   	void		(*submit_request)(struct i915_request *rq);
>   
> -	/*
> -	 * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> -	 * request down to the bonded pairs.
> -	 */
> -	void            (*bond_execute)(struct i915_request *rq,
> -					struct dma_fence *signal);
> -
>   	/*
>   	 * Call when the priority on a request has changed and it and its
>   	 * dependencies may need rescheduling. Note the request itself may
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index de124870af44d..b6e2b59f133b7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -181,18 +181,6 @@ struct virtual_engine {
>   		int prio;
>   	} nodes[I915_NUM_ENGINES];
>   
> -	/*
> -	 * Keep track of bonded pairs -- restrictions upon on our selection
> -	 * of physical engines any particular request may be submitted to.
> -	 * If we receive a submit-fence from a master engine, we will only
> -	 * use one of sibling_mask physical engines.
> -	 */
> -	struct ve_bond {
> -		const struct intel_engine_cs *master;
> -		intel_engine_mask_t sibling_mask;
> -	} *bonds;
> -	unsigned int num_bonds;
> -
>   	/* And finally, which physical engines this virtual engine maps onto. */
>   	unsigned int num_siblings;
>   	struct intel_engine_cs *siblings[];
> @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
>   	intel_breadcrumbs_free(ve->base.breadcrumbs);
>   	intel_engine_free_request_pool(&ve->base);
>   
> -	kfree(ve->bonds);
>   	kfree(ve);
>   }
>   
> @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
>   	spin_unlock_irqrestore(&ve->base.active.lock, flags);
>   }
>   
> -static struct ve_bond *
> -virtual_find_bond(struct virtual_engine *ve,
> -		  const struct intel_engine_cs *master)
> -{
> -	int i;
> -
> -	for (i = 0; i < ve->num_bonds; i++) {
> -		if (ve->bonds[i].master == master)
> -			return &ve->bonds[i];
> -	}
> -
> -	return NULL;
> -}
> -
> -static void
> -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> -{
> -	struct virtual_engine *ve = to_virtual_engine(rq->engine);
> -	intel_engine_mask_t allowed, exec;
> -	struct ve_bond *bond;
> -
> -	allowed = ~to_request(signal)->engine->mask;
> -
> -	bond = virtual_find_bond(ve, to_request(signal)->engine);
> -	if (bond)
> -		allowed &= bond->sibling_mask;
> -
> -	/* Restrict the bonded request to run on only the available engines */
> -	exec = READ_ONCE(rq->execution_mask);
> -	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> -		;
> -
> -	/* Prevent the master from being re-run on the bonded engines */
> -	to_request(signal)->execution_mask &= ~allowed;
> -}
> -
>   struct intel_context *
>   intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>   			       unsigned int count)
> @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>   
>   	ve->base.schedule = i915_schedule;
>   	ve->base.submit_request = virtual_submit_request;
> -	ve->base.bond_execute = virtual_bond_execute;
>   
>   	INIT_LIST_HEAD(virtual_queue(ve));
>   	ve->base.execlists.queue_priority_hint = INT_MIN;
> @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
>   	if (IS_ERR(dst))
>   		return dst;
>   
> -	if (se->num_bonds) {
> -		struct virtual_engine *de = to_virtual_engine(dst->engine);
> -
> -		de->bonds = kmemdup(se->bonds,
> -				    sizeof(*se->bonds) * se->num_bonds,
> -				    GFP_KERNEL);
> -		if (!de->bonds) {
> -			intel_context_put(dst);
> -			return ERR_PTR(-ENOMEM);
> -		}
> -
> -		de->num_bonds = se->num_bonds;
> -	}
> -
>   	return dst;
>   }
>   
> -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> -				     const struct intel_engine_cs *master,
> -				     const struct intel_engine_cs *sibling)
> -{
> -	struct virtual_engine *ve = to_virtual_engine(engine);
> -	struct ve_bond *bond;
> -	int n;
> -
> -	/* Sanity check the sibling is part of the virtual engine */
> -	for (n = 0; n < ve->num_siblings; n++)
> -		if (sibling == ve->siblings[n])
> -			break;
> -	if (n == ve->num_siblings)
> -		return -EINVAL;
> -
> -	bond = virtual_find_bond(ve, master);
> -	if (bond) {
> -		bond->sibling_mask |= sibling->mask;
> -		return 0;
> -	}
> -
> -	bond = krealloc(ve->bonds,
> -			sizeof(*bond) * (ve->num_bonds + 1),
> -			GFP_KERNEL);
> -	if (!bond)
> -		return -ENOMEM;
> -
> -	bond[ve->num_bonds].master = master;
> -	bond[ve->num_bonds].sibling_mask = sibling->mask;
> -
> -	ve->bonds = bond;
> -	ve->num_bonds++;
> -
> -	return 0;
> -}
> -
>   void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   				   struct drm_printer *m,
>   				   void (*show_request)(struct drm_printer *m,
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> index fd61dae820e9e..80cec37a56ba9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
>   struct intel_context *
>   intel_execlists_clone_virtual(struct intel_engine_cs *src);
>   
> -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> -				     const struct intel_engine_cs *master,
> -				     const struct intel_engine_cs *sibling);
> -
>   bool
>   intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
>   
> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> index 1081cd36a2bd3..f03446d587160 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
>   	return 0;
>   }
>   
> -static int bond_virtual_engine(struct intel_gt *gt,
> -			       unsigned int class,
> -			       struct intel_engine_cs **siblings,
> -			       unsigned int nsibling,
> -			       unsigned int flags)
> -#define BOND_SCHEDULE BIT(0)
> -{
> -	struct intel_engine_cs *master;
> -	struct i915_request *rq[16];
> -	enum intel_engine_id id;
> -	struct igt_spinner spin;
> -	unsigned long n;
> -	int err;
> -
> -	/*
> -	 * A set of bonded requests is intended to be run concurrently
> -	 * across a number of engines. We use one request per-engine
> -	 * and a magic fence to schedule each of the bonded requests
> -	 * at the same time. A consequence of our current scheduler is that
> -	 * we only move requests to the HW ready queue when the request
> -	 * becomes ready, that is when all of its prerequisite fences have
> -	 * been signaled. As one of those fences is the master submit fence,
> -	 * there is a delay on all secondary fences as the HW may be
> -	 * currently busy. Equally, as all the requests are independent,
> -	 * they may have other fences that delay individual request
> -	 * submission to HW. Ergo, we do not guarantee that all requests are
> -	 * immediately submitted to HW at the same time, just that if the
> -	 * rules are abided by, they are ready at the same time as the
> -	 * first is submitted. Userspace can embed semaphores in its batch
> -	 * to ensure parallel execution of its phases as it requires.
> -	 * Though naturally it gets requested that perhaps the scheduler should
> -	 * take care of parallel execution, even across preemption events on
> -	 * different HW. (The proper answer is of course "lalalala".)
> -	 *
> -	 * With the submit-fence, we have identified three possible phases
> -	 * of synchronisation depending on the master fence: queued (not
> -	 * ready), executing, and signaled. The first two are quite simple
> -	 * and checked below. However, the signaled master fence handling is
> -	 * contentious. Currently we do not distinguish between a signaled
> -	 * fence and an expired fence, as once signaled it does not convey
> -	 * any information about the previous execution. It may even be freed
> -	 * and hence checking later it may not exist at all. Ergo we currently
> -	 * do not apply the bonding constraint for an already signaled fence,
> -	 * as our expectation is that it should not constrain the secondaries
> -	 * and is outside of the scope of the bonded request API (i.e. all
> -	 * userspace requests are meant to be running in parallel). As
> -	 * it imposes no constraint, and is effectively a no-op, we do not
> -	 * check below as normal execution flows are checked extensively above.
> -	 *
> -	 * XXX Is the degenerate handling of signaled submit fences the
> -	 * expected behaviour for userpace?
> -	 */
> -
> -	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> -
> -	if (igt_spinner_init(&spin, gt))
> -		return -ENOMEM;
> -
> -	err = 0;
> -	rq[0] = ERR_PTR(-ENOMEM);
> -	for_each_engine(master, gt, id) {
> -		struct i915_sw_fence fence = {};
> -		struct intel_context *ce;
> -
> -		if (master->class == class)
> -			continue;
> -
> -		ce = intel_context_create(master);
> -		if (IS_ERR(ce)) {
> -			err = PTR_ERR(ce);
> -			goto out;
> -		}
> -
> -		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> -
> -		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> -		intel_context_put(ce);
> -		if (IS_ERR(rq[0])) {
> -			err = PTR_ERR(rq[0]);
> -			goto out;
> -		}
> -		i915_request_get(rq[0]);
> -
> -		if (flags & BOND_SCHEDULE) {
> -			onstack_fence_init(&fence);
> -			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> -							       &fence,
> -							       GFP_KERNEL);
> -		}
> -
> -		i915_request_add(rq[0]);
> -		if (err < 0)
> -			goto out;
> -
> -		if (!(flags & BOND_SCHEDULE) &&
> -		    !igt_wait_for_spinner(&spin, rq[0])) {
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			struct intel_context *ve;
> -
> -			ve = intel_execlists_create_virtual(siblings, nsibling);
> -			if (IS_ERR(ve)) {
> -				err = PTR_ERR(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_virtual_engine_attach_bond(ve->engine,
> -							       master,
> -							       siblings[n]);
> -			if (err) {
> -				intel_context_put(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_context_pin(ve);
> -			intel_context_put(ve);
> -			if (err) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			rq[n + 1] = i915_request_create(ve);
> -			intel_context_unpin(ve);
> -			if (IS_ERR(rq[n + 1])) {
> -				err = PTR_ERR(rq[n + 1]);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -			i915_request_get(rq[n + 1]);
> -
> -			err = i915_request_await_execution(rq[n + 1],
> -							   &rq[0]->fence,
> -							   ve->engine->bond_execute);
> -			i915_request_add(rq[n + 1]);
> -			if (err < 0) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -		}
> -		onstack_fence_fini(&fence);
> -		intel_engine_flush_submission(master);
> -		igt_spinner_end(&spin);
> -
> -		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> -			pr_err("Master request did not execute (on %s)!\n",
> -			       rq[0]->engine->name);
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			if (i915_request_wait(rq[n + 1], 0,
> -					      MAX_SCHEDULE_TIMEOUT) < 0) {
> -				err = -EIO;
> -				goto out;
> -			}
> -
> -			if (rq[n + 1]->engine != siblings[n]) {
> -				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> -				       siblings[n]->name,
> -				       rq[n + 1]->engine->name,
> -				       rq[0]->engine->name);
> -				err = -EINVAL;
> -				goto out;
> -			}
> -		}
> -
> -		for (n = 0; !IS_ERR(rq[n]); n++)
> -			i915_request_put(rq[n]);
> -		rq[0] = ERR_PTR(-ENOMEM);
> -	}
> -
> -out:
> -	for (n = 0; !IS_ERR(rq[n]); n++)
> -		i915_request_put(rq[n]);
> -	if (igt_flush_test(gt->i915))
> -		err = -EIO;
> -
> -	igt_spinner_fini(&spin);
> -	return err;
> -}
> -
> -static int live_virtual_bond(void *arg)
> -{
> -	static const struct phase {
> -		const char *name;
> -		unsigned int flags;
> -	} phases[] = {
> -		{ "", 0 },
> -		{ "schedule", BOND_SCHEDULE },
> -		{ },
> -	};
> -	struct intel_gt *gt = arg;
> -	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> -	unsigned int class;
> -	int err;
> -
> -	if (intel_uc_uses_guc_submission(&gt->uc))
> -		return 0;
> -
> -	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> -		const struct phase *p;
> -		int nsibling;
> -
> -		nsibling = select_siblings(gt, class, siblings);
> -		if (nsibling < 2)
> -			continue;
> -
> -		for (p = phases; p->name; p++) {
> -			err = bond_virtual_engine(gt,
> -						  class, siblings, nsibling,
> -						  p->flags);
> -			if (err) {
> -				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> -				       __func__, p->name, class, nsibling, err);
> -				return err;
> -			}
> -		}
> -	}
> -
> -	return 0;
> -}
> -
>   static int reset_virtual_engine(struct intel_gt *gt,
>   				struct intel_engine_cs **siblings,
>   				unsigned int nsibling)
> @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
>   		SUBTEST(live_virtual_mask),
>   		SUBTEST(live_virtual_preserved),
>   		SUBTEST(live_virtual_slice),
> -		SUBTEST(live_virtual_bond),
>   		SUBTEST(live_virtual_reset),
>   	};
>   
> 
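For reference, the bond bookkeeping deleted above reduces to a little mask arithmetic. A hypothetical Python model follows (engine masks as plain integers, the bond table as a list of (master, sibling_mask) pairs; none of these names are the kernel API):

```python
# Hypothetical model of the removed virtual_find_bond() /
# virtual_bond_execute() logic using integer bitmasks.

def find_bond(bonds, master):
    """Linear search of the bond table by master engine, mirroring
    virtual_find_bond()."""
    for bond_master, sibling_mask in bonds:
        if bond_master == master:
            return sibling_mask
    return None


def bond_execute(bonds, master, master_engine_mask,
                 rq_exec_mask, master_exec_mask):
    """Restrict a bonded request to the bond's siblings and keep the
    master off those engines, as virtual_bond_execute() did."""
    allowed = ~master_engine_mask            # never the master's engine
    sibling_mask = find_bond(bonds, master)
    if sibling_mask is not None:
        allowed &= sibling_mask
    rq_exec_mask &= allowed                  # restrict bonded request
    master_exec_mask &= ~allowed             # master avoids bonded engines
    return rq_exec_mask, master_exec_mask


# Example: engines are bits 0..3; the master ran on engine bit 0 and is
# bonded to siblings at bits 1 and 2.
bonds = [("rcs0", 0b0110)]
rq_mask, master_mask = bond_execute(bonds, "rcs0",
                                    master_engine_mask=0b0001,
                                    rq_exec_mask=0b1110,
                                    master_exec_mask=0b0001)
```

In the kernel version the request-mask update was done with a try_cmpxchg() loop because execution_mask can be written concurrently; the model above ignores that detail.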
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

> -	 * and checked below. However, the signaled master fence handling is
> -	 * contentious. Currently we do not distinguish between a signaled
> -	 * fence and an expired fence, as once signaled it does not convey
> -	 * any information about the previous execution. It may even be freed
> -	 * and hence checking later it may not exist at all. Ergo we currently
> -	 * do not apply the bonding constraint for an already signaled fence,
> -	 * as our expectation is that it should not constrain the secondaries
> -	 * and is outside of the scope of the bonded request API (i.e. all
> -	 * userspace requests are meant to be running in parallel). As
> -	 * it imposes no constraint, and is effectively a no-op, we do not
> -	 * check below as normal execution flows are checked extensively above.
> -	 *
> -	 * XXX Is the degenerate handling of signaled submit fences the
> -	 * expected behaviour for userpace?
> -	 */
> -
> -	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> -
> -	if (igt_spinner_init(&spin, gt))
> -		return -ENOMEM;
> -
> -	err = 0;
> -	rq[0] = ERR_PTR(-ENOMEM);
> -	for_each_engine(master, gt, id) {
> -		struct i915_sw_fence fence = {};
> -		struct intel_context *ce;
> -
> -		if (master->class == class)
> -			continue;
> -
> -		ce = intel_context_create(master);
> -		if (IS_ERR(ce)) {
> -			err = PTR_ERR(ce);
> -			goto out;
> -		}
> -
> -		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> -
> -		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> -		intel_context_put(ce);
> -		if (IS_ERR(rq[0])) {
> -			err = PTR_ERR(rq[0]);
> -			goto out;
> -		}
> -		i915_request_get(rq[0]);
> -
> -		if (flags & BOND_SCHEDULE) {
> -			onstack_fence_init(&fence);
> -			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> -							       &fence,
> -							       GFP_KERNEL);
> -		}
> -
> -		i915_request_add(rq[0]);
> -		if (err < 0)
> -			goto out;
> -
> -		if (!(flags & BOND_SCHEDULE) &&
> -		    !igt_wait_for_spinner(&spin, rq[0])) {
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			struct intel_context *ve;
> -
> -			ve = intel_execlists_create_virtual(siblings, nsibling);
> -			if (IS_ERR(ve)) {
> -				err = PTR_ERR(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_virtual_engine_attach_bond(ve->engine,
> -							       master,
> -							       siblings[n]);
> -			if (err) {
> -				intel_context_put(ve);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			err = intel_context_pin(ve);
> -			intel_context_put(ve);
> -			if (err) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -
> -			rq[n + 1] = i915_request_create(ve);
> -			intel_context_unpin(ve);
> -			if (IS_ERR(rq[n + 1])) {
> -				err = PTR_ERR(rq[n + 1]);
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -			i915_request_get(rq[n + 1]);
> -
> -			err = i915_request_await_execution(rq[n + 1],
> -							   &rq[0]->fence,
> -							   ve->engine->bond_execute);
> -			i915_request_add(rq[n + 1]);
> -			if (err < 0) {
> -				onstack_fence_fini(&fence);
> -				goto out;
> -			}
> -		}
> -		onstack_fence_fini(&fence);
> -		intel_engine_flush_submission(master);
> -		igt_spinner_end(&spin);
> -
> -		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> -			pr_err("Master request did not execute (on %s)!\n",
> -			       rq[0]->engine->name);
> -			err = -EIO;
> -			goto out;
> -		}
> -
> -		for (n = 0; n < nsibling; n++) {
> -			if (i915_request_wait(rq[n + 1], 0,
> -					      MAX_SCHEDULE_TIMEOUT) < 0) {
> -				err = -EIO;
> -				goto out;
> -			}
> -
> -			if (rq[n + 1]->engine != siblings[n]) {
> -				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> -				       siblings[n]->name,
> -				       rq[n + 1]->engine->name,
> -				       rq[0]->engine->name);
> -				err = -EINVAL;
> -				goto out;
> -			}
> -		}
> -
> -		for (n = 0; !IS_ERR(rq[n]); n++)
> -			i915_request_put(rq[n]);
> -		rq[0] = ERR_PTR(-ENOMEM);
> -	}
> -
> -out:
> -	for (n = 0; !IS_ERR(rq[n]); n++)
> -		i915_request_put(rq[n]);
> -	if (igt_flush_test(gt->i915))
> -		err = -EIO;
> -
> -	igt_spinner_fini(&spin);
> -	return err;
> -}
> -
> -static int live_virtual_bond(void *arg)
> -{
> -	static const struct phase {
> -		const char *name;
> -		unsigned int flags;
> -	} phases[] = {
> -		{ "", 0 },
> -		{ "schedule", BOND_SCHEDULE },
> -		{ },
> -	};
> -	struct intel_gt *gt = arg;
> -	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> -	unsigned int class;
> -	int err;
> -
> -	if (intel_uc_uses_guc_submission(&gt->uc))
> -		return 0;
> -
> -	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> -		const struct phase *p;
> -		int nsibling;
> -
> -		nsibling = select_siblings(gt, class, siblings);
> -		if (nsibling < 2)
> -			continue;
> -
> -		for (p = phases; p->name; p++) {
> -			err = bond_virtual_engine(gt,
> -						  class, siblings, nsibling,
> -						  p->flags);
> -			if (err) {
> -				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> -				       __func__, p->name, class, nsibling, err);
> -				return err;
> -			}
> -		}
> -	}
> -
> -	return 0;
> -}
> -
>   static int reset_virtual_engine(struct intel_gt *gt,
>   				struct intel_engine_cs **siblings,
>   				unsigned int nsibling)
> @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
>   		SUBTEST(live_virtual_mask),
>   		SUBTEST(live_virtual_preserved),
>   		SUBTEST(live_virtual_slice),
> -		SUBTEST(live_virtual_bond),
>   		SUBTEST(live_virtual_reset),
>   	};
>   
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 15:55     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-28 15:55 UTC (permalink / raw)
  To: Jason Ekstrand, intel-gfx, dri-devel


On 23/04/2021 23:31, Jason Ekstrand wrote:
> Instead of handling it like a context param, unconditionally set it when
> intel_contexts are created.  This doesn't fix anything but does simplify
> the code a bit.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
>   .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
>   drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
>   3 files changed, 6 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 35bcdeddfbf3f..1091cc04a242a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
>   	    intel_engine_has_timeslices(ce->engine))
>   		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
>   
> -	intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> +	if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> +	    ctx->i915->params.request_timeout_ms) {
> +		unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> +		intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);

Blank line between declarations and code please, or just lose the local.

Otherwise looks okay. The slight change that the same GEM context can 
now have a mix of different request expirations isn't concerning, I 
think. In any case, that difference goes away by the end of the series.

Regards,

Tvrtko

> +	}
>   }
>   
>   static void __free_engines(struct i915_gem_engines *e, unsigned int count)
> @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
>   	context_apply_all(ctx, __apply_timeline, timeline);
>   }
>   
> -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
> -{
> -	return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
> -}
> -
> -static int
> -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
> -{
> -	int ret;
> -
> -	ret = context_apply_all(ctx, __apply_watchdog,
> -				(void *)(uintptr_t)timeout_us);
> -	if (!ret)
> -		ctx->watchdog.timeout_us = timeout_us;
> -
> -	return ret;
> -}
> -
> -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
> -{
> -	struct drm_i915_private *i915 = ctx->i915;
> -	int ret;
> -
> -	if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
> -	    !i915->params.request_timeout_ms)
> -		return;
> -
> -	/* Default expiry for user fences. */
> -	ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
> -	if (ret)
> -		drm_notice(&i915->drm,
> -			   "Failed to configure default fence expiry! (%d)",
> -			   ret);
> -}
> -
>   static struct i915_gem_context *
>   i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>   {
> @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>   		intel_timeline_put(timeline);
>   	}
>   
> -	__set_default_fence_expiry(ctx);
> -
>   	trace_i915_context_create(ctx);
>   
>   	return ctx;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 5ae71ec936f7c..676592e27e7d2 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -153,10 +153,6 @@ struct i915_gem_context {
>   	 */
>   	atomic_t active_count;
>   
> -	struct {
> -		u64 timeout_us;
> -	} watchdog;
> -
>   	/**
>   	 * @hang_timestamp: The last time(s) this context caused a GPU hang
>   	 */
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> index dffedd983693d..0c69cb42d075c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> @@ -10,11 +10,10 @@
>   
>   #include "intel_context.h"
>   
> -static inline int
> +static inline void
>   intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
>   {
>   	ce->watchdog.timeout_us = timeout_us;
> -	return 0;
>   }
>   
>   #endif /* INTEL_CONTEXT_PARAM_H */
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 05/21] drm/i915: Drop the CONTEXT_CLONE API
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 15:59     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-28 15:59 UTC (permalink / raw)
  To: Jason Ekstrand, intel-gfx, dri-devel


On 23/04/2021 23:31, Jason Ekstrand wrote:
> This API allows one context to grab bits out of another context upon
> creation.  It can be used as a short-cut for setparam(getparam()) for
> things like I915_CONTEXT_PARAM_VM.  However, it's never been used by any
> real userspace.  It's used by a few IGT tests and that's it.  Since it
> doesn't add any real value (most of the stuff you can CLONE you can copy
> in other ways), drop it.
> 
> There is one thing that this API allows you to clone which you cannot
> clone via getparam/setparam: timelines.  However, timelines are an
> implementation detail of i915 and not really something that needs to be
> exposed to userspace.  Also, sharing timelines between contexts isn't
> obviously useful and supporting it has the potential to complicate i915
> internally.  It also doesn't add any functionality that the client can't
> get in other ways.  If a client really wants a shared timeline, they can
> use a syncobj and set it as an in and out fence on every submit.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

As mentioned before, I have no major problem with removing unused uAPI, 
apart from disagreeing on when to do it, and the fact that I find 
cloning a very plausible equivalent of clone(2), which is an 
established and nice model. So a sad ack is all I can give.

Regards,

Tvrtko

> ---
>   drivers/gpu/drm/i915/gem/i915_gem_context.c | 199 +-------------------
>   include/uapi/drm/i915_drm.h                 |  16 +-
>   2 files changed, 6 insertions(+), 209 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 8a77855123cec..2c2fefa912805 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1958,207 +1958,14 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
>   	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
>   }
>   
> -static int clone_engines(struct i915_gem_context *dst,
> -			 struct i915_gem_context *src)
> +static int invalid_ext(struct i915_user_extension __user *ext, void *data)
>   {
> -	struct i915_gem_engines *clone, *e;
> -	bool user_engines;
> -	unsigned long n;
> -
> -	e = __context_engines_await(src, &user_engines);
> -	if (!e)
> -		return -ENOENT;
> -
> -	clone = alloc_engines(e->num_engines);
> -	if (!clone)
> -		goto err_unlock;
> -
> -	for (n = 0; n < e->num_engines; n++) {
> -		struct intel_engine_cs *engine;
> -
> -		if (!e->engines[n]) {
> -			clone->engines[n] = NULL;
> -			continue;
> -		}
> -		engine = e->engines[n]->engine;
> -
> -		/*
> -		 * Virtual engines are singletons; they can only exist
> -		 * inside a single context, because they embed their
> -		 * HW context... As each virtual context implies a single
> -		 * timeline (each engine can only dequeue a single request
> -		 * at any time), it would be surprising for two contexts
> -		 * to use the same engine. So let's create a copy of
> -		 * the virtual engine instead.
> -		 */
> -		if (intel_engine_is_virtual(engine))
> -			clone->engines[n] =
> -				intel_execlists_clone_virtual(engine);
> -		else
> -			clone->engines[n] = intel_context_create(engine);
> -		if (IS_ERR_OR_NULL(clone->engines[n])) {
> -			__free_engines(clone, n);
> -			goto err_unlock;
> -		}
> -
> -		intel_context_set_gem(clone->engines[n], dst);
> -	}
> -	clone->num_engines = n;
> -	i915_sw_fence_complete(&e->fence);
> -
> -	/* Serialised by constructor */
> -	engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
> -	if (user_engines)
> -		i915_gem_context_set_user_engines(dst);
> -	else
> -		i915_gem_context_clear_user_engines(dst);
> -	return 0;
> -
> -err_unlock:
> -	i915_sw_fence_complete(&e->fence);
> -	return -ENOMEM;
> -}
> -
> -static int clone_flags(struct i915_gem_context *dst,
> -		       struct i915_gem_context *src)
> -{
> -	dst->user_flags = src->user_flags;
> -	return 0;
> -}
> -
> -static int clone_schedattr(struct i915_gem_context *dst,
> -			   struct i915_gem_context *src)
> -{
> -	dst->sched = src->sched;
> -	return 0;
> -}
> -
> -static int clone_sseu(struct i915_gem_context *dst,
> -		      struct i915_gem_context *src)
> -{
> -	struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
> -	struct i915_gem_engines *clone;
> -	unsigned long n;
> -	int err;
> -
> -	/* no locking required; sole access under constructor*/
> -	clone = __context_engines_static(dst);
> -	if (e->num_engines != clone->num_engines) {
> -		err = -EINVAL;
> -		goto unlock;
> -	}
> -
> -	for (n = 0; n < e->num_engines; n++) {
> -		struct intel_context *ce = e->engines[n];
> -
> -		if (clone->engines[n]->engine->class != ce->engine->class) {
> -			/* Must have compatible engine maps! */
> -			err = -EINVAL;
> -			goto unlock;
> -		}
> -
> -		/* serialises with set_sseu */
> -		err = intel_context_lock_pinned(ce);
> -		if (err)
> -			goto unlock;
> -
> -		clone->engines[n]->sseu = ce->sseu;
> -		intel_context_unlock_pinned(ce);
> -	}
> -
> -	err = 0;
> -unlock:
> -	i915_gem_context_unlock_engines(src);
> -	return err;
> -}
> -
> -static int clone_timeline(struct i915_gem_context *dst,
> -			  struct i915_gem_context *src)
> -{
> -	if (src->timeline)
> -		__assign_timeline(dst, src->timeline);
> -
> -	return 0;
> -}
> -
> -static int clone_vm(struct i915_gem_context *dst,
> -		    struct i915_gem_context *src)
> -{
> -	struct i915_address_space *vm;
> -	int err = 0;
> -
> -	if (!rcu_access_pointer(src->vm))
> -		return 0;
> -
> -	rcu_read_lock();
> -	vm = context_get_vm_rcu(src);
> -	rcu_read_unlock();
> -
> -	if (!mutex_lock_interruptible(&dst->mutex)) {
> -		__assign_ppgtt(dst, vm);
> -		mutex_unlock(&dst->mutex);
> -	} else {
> -		err = -EINTR;
> -	}
> -
> -	i915_vm_put(vm);
> -	return err;
> -}
> -
> -static int create_clone(struct i915_user_extension __user *ext, void *data)
> -{
> -	static int (* const fn[])(struct i915_gem_context *dst,
> -				  struct i915_gem_context *src) = {
> -#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
> -		MAP(ENGINES, clone_engines),
> -		MAP(FLAGS, clone_flags),
> -		MAP(SCHEDATTR, clone_schedattr),
> -		MAP(SSEU, clone_sseu),
> -		MAP(TIMELINE, clone_timeline),
> -		MAP(VM, clone_vm),
> -#undef MAP
> -	};
> -	struct drm_i915_gem_context_create_ext_clone local;
> -	const struct create_ext *arg = data;
> -	struct i915_gem_context *dst = arg->ctx;
> -	struct i915_gem_context *src;
> -	int err, bit;
> -
> -	if (copy_from_user(&local, ext, sizeof(local)))
> -		return -EFAULT;
> -
> -	BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
> -		     I915_CONTEXT_CLONE_UNKNOWN);
> -
> -	if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
> -		return -EINVAL;
> -
> -	if (local.rsvd)
> -		return -EINVAL;
> -
> -	rcu_read_lock();
> -	src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
> -	rcu_read_unlock();
> -	if (!src)
> -		return -ENOENT;
> -
> -	GEM_BUG_ON(src == dst);
> -
> -	for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
> -		if (!(local.flags & BIT(bit)))
> -			continue;
> -
> -		err = fn[bit](dst, src);
> -		if (err)
> -			return err;
> -	}
> -
> -	return 0;
> +	return -EINVAL;
>   }
>   
>   static const i915_user_extension_fn create_extensions[] = {
>   	[I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
> -	[I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
> +	[I915_CONTEXT_CREATE_EXT_CLONE] = invalid_ext,
>   };
>   
>   static bool client_is_banned(struct drm_i915_file_private *file_priv)
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index a0aaa8298f28d..75a71b6756ed8 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1887,20 +1887,10 @@ struct drm_i915_gem_context_create_ext_setparam {
>   	struct drm_i915_gem_context_param param;
>   };
>   
> -struct drm_i915_gem_context_create_ext_clone {
> +/* This API has been removed.  On the off chance someone somewhere has
> + * attempted to use it, never re-use this extension number.
> + */
>   #define I915_CONTEXT_CREATE_EXT_CLONE 1
> -	struct i915_user_extension base;
> -	__u32 clone_id;
> -	__u32 flags;
> -#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
> -#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
> -#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
> -#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
> -#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
> -#define I915_CONTEXT_CLONE_VM		(1u << 5)
> -#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
> -	__u64 rsvd;
> -};
>   
>   struct drm_i915_gem_context_destroy {
>   	__u32 ctx_id;
> 

^ permalink raw reply	[flat|nested] 226+ messages in thread

> -	[I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
> +	[I915_CONTEXT_CREATE_EXT_CLONE] = invalid_ext,
>   };
>   
>   static bool client_is_banned(struct drm_i915_file_private *file_priv)
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index a0aaa8298f28d..75a71b6756ed8 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1887,20 +1887,10 @@ struct drm_i915_gem_context_create_ext_setparam {
>   	struct drm_i915_gem_context_param param;
>   };
>   
> -struct drm_i915_gem_context_create_ext_clone {
> +/* This API has been removed.  On the off chance someone somewhere has
> + * attempted to use it, never re-use this extension number.
> + */
>   #define I915_CONTEXT_CREATE_EXT_CLONE 1
> -	struct i915_user_extension base;
> -	__u32 clone_id;
> -	__u32 flags;
> -#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
> -#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
> -#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
> -#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
> -#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
> -#define I915_CONTEXT_CLONE_VM		(1u << 5)
> -#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
> -	__u64 rsvd;
> -};
>   
>   struct drm_i915_gem_context_destroy {
>   	__u32 ctx_id;
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-28 14:26           ` Tvrtko Ursulin
@ 2021-04-28 17:09             ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:09 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 9:26 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
> On 28/04/2021 15:02, Daniel Vetter wrote:
> > On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
> >>
> >> On 28/04/2021 11:16, Daniel Vetter wrote:
> >>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
> >>>> There's no sense in allowing userspace to create more engines than it
> >>>> can possibly access via execbuf.
> >>>>
> >>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> >>>> ---
> >>>>    drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
> >>>>    1 file changed, 3 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>> index 5f8d0faf783aa..ecb3bf5369857 100644
> >>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
> >>>>                    return -EINVAL;
> >>>>            }
> >>>> -  /*
> >>>> -   * Note that I915_EXEC_RING_MASK limits execbuf to only using the
> >>>> -   * first 64 engines defined here.
> >>>> -   */
> >>>>            num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> >>>
> >>> Maybe add a comment like /* RING_MASK has no shift, so can be used
> >>> directly here */ since I had to check that :-)
> >>>
> >>> Same story about igt testcases needed, just to be sure.
> >>>
> >>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>
> >> I am not sure about the churn vs benefit ratio here. There are also patches
> >> which extend the engine selection field in execbuf2 over the unused
> >> constant bits (with an explicit flag). So churn upstream and churn in
> >> internal (if interesting) for not much benefit.
> >
> > This isn't churn.
> >
> > This is "lock down uapi properly".

Pretty much.

> IMO it is a "meh" patch. Doesn't fix any problems and will create work
> for other people and man hours spent which no one will ever properly
> account against.
>
> Number of engines in the engine map should not really be tied to
> execbuf2. As is demonstrated by the incoming work to address more than
> 63 engines, either as an extension to execbuf2 or future execbuf3.

Which userspace driver has requested more than 64 engines in a single context?

Also, for execbuf3, I'd like to get rid of contexts entirely and have
engines be their own userspace-visible object.  If we go this
direction, you can have UINT32_MAX of them.  Problem solved.

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 10:13       ` [Intel-gfx] " Daniel Vetter
@ 2021-04-28 17:18         ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > >
> > > This adds a bunch of complexity which the media driver has never
> > > actually used.  The media driver does technically bond a balanced engine
> > > to another engine but the balanced engine only has one engine in the
> > > sibling set.  This doesn't actually result in a virtual engine.
> > >
> > > Unless some userspace badly wants it, there's no good reason to support
> > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > leave the validation code in place in case we ever decide we want to do
> > > something interesting with the bonding information.
> > >
> > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index e8179918fa306..5f8d0faf783aa 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > >         }
> > >         virtual = set->engines->engines[idx]->engine;
> > >
> > > +       if (intel_engine_is_virtual(virtual)) {
> > > +               drm_dbg(&i915->drm,
> > > +                       "Bonding with virtual engines not allowed\n");
> > > +               return -EINVAL;
> > > +       }
> > > +
> > >         err = check_user_mbz(&ext->flags);
> > >         if (err)
> > >                 return err;
> > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > >                                 n, ci.engine_class, ci.engine_instance);
> > >                         return -EINVAL;
> > >                 }
> > > -
> > > -               /*
> > > -                * A non-virtual engine has no siblings to choose between; and
> > > -                * a submit fence will always be directed to the one engine.
> > > -                */
> > > -               if (intel_engine_is_virtual(virtual)) {
> > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > -                                                              master,
> > > -                                                              bond);
> > > -                       if (err)
> > > -                               return err;
> > > -               }
> > >         }
> > >
> > >         return 0;
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > >                         err = i915_request_await_execution(eb.request,
> > >                                                            in_fence,
> > > -                                                          eb.engine->bond_execute);
> > > +                                                          NULL);
> > >                 else
> > >                         err = i915_request_await_dma_fence(eb.request,
> > >                                                            in_fence);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > index 883bafc449024..68cfe5080325c 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > >          */
> > >         void            (*submit_request)(struct i915_request *rq);
> > >
> > > -       /*
> > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > -        * request down to the bonded pairs.
> > > -        */
> > > -       void            (*bond_execute)(struct i915_request *rq,
> > > -                                       struct dma_fence *signal);
> > > -
> > >         /*
> > >          * Call when the priority on a request has changed and it and its
> > >          * dependencies may need rescheduling. Note the request itself may
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > index de124870af44d..b6e2b59f133b7 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > >                 int prio;
> > >         } nodes[I915_NUM_ENGINES];
> > >
> > > -       /*
> > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > -        * of physical engines any particular request may be submitted to.
> > > -        * If we receive a submit-fence from a master engine, we will only
> > > -        * use one of sibling_mask physical engines.
> > > -        */
> > > -       struct ve_bond {
> > > -               const struct intel_engine_cs *master;
> > > -               intel_engine_mask_t sibling_mask;
> > > -       } *bonds;
> > > -       unsigned int num_bonds;
> > > -
> > >         /* And finally, which physical engines this virtual engine maps onto. */
> > >         unsigned int num_siblings;
> > >         struct intel_engine_cs *siblings[];
> > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > >         intel_engine_free_request_pool(&ve->base);
> > >
> > > -       kfree(ve->bonds);
> > >         kfree(ve);
> > >  }
> > >
> > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > >  }
> > >
> > > -static struct ve_bond *
> > > -virtual_find_bond(struct virtual_engine *ve,
> > > -                 const struct intel_engine_cs *master)
> > > -{
> > > -       int i;
> > > -
> > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > -               if (ve->bonds[i].master == master)
> > > -                       return &ve->bonds[i];
> > > -       }
> > > -
> > > -       return NULL;
> > > -}
> > > -
> > > -static void
> > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > -{
> > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > -       intel_engine_mask_t allowed, exec;
> > > -       struct ve_bond *bond;
> > > -
> > > -       allowed = ~to_request(signal)->engine->mask;
> > > -
> > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > -       if (bond)
> > > -               allowed &= bond->sibling_mask;
> > > -
> > > -       /* Restrict the bonded request to run on only the available engines */
> > > -       exec = READ_ONCE(rq->execution_mask);
> > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > -               ;
> > > -
> > > -       /* Prevent the master from being re-run on the bonded engines */
> > > -       to_request(signal)->execution_mask &= ~allowed;
> >
> > I sent a v2 of this patch because it turns out I deleted a bit too
> > much code.  This function in particular, has to stay, unfortunately.
> > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > the work onto a different engine than the one it's supposed to
> > run in parallel with.  This means we can't dead-code this function or
> > the bond_execution function pointer and related stuff.
>
> Uh that's disappointing, since if I understand your point correctly, the
> sibling engines should all be singletons, not load balancing virtual ones.
> So there really should not be any need to pick the right one at execution
> time.

The media driver itself seems to work fine if I delete all the code.
It's just an IGT testcase that blows up.  I'll do more digging to see
if I can better isolate why.

--Jason

> At least my understanding is that we're only limiting the engine set
> further, so if both signaller and signalled request can only run on
> singletons (which must be distinct, or the bonded parameter validation is
> busted) there's really nothing to do here.
>
> Also this is the locking code that freaks me out about the current bonded
> execlist code ...
>
> Dazzled and confused.
> -Daniel
>
> >
> > --Jason
> >
> >
> > > -}
> > > -
> > >  struct intel_context *
> > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > >                                unsigned int count)
> > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > >
> > >         ve->base.schedule = i915_schedule;
> > >         ve->base.submit_request = virtual_submit_request;
> > > -       ve->base.bond_execute = virtual_bond_execute;
> > >
> > >         INIT_LIST_HEAD(virtual_queue(ve));
> > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > >         if (IS_ERR(dst))
> > >                 return dst;
> > >
> > > -       if (se->num_bonds) {
> > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > -
> > > -               de->bonds = kmemdup(se->bonds,
> > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > -                                   GFP_KERNEL);
> > > -               if (!de->bonds) {
> > > -                       intel_context_put(dst);
> > > -                       return ERR_PTR(-ENOMEM);
> > > -               }
> > > -
> > > -               de->num_bonds = se->num_bonds;
> > > -       }
> > > -
> > >         return dst;
> > >  }
> > >
> > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > -                                    const struct intel_engine_cs *master,
> > > -                                    const struct intel_engine_cs *sibling)
> > > -{
> > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > -       struct ve_bond *bond;
> > > -       int n;
> > > -
> > > -       /* Sanity check the sibling is part of the virtual engine */
> > > -       for (n = 0; n < ve->num_siblings; n++)
> > > -               if (sibling == ve->siblings[n])
> > > -                       break;
> > > -       if (n == ve->num_siblings)
> > > -               return -EINVAL;
> > > -
> > > -       bond = virtual_find_bond(ve, master);
> > > -       if (bond) {
> > > -               bond->sibling_mask |= sibling->mask;
> > > -               return 0;
> > > -       }
> > > -
> > > -       bond = krealloc(ve->bonds,
> > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > -                       GFP_KERNEL);
> > > -       if (!bond)
> > > -               return -ENOMEM;
> > > -
> > > -       bond[ve->num_bonds].master = master;
> > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > -
> > > -       ve->bonds = bond;
> > > -       ve->num_bonds++;
> > > -
> > > -       return 0;
> > > -}
> > > -
> > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > >                                    struct drm_printer *m,
> > >                                    void (*show_request)(struct drm_printer *m,
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > index fd61dae820e9e..80cec37a56ba9 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > >  struct intel_context *
> > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > >
> > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > -                                    const struct intel_engine_cs *master,
> > > -                                    const struct intel_engine_cs *sibling);
> > > -
> > >  bool
> > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > index 1081cd36a2bd3..f03446d587160 100644
> > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > >         return 0;
> > >  }
> > >
> > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > -                              unsigned int class,
> > > -                              struct intel_engine_cs **siblings,
> > > -                              unsigned int nsibling,
> > > -                              unsigned int flags)
> > > -#define BOND_SCHEDULE BIT(0)
> > > -{
> > > -       struct intel_engine_cs *master;
> > > -       struct i915_request *rq[16];
> > > -       enum intel_engine_id id;
> > > -       struct igt_spinner spin;
> > > -       unsigned long n;
> > > -       int err;
> > > -
> > > -       /*
> > > -        * A set of bonded requests is intended to be run concurrently
> > > -        * across a number of engines. We use one request per-engine
> > > -        * and a magic fence to schedule each of the bonded requests
> > > -        * at the same time. A consequence of our current scheduler is that
> > > -        * we only move requests to the HW ready queue when the request
> > > -        * becomes ready, that is when all of its prerequisite fences have
> > > -        * been signaled. As one of those fences is the master submit fence,
> > > -        * there is a delay on all secondary fences as the HW may be
> > > -        * currently busy. Equally, as all the requests are independent,
> > > -        * they may have other fences that delay individual request
> > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > -        * immediately submitted to HW at the same time, just that if the
> > > -        * rules are abided by, they are ready at the same time as the
> > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > -        * to ensure parallel execution of its phases as it requires.
> > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > -        * take care of parallel execution, even across preemption events on
> > > -        * different HW. (The proper answer is of course "lalalala".)
> > > -        *
> > > -        * With the submit-fence, we have identified three possible phases
> > > -        * of synchronisation depending on the master fence: queued (not
> > > -        * ready), executing, and signaled. The first two are quite simple
> > > -        * and checked below. However, the signaled master fence handling is
> > > -        * contentious. Currently we do not distinguish between a signaled
> > > -        * fence and an expired fence, as once signaled it does not convey
> > > -        * any information about the previous execution. It may even be freed
> > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > -        * do not apply the bonding constraint for an already signaled fence,
> > > -        * as our expectation is that it should not constrain the secondaries
> > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > -        * userspace requests are meant to be running in parallel). As
> > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > -        * check below as normal execution flows are checked extensively above.
> > > -        *
> > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > -        * expected behaviour for userpace?
> > > -        */
> > > -
> > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > -
> > > -       if (igt_spinner_init(&spin, gt))
> > > -               return -ENOMEM;
> > > -
> > > -       err = 0;
> > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > -       for_each_engine(master, gt, id) {
> > > -               struct i915_sw_fence fence = {};
> > > -               struct intel_context *ce;
> > > -
> > > -               if (master->class == class)
> > > -                       continue;
> > > -
> > > -               ce = intel_context_create(master);
> > > -               if (IS_ERR(ce)) {
> > > -                       err = PTR_ERR(ce);
> > > -                       goto out;
> > > -               }
> > > -
> > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > -
> > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > -               intel_context_put(ce);
> > > -               if (IS_ERR(rq[0])) {
> > > -                       err = PTR_ERR(rq[0]);
> > > -                       goto out;
> > > -               }
> > > -               i915_request_get(rq[0]);
> > > -
> > > -               if (flags & BOND_SCHEDULE) {
> > > -                       onstack_fence_init(&fence);
> > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > -                                                              &fence,
> > > -                                                              GFP_KERNEL);
> > > -               }
> > > -
> > > -               i915_request_add(rq[0]);
> > > -               if (err < 0)
> > > -                       goto out;
> > > -
> > > -               if (!(flags & BOND_SCHEDULE) &&
> > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > -                       err = -EIO;
> > > -                       goto out;
> > > -               }
> > > -
> > > -               for (n = 0; n < nsibling; n++) {
> > > -                       struct intel_context *ve;
> > > -
> > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > -                       if (IS_ERR(ve)) {
> > > -                               err = PTR_ERR(ve);
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > -                                                              master,
> > > -                                                              siblings[n]);
> > > -                       if (err) {
> > > -                               intel_context_put(ve);
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       err = intel_context_pin(ve);
> > > -                       intel_context_put(ve);
> > > -                       if (err) {
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       rq[n + 1] = i915_request_create(ve);
> > > -                       intel_context_unpin(ve);
> > > -                       if (IS_ERR(rq[n + 1])) {
> > > -                               err = PTR_ERR(rq[n + 1]);
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -                       i915_request_get(rq[n + 1]);
> > > -
> > > -                       err = i915_request_await_execution(rq[n + 1],
> > > -                                                          &rq[0]->fence,
> > > -                                                          ve->engine->bond_execute);
> > > -                       i915_request_add(rq[n + 1]);
> > > -                       if (err < 0) {
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -               }
> > > -               onstack_fence_fini(&fence);
> > > -               intel_engine_flush_submission(master);
> > > -               igt_spinner_end(&spin);
> > > -
> > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > -                              rq[0]->engine->name);
> > > -                       err = -EIO;
> > > -                       goto out;
> > > -               }
> > > -
> > > -               for (n = 0; n < nsibling; n++) {
> > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > -                               err = -EIO;
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > -                                      siblings[n]->name,
> > > -                                      rq[n + 1]->engine->name,
> > > -                                      rq[0]->engine->name);
> > > -                               err = -EINVAL;
> > > -                               goto out;
> > > -                       }
> > > -               }
> > > -
> > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > -                       i915_request_put(rq[n]);
> > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > -       }
> > > -
> > > -out:
> > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > -               i915_request_put(rq[n]);
> > > -       if (igt_flush_test(gt->i915))
> > > -               err = -EIO;
> > > -
> > > -       igt_spinner_fini(&spin);
> > > -       return err;
> > > -}
> > > -
> > > -static int live_virtual_bond(void *arg)
> > > -{
> > > -       static const struct phase {
> > > -               const char *name;
> > > -               unsigned int flags;
> > > -       } phases[] = {
> > > -               { "", 0 },
> > > -               { "schedule", BOND_SCHEDULE },
> > > -               { },
> > > -       };
> > > -       struct intel_gt *gt = arg;
> > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > -       unsigned int class;
> > > -       int err;
> > > -
> > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > -               return 0;
> > > -
> > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > -               const struct phase *p;
> > > -               int nsibling;
> > > -
> > > -               nsibling = select_siblings(gt, class, siblings);
> > > -               if (nsibling < 2)
> > > -                       continue;
> > > -
> > > -               for (p = phases; p->name; p++) {
> > > -                       err = bond_virtual_engine(gt,
> > > -                                                 class, siblings, nsibling,
> > > -                                                 p->flags);
> > > -                       if (err) {
> > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > -                                      __func__, p->name, class, nsibling, err);
> > > -                               return err;
> > > -                       }
> > > -               }
> > > -       }
> > > -
> > > -       return 0;
> > > -}
> > > -
> > >  static int reset_virtual_engine(struct intel_gt *gt,
> > >                                 struct intel_engine_cs **siblings,
> > >                                 unsigned int nsibling)
> > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > >                 SUBTEST(live_virtual_mask),
> > >                 SUBTEST(live_virtual_preserved),
> > >                 SUBTEST(live_virtual_slice),
> > > -               SUBTEST(live_virtual_bond),
> > >                 SUBTEST(live_virtual_reset),
> > >         };
> > >
> > > --
> > > 2.31.1
> > >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-28 17:18         ` Jason Ekstrand
  0 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Mailing list - DRI developers

On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > >
> > > This adds a bunch of complexity which the media driver has never
> > > actually used.  The media driver does technically bond a balanced engine
> > > to another engine but the balanced engine only has one engine in the
> > > sibling set.  This doesn't actually result in a virtual engine.
> > >
> > > Unless some userspace badly wants it, there's no good reason to support
> > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > leave the validation code in place in case we ever decide we want to do
> > > something interesting with the bonding information.
> > >
> > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index e8179918fa306..5f8d0faf783aa 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > >         }
> > >         virtual = set->engines->engines[idx]->engine;
> > >
> > > +       if (intel_engine_is_virtual(virtual)) {
> > > +               drm_dbg(&i915->drm,
> > > +                       "Bonding with virtual engines not allowed\n");
> > > +               return -EINVAL;
> > > +       }
> > > +
> > >         err = check_user_mbz(&ext->flags);
> > >         if (err)
> > >                 return err;
> > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > >                                 n, ci.engine_class, ci.engine_instance);
> > >                         return -EINVAL;
> > >                 }
> > > -
> > > -               /*
> > > -                * A non-virtual engine has no siblings to choose between; and
> > > -                * a submit fence will always be directed to the one engine.
> > > -                */
> > > -               if (intel_engine_is_virtual(virtual)) {
> > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > -                                                              master,
> > > -                                                              bond);
> > > -                       if (err)
> > > -                               return err;
> > > -               }
> > >         }
> > >
> > >         return 0;
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > >                         err = i915_request_await_execution(eb.request,
> > >                                                            in_fence,
> > > -                                                          eb.engine->bond_execute);
> > > +                                                          NULL);
> > >                 else
> > >                         err = i915_request_await_dma_fence(eb.request,
> > >                                                            in_fence);
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > index 883bafc449024..68cfe5080325c 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > >          */
> > >         void            (*submit_request)(struct i915_request *rq);
> > >
> > > -       /*
> > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > -        * request down to the bonded pairs.
> > > -        */
> > > -       void            (*bond_execute)(struct i915_request *rq,
> > > -                                       struct dma_fence *signal);
> > > -
> > >         /*
> > >          * Call when the priority on a request has changed and it and its
> > >          * dependencies may need rescheduling. Note the request itself may
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > index de124870af44d..b6e2b59f133b7 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > >                 int prio;
> > >         } nodes[I915_NUM_ENGINES];
> > >
> > > -       /*
> > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > -        * of physical engines any particular request may be submitted to.
> > > -        * If we receive a submit-fence from a master engine, we will only
> > > -        * use one of sibling_mask physical engines.
> > > -        */
> > > -       struct ve_bond {
> > > -               const struct intel_engine_cs *master;
> > > -               intel_engine_mask_t sibling_mask;
> > > -       } *bonds;
> > > -       unsigned int num_bonds;
> > > -
> > >         /* And finally, which physical engines this virtual engine maps onto. */
> > >         unsigned int num_siblings;
> > >         struct intel_engine_cs *siblings[];
> > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > >         intel_engine_free_request_pool(&ve->base);
> > >
> > > -       kfree(ve->bonds);
> > >         kfree(ve);
> > >  }
> > >
> > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > >  }
> > >
> > > -static struct ve_bond *
> > > -virtual_find_bond(struct virtual_engine *ve,
> > > -                 const struct intel_engine_cs *master)
> > > -{
> > > -       int i;
> > > -
> > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > -               if (ve->bonds[i].master == master)
> > > -                       return &ve->bonds[i];
> > > -       }
> > > -
> > > -       return NULL;
> > > -}
> > > -
> > > -static void
> > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > -{
> > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > -       intel_engine_mask_t allowed, exec;
> > > -       struct ve_bond *bond;
> > > -
> > > -       allowed = ~to_request(signal)->engine->mask;
> > > -
> > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > -       if (bond)
> > > -               allowed &= bond->sibling_mask;
> > > -
> > > -       /* Restrict the bonded request to run on only the available engines */
> > > -       exec = READ_ONCE(rq->execution_mask);
> > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > -               ;
> > > -
> > > -       /* Prevent the master from being re-run on the bonded engines */
> > > -       to_request(signal)->execution_mask &= ~allowed;
> >
> > I sent a v2 of this patch because it turns out I deleted a bit too
> > much code.  This function in particular, has to stay, unfortunately.
> > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > the work onto a different engine than the one it's supposed to
> > run in parallel with.  This means we can't dead-code this function or
> > the bond_execute function pointer and related stuff.
>
> Uh that's disappointing, since if I understand your point correctly, the
> sibling engines should all be singletons, not load balancing virtual ones.
> So there really should not be any need to pick the right one at execution
> time.

The media driver itself seems to work fine if I delete all the code.
It's just an IGT testcase that blows up.  I'll do more digging to see
if I can better isolate why.
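As an aside, the execution-mask narrowing that virtual_bond_execute
performs (the code quoted above) can be modeled outside the kernel.
This is a hedged Python sketch of the semantics under discussion, not
i915 code; the helper name, the 8-engine mask width, and the explicit
sibling_mask argument are hypothetical stand-ins for
intel_engine_mask_t and the virtual_find_bond() lookup:

```python
# Illustrative model (not i915 code) of how virtual_bond_execute narrows
# the set of engines a bonded request may run on once its submit fence
# signals.  Engine sets are bitmasks, one bit per physical engine.

ALL_ENGINES = 0xff  # assume 8 physical engines for this model


def bond_execute(rq_mask, signaler_mask, sibling_mask=None):
    """Return (new bonded-request mask, new signaler mask).

    rq_mask       -- execution_mask of the bonded (secondary) request
    signaler_mask -- mask of the engine executing the master request
    sibling_mask  -- bond restriction for this master, or None if no
                     bond was attached (the virtual_find_bond miss case)
    """
    # Never run the bonded request on the engine running the signaler.
    allowed = ALL_ENGINES & ~signaler_mask
    # If a bond was attached for this master, restrict to its siblings.
    if sibling_mask is not None:
        allowed &= sibling_mask
    # Restrict the bonded request to the available engines ...
    new_rq = rq_mask & allowed
    # ... and prevent the master from being re-run on those engines.
    new_signaler = signaler_mask & ~allowed
    return new_rq, new_signaler


# Master pinned to engine 0, bond restricting the slave to engines 1-2:
rq, master = bond_execute(0b1111, 0b0001, 0b0110)
```

Per Daniel's point: if both the signaler and the bonded request are
already distinct singletons and no bond is attached, the narrowing
leaves both masks unchanged, which is why it looks like dead code in
the media driver's actual usage.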

--Jason

> At least my understanding is that we're only limiting the engine set
> further, so if both signaller and signalled request can only run on
> singletons (which must be distinct, or the bonded parameter validation is
> busted) there's really nothing to do here.
>
> Also this is the locking code that freaks me out about the current bonded
> execlist code ...
>
> Dazzled and confused.
> -Daniel
>
> >
> > --Jason
> >
> >
> > > -}
> > > -
> > >  struct intel_context *
> > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > >                                unsigned int count)
> > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > >
> > >         ve->base.schedule = i915_schedule;
> > >         ve->base.submit_request = virtual_submit_request;
> > > -       ve->base.bond_execute = virtual_bond_execute;
> > >
> > >         INIT_LIST_HEAD(virtual_queue(ve));
> > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > >         if (IS_ERR(dst))
> > >                 return dst;
> > >
> > > -       if (se->num_bonds) {
> > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > -
> > > -               de->bonds = kmemdup(se->bonds,
> > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > -                                   GFP_KERNEL);
> > > -               if (!de->bonds) {
> > > -                       intel_context_put(dst);
> > > -                       return ERR_PTR(-ENOMEM);
> > > -               }
> > > -
> > > -               de->num_bonds = se->num_bonds;
> > > -       }
> > > -
> > >         return dst;
> > >  }
> > >
> > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > -                                    const struct intel_engine_cs *master,
> > > -                                    const struct intel_engine_cs *sibling)
> > > -{
> > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > -       struct ve_bond *bond;
> > > -       int n;
> > > -
> > > -       /* Sanity check the sibling is part of the virtual engine */
> > > -       for (n = 0; n < ve->num_siblings; n++)
> > > -               if (sibling == ve->siblings[n])
> > > -                       break;
> > > -       if (n == ve->num_siblings)
> > > -               return -EINVAL;
> > > -
> > > -       bond = virtual_find_bond(ve, master);
> > > -       if (bond) {
> > > -               bond->sibling_mask |= sibling->mask;
> > > -               return 0;
> > > -       }
> > > -
> > > -       bond = krealloc(ve->bonds,
> > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > -                       GFP_KERNEL);
> > > -       if (!bond)
> > > -               return -ENOMEM;
> > > -
> > > -       bond[ve->num_bonds].master = master;
> > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > -
> > > -       ve->bonds = bond;
> > > -       ve->num_bonds++;
> > > -
> > > -       return 0;
> > > -}
> > > -
> > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > >                                    struct drm_printer *m,
> > >                                    void (*show_request)(struct drm_printer *m,
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > index fd61dae820e9e..80cec37a56ba9 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > >  struct intel_context *
> > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > >
> > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > -                                    const struct intel_engine_cs *master,
> > > -                                    const struct intel_engine_cs *sibling);
> > > -
> > >  bool
> > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > index 1081cd36a2bd3..f03446d587160 100644
> > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > >         return 0;
> > >  }
> > >
> > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > -                              unsigned int class,
> > > -                              struct intel_engine_cs **siblings,
> > > -                              unsigned int nsibling,
> > > -                              unsigned int flags)
> > > -#define BOND_SCHEDULE BIT(0)
> > > -{
> > > -       struct intel_engine_cs *master;
> > > -       struct i915_request *rq[16];
> > > -       enum intel_engine_id id;
> > > -       struct igt_spinner spin;
> > > -       unsigned long n;
> > > -       int err;
> > > -
> > > -       /*
> > > -        * A set of bonded requests is intended to be run concurrently
> > > -        * across a number of engines. We use one request per-engine
> > > -        * and a magic fence to schedule each of the bonded requests
> > > -        * at the same time. A consequence of our current scheduler is that
> > > -        * we only move requests to the HW ready queue when the request
> > > -        * becomes ready, that is when all of its prerequisite fences have
> > > -        * been signaled. As one of those fences is the master submit fence,
> > > -        * there is a delay on all secondary fences as the HW may be
> > > -        * currently busy. Equally, as all the requests are independent,
> > > -        * they may have other fences that delay individual request
> > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > -        * immediately submitted to HW at the same time, just that if the
> > > -        * rules are abided by, they are ready at the same time as the
> > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > -        * to ensure parallel execution of its phases as it requires.
> > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > -        * take care of parallel execution, even across preemption events on
> > > -        * different HW. (The proper answer is of course "lalalala".)
> > > -        *
> > > -        * With the submit-fence, we have identified three possible phases
> > > -        * of synchronisation depending on the master fence: queued (not
> > > -        * ready), executing, and signaled. The first two are quite simple
> > > -        * and checked below. However, the signaled master fence handling is
> > > -        * contentious. Currently we do not distinguish between a signaled
> > > -        * fence and an expired fence, as once signaled it does not convey
> > > -        * any information about the previous execution. It may even be freed
> > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > -        * do not apply the bonding constraint for an already signaled fence,
> > > -        * as our expectation is that it should not constrain the secondaries
> > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > -        * userspace requests are meant to be running in parallel). As
> > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > -        * check below as normal execution flows are checked extensively above.
> > > -        *
> > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > -        * expected behaviour for userpace?
> > > -        */
> > > -
> > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > -
> > > -       if (igt_spinner_init(&spin, gt))
> > > -               return -ENOMEM;
> > > -
> > > -       err = 0;
> > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > -       for_each_engine(master, gt, id) {
> > > -               struct i915_sw_fence fence = {};
> > > -               struct intel_context *ce;
> > > -
> > > -               if (master->class == class)
> > > -                       continue;
> > > -
> > > -               ce = intel_context_create(master);
> > > -               if (IS_ERR(ce)) {
> > > -                       err = PTR_ERR(ce);
> > > -                       goto out;
> > > -               }
> > > -
> > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > -
> > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > -               intel_context_put(ce);
> > > -               if (IS_ERR(rq[0])) {
> > > -                       err = PTR_ERR(rq[0]);
> > > -                       goto out;
> > > -               }
> > > -               i915_request_get(rq[0]);
> > > -
> > > -               if (flags & BOND_SCHEDULE) {
> > > -                       onstack_fence_init(&fence);
> > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > -                                                              &fence,
> > > -                                                              GFP_KERNEL);
> > > -               }
> > > -
> > > -               i915_request_add(rq[0]);
> > > -               if (err < 0)
> > > -                       goto out;
> > > -
> > > -               if (!(flags & BOND_SCHEDULE) &&
> > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > -                       err = -EIO;
> > > -                       goto out;
> > > -               }
> > > -
> > > -               for (n = 0; n < nsibling; n++) {
> > > -                       struct intel_context *ve;
> > > -
> > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > -                       if (IS_ERR(ve)) {
> > > -                               err = PTR_ERR(ve);
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > -                                                              master,
> > > -                                                              siblings[n]);
> > > -                       if (err) {
> > > -                               intel_context_put(ve);
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       err = intel_context_pin(ve);
> > > -                       intel_context_put(ve);
> > > -                       if (err) {
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       rq[n + 1] = i915_request_create(ve);
> > > -                       intel_context_unpin(ve);
> > > -                       if (IS_ERR(rq[n + 1])) {
> > > -                               err = PTR_ERR(rq[n + 1]);
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -                       i915_request_get(rq[n + 1]);
> > > -
> > > -                       err = i915_request_await_execution(rq[n + 1],
> > > -                                                          &rq[0]->fence,
> > > -                                                          ve->engine->bond_execute);
> > > -                       i915_request_add(rq[n + 1]);
> > > -                       if (err < 0) {
> > > -                               onstack_fence_fini(&fence);
> > > -                               goto out;
> > > -                       }
> > > -               }
> > > -               onstack_fence_fini(&fence);
> > > -               intel_engine_flush_submission(master);
> > > -               igt_spinner_end(&spin);
> > > -
> > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > -                              rq[0]->engine->name);
> > > -                       err = -EIO;
> > > -                       goto out;
> > > -               }
> > > -
> > > -               for (n = 0; n < nsibling; n++) {
> > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > -                               err = -EIO;
> > > -                               goto out;
> > > -                       }
> > > -
> > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > -                                      siblings[n]->name,
> > > -                                      rq[n + 1]->engine->name,
> > > -                                      rq[0]->engine->name);
> > > -                               err = -EINVAL;
> > > -                               goto out;
> > > -                       }
> > > -               }
> > > -
> > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > -                       i915_request_put(rq[n]);
> > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > -       }
> > > -
> > > -out:
> > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > -               i915_request_put(rq[n]);
> > > -       if (igt_flush_test(gt->i915))
> > > -               err = -EIO;
> > > -
> > > -       igt_spinner_fini(&spin);
> > > -       return err;
> > > -}
> > > -
> > > -static int live_virtual_bond(void *arg)
> > > -{
> > > -       static const struct phase {
> > > -               const char *name;
> > > -               unsigned int flags;
> > > -       } phases[] = {
> > > -               { "", 0 },
> > > -               { "schedule", BOND_SCHEDULE },
> > > -               { },
> > > -       };
> > > -       struct intel_gt *gt = arg;
> > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > -       unsigned int class;
> > > -       int err;
> > > -
> > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > -               return 0;
> > > -
> > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > -               const struct phase *p;
> > > -               int nsibling;
> > > -
> > > -               nsibling = select_siblings(gt, class, siblings);
> > > -               if (nsibling < 2)
> > > -                       continue;
> > > -
> > > -               for (p = phases; p->name; p++) {
> > > -                       err = bond_virtual_engine(gt,
> > > -                                                 class, siblings, nsibling,
> > > -                                                 p->flags);
> > > -                       if (err) {
> > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > -                                      __func__, p->name, class, nsibling, err);
> > > -                               return err;
> > > -                       }
> > > -               }
> > > -       }
> > > -
> > > -       return 0;
> > > -}
> > > -
> > >  static int reset_virtual_engine(struct intel_gt *gt,
> > >                                 struct intel_engine_cs **siblings,
> > >                                 unsigned int nsibling)
> > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > >                 SUBTEST(live_virtual_mask),
> > >                 SUBTEST(live_virtual_preserved),
> > >                 SUBTEST(live_virtual_slice),
> > > -               SUBTEST(live_virtual_bond),
> > >                 SUBTEST(live_virtual_reset),
> > >         };
> > >
> > > --
> > > 2.31.1
> > >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 17:18         ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 17:18           ` Matthew Brost
  -1 siblings, 0 replies; 226+ messages in thread
From: Matthew Brost @ 2021-04-28 17:18 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Mailing list - DRI developers

On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > >
> > > > This adds a bunch of complexity which the media driver has never
> > > > actually used.  The media driver does technically bond a balanced engine
> > > > to another engine but the balanced engine only has one engine in the
> > > > sibling set.  This doesn't actually result in a virtual engine.
> > > >
> > > > Unless some userspace badly wants it, there's no good reason to support
> > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > leave the validation code in place in case we ever decide we want to do
> > > > something interesting with the bonding information.
> > > >
> > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > ---
> > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > >         }
> > > >         virtual = set->engines->engines[idx]->engine;
> > > >
> > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > +               drm_dbg(&i915->drm,
> > > > +                       "Bonding with virtual engines not allowed\n");
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > >         err = check_user_mbz(&ext->flags);
> > > >         if (err)
> > > >                 return err;
> > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > >                                 n, ci.engine_class, ci.engine_instance);
> > > >                         return -EINVAL;
> > > >                 }
> > > > -
> > > > -               /*
> > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > -                * a submit fence will always be directed to the one engine.
> > > > -                */
> > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > -                                                              master,
> > > > -                                                              bond);
> > > > -                       if (err)
> > > > -                               return err;
> > > > -               }
> > > >         }
> > > >
> > > >         return 0;
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > >                         err = i915_request_await_execution(eb.request,
> > > >                                                            in_fence,
> > > > -                                                          eb.engine->bond_execute);
> > > > +                                                          NULL);
> > > >                 else
> > > >                         err = i915_request_await_dma_fence(eb.request,
> > > >                                                            in_fence);
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > index 883bafc449024..68cfe5080325c 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > >          */
> > > >         void            (*submit_request)(struct i915_request *rq);
> > > >
> > > > -       /*
> > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > -        * request down to the bonded pairs.
> > > > -        */
> > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > -                                       struct dma_fence *signal);
> > > > -
> > > >         /*
> > > >          * Call when the priority on a request has changed and it and its
> > > >          * dependencies may need rescheduling. Note the request itself may
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > index de124870af44d..b6e2b59f133b7 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > >                 int prio;
> > > >         } nodes[I915_NUM_ENGINES];
> > > >
> > > > -       /*
> > > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > > -        * of physical engines any particular request may be submitted to.
> > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > -        * use one of sibling_mask physical engines.
> > > > -        */
> > > > -       struct ve_bond {
> > > > -               const struct intel_engine_cs *master;
> > > > -               intel_engine_mask_t sibling_mask;
> > > > -       } *bonds;
> > > > -       unsigned int num_bonds;
> > > > -
> > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > >         unsigned int num_siblings;
> > > >         struct intel_engine_cs *siblings[];
> > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > >         intel_engine_free_request_pool(&ve->base);
> > > >
> > > > -       kfree(ve->bonds);
> > > >         kfree(ve);
> > > >  }
> > > >
> > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > >  }
> > > >
> > > > -static struct ve_bond *
> > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > -                 const struct intel_engine_cs *master)
> > > > -{
> > > > -       int i;
> > > > -
> > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > -               if (ve->bonds[i].master == master)
> > > > -                       return &ve->bonds[i];
> > > > -       }
> > > > -
> > > > -       return NULL;
> > > > -}
> > > > -
> > > > -static void
> > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > -{
> > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > -       intel_engine_mask_t allowed, exec;
> > > > -       struct ve_bond *bond;
> > > > -
> > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > -
> > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > -       if (bond)
> > > > -               allowed &= bond->sibling_mask;
> > > > -
> > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > -               ;
> > > > -
> > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > -       to_request(signal)->execution_mask &= ~allowed;
> > >
> > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > much code.  This function in particular has to stay, unfortunately.
> > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > the work onto a different engine than the one it's supposed to
> > > run in parallel with.  This means we can't dead-code this function or
> > > the bond_execute function pointer and related stuff.
> >
> > Uh that's disappointing, since if I understand your point correctly, the
> > sibling engines should all be singletons, not load balancing virtual ones.
> > So there really should not be any need to pick the right one at execution
> > time.
> 
> The media driver itself seems to work fine if I delete all the code.
> It's just an IGT testcase that blows up.  I'll do more digging to see
> if I can better isolate why.
> 

Jumping on here mid-thread. For what it is worth, to make execlists work
with the upcoming parallel submission extension I leveraged some of the
existing bonding code, so I wouldn't be too eager to delete this code
until that lands.

Matt

> --Jason
> 
> > At least my understanding is that we're only limiting the engine set
> > further, so if both signaller and signalled request can only run on
> > singletons (which must be distinct, or the bonded parameter validation is
> > busted) there's really nothing to do here.
> >
> > Also this is the locking code that freaks me out about the current bonded
> > execlist code ...
> >
> > Dazzled and confused.
> > -Daniel
> >
> > >
> > > --Jason
> > >
> > >
> > > > -}
> > > > -
> > > >  struct intel_context *
> > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > >                                unsigned int count)
> > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > >
> > > >         ve->base.schedule = i915_schedule;
> > > >         ve->base.submit_request = virtual_submit_request;
> > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > >
> > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > >         if (IS_ERR(dst))
> > > >                 return dst;
> > > >
> > > > -       if (se->num_bonds) {
> > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > -
> > > > -               de->bonds = kmemdup(se->bonds,
> > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > -                                   GFP_KERNEL);
> > > > -               if (!de->bonds) {
> > > > -                       intel_context_put(dst);
> > > > -                       return ERR_PTR(-ENOMEM);
> > > > -               }
> > > > -
> > > > -               de->num_bonds = se->num_bonds;
> > > > -       }
> > > > -
> > > >         return dst;
> > > >  }
> > > >
> > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > -                                    const struct intel_engine_cs *master,
> > > > -                                    const struct intel_engine_cs *sibling)
> > > > -{
> > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > -       struct ve_bond *bond;
> > > > -       int n;
> > > > -
> > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > -               if (sibling == ve->siblings[n])
> > > > -                       break;
> > > > -       if (n == ve->num_siblings)
> > > > -               return -EINVAL;
> > > > -
> > > > -       bond = virtual_find_bond(ve, master);
> > > > -       if (bond) {
> > > > -               bond->sibling_mask |= sibling->mask;
> > > > -               return 0;
> > > > -       }
> > > > -
> > > > -       bond = krealloc(ve->bonds,
> > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > -                       GFP_KERNEL);
> > > > -       if (!bond)
> > > > -               return -ENOMEM;
> > > > -
> > > > -       bond[ve->num_bonds].master = master;
> > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > -
> > > > -       ve->bonds = bond;
> > > > -       ve->num_bonds++;
> > > > -
> > > > -       return 0;
> > > > -}
> > > > -
> > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > >                                    struct drm_printer *m,
> > > >                                    void (*show_request)(struct drm_printer *m,
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > >  struct intel_context *
> > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > >
> > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > -                                    const struct intel_engine_cs *master,
> > > > -                                    const struct intel_engine_cs *sibling);
> > > > -
> > > >  bool
> > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > >         return 0;
> > > >  }
> > > >
> > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > -                              unsigned int class,
> > > > -                              struct intel_engine_cs **siblings,
> > > > -                              unsigned int nsibling,
> > > > -                              unsigned int flags)
> > > > -#define BOND_SCHEDULE BIT(0)
> > > > -{
> > > > -       struct intel_engine_cs *master;
> > > > -       struct i915_request *rq[16];
> > > > -       enum intel_engine_id id;
> > > > -       struct igt_spinner spin;
> > > > -       unsigned long n;
> > > > -       int err;
> > > > -
> > > > -       /*
> > > > -        * A set of bonded requests is intended to be run concurrently
> > > > -        * across a number of engines. We use one request per-engine
> > > > -        * and a magic fence to schedule each of the bonded requests
> > > > -        * at the same time. A consequence of our current scheduler is that
> > > > -        * we only move requests to the HW ready queue when the request
> > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > -        * there is a delay on all secondary fences as the HW may be
> > > > -        * currently busy. Equally, as all the requests are independent,
> > > > -        * they may have other fences that delay individual request
> > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > -        * immediately submitted to HW at the same time, just that if the
> > > > -        * rules are abided by, they are ready at the same time as the
> > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > -        * to ensure parallel execution of its phases as it requires.
> > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > -        * take care of parallel execution, even across preemption events on
> > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > -        *
> > > > -        * With the submit-fence, we have identified three possible phases
> > > > -        * of synchronisation depending on the master fence: queued (not
> > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > -        * and checked below. However, the signaled master fence handling is
> > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > -        * any information about the previous execution. It may even be freed
> > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > -        * as our expectation is that it should not constrain the secondaries
> > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > -        * userspace requests are meant to be running in parallel). As
> > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > -        * check below as normal execution flows are checked extensively above.
> > > > -        *
> > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > -        * expected behaviour for userpace?
> > > > -        */
> > > > -
> > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > -
> > > > -       if (igt_spinner_init(&spin, gt))
> > > > -               return -ENOMEM;
> > > > -
> > > > -       err = 0;
> > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > -       for_each_engine(master, gt, id) {
> > > > -               struct i915_sw_fence fence = {};
> > > > -               struct intel_context *ce;
> > > > -
> > > > -               if (master->class == class)
> > > > -                       continue;
> > > > -
> > > > -               ce = intel_context_create(master);
> > > > -               if (IS_ERR(ce)) {
> > > > -                       err = PTR_ERR(ce);
> > > > -                       goto out;
> > > > -               }
> > > > -
> > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > -
> > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > -               intel_context_put(ce);
> > > > -               if (IS_ERR(rq[0])) {
> > > > -                       err = PTR_ERR(rq[0]);
> > > > -                       goto out;
> > > > -               }
> > > > -               i915_request_get(rq[0]);
> > > > -
> > > > -               if (flags & BOND_SCHEDULE) {
> > > > -                       onstack_fence_init(&fence);
> > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > -                                                              &fence,
> > > > -                                                              GFP_KERNEL);
> > > > -               }
> > > > -
> > > > -               i915_request_add(rq[0]);
> > > > -               if (err < 0)
> > > > -                       goto out;
> > > > -
> > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > -                       err = -EIO;
> > > > -                       goto out;
> > > > -               }
> > > > -
> > > > -               for (n = 0; n < nsibling; n++) {
> > > > -                       struct intel_context *ve;
> > > > -
> > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > -                       if (IS_ERR(ve)) {
> > > > -                               err = PTR_ERR(ve);
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > -                                                              master,
> > > > -                                                              siblings[n]);
> > > > -                       if (err) {
> > > > -                               intel_context_put(ve);
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       err = intel_context_pin(ve);
> > > > -                       intel_context_put(ve);
> > > > -                       if (err) {
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > -                       intel_context_unpin(ve);
> > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -                       i915_request_get(rq[n + 1]);
> > > > -
> > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > -                                                          &rq[0]->fence,
> > > > -                                                          ve->engine->bond_execute);
> > > > -                       i915_request_add(rq[n + 1]);
> > > > -                       if (err < 0) {
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -               }
> > > > -               onstack_fence_fini(&fence);
> > > > -               intel_engine_flush_submission(master);
> > > > -               igt_spinner_end(&spin);
> > > > -
> > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > -                              rq[0]->engine->name);
> > > > -                       err = -EIO;
> > > > -                       goto out;
> > > > -               }
> > > > -
> > > > -               for (n = 0; n < nsibling; n++) {
> > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > -                               err = -EIO;
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > -                                      siblings[n]->name,
> > > > -                                      rq[n + 1]->engine->name,
> > > > -                                      rq[0]->engine->name);
> > > > -                               err = -EINVAL;
> > > > -                               goto out;
> > > > -                       }
> > > > -               }
> > > > -
> > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > -                       i915_request_put(rq[n]);
> > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > -       }
> > > > -
> > > > -out:
> > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > -               i915_request_put(rq[n]);
> > > > -       if (igt_flush_test(gt->i915))
> > > > -               err = -EIO;
> > > > -
> > > > -       igt_spinner_fini(&spin);
> > > > -       return err;
> > > > -}
> > > > -
> > > > -static int live_virtual_bond(void *arg)
> > > > -{
> > > > -       static const struct phase {
> > > > -               const char *name;
> > > > -               unsigned int flags;
> > > > -       } phases[] = {
> > > > -               { "", 0 },
> > > > -               { "schedule", BOND_SCHEDULE },
> > > > -               { },
> > > > -       };
> > > > -       struct intel_gt *gt = arg;
> > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > -       unsigned int class;
> > > > -       int err;
> > > > -
> > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > -               return 0;
> > > > -
> > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > -               const struct phase *p;
> > > > -               int nsibling;
> > > > -
> > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > -               if (nsibling < 2)
> > > > -                       continue;
> > > > -
> > > > -               for (p = phases; p->name; p++) {
> > > > -                       err = bond_virtual_engine(gt,
> > > > -                                                 class, siblings, nsibling,
> > > > -                                                 p->flags);
> > > > -                       if (err) {
> > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > -                                      __func__, p->name, class, nsibling, err);
> > > > -                               return err;
> > > > -                       }
> > > > -               }
> > > > -       }
> > > > -
> > > > -       return 0;
> > > > -}
> > > > -
> > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > >                                 struct intel_engine_cs **siblings,
> > > >                                 unsigned int nsibling)
> > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > >                 SUBTEST(live_virtual_mask),
> > > >                 SUBTEST(live_virtual_preserved),
> > > >                 SUBTEST(live_virtual_slice),
> > > > -               SUBTEST(live_virtual_bond),
> > > >                 SUBTEST(live_virtual_reset),
> > > >         };
> > > >
> > > > --
> > > > 2.31.1
> > > >
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-28 17:18           ` Matthew Brost
  0 siblings, 0 replies; 226+ messages in thread
From: Matthew Brost @ 2021-04-28 17:18 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > >
> > > > This adds a bunch of complexity which the media driver has never
> > > > actually used.  The media driver does technically bond a balanced engine
> > > > to another engine but the balanced engine only has one engine in the
> > > > sibling set.  This doesn't actually result in a virtual engine.
> > > >
> > > > Unless some userspace badly wants it, there's no good reason to support
> > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > leave the validation code in place in case we ever decide we want to do
> > > > something interesting with the bonding information.
> > > >
> > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > ---
> > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > >         }
> > > >         virtual = set->engines->engines[idx]->engine;
> > > >
> > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > +               drm_dbg(&i915->drm,
> > > > +                       "Bonding with virtual engines not allowed\n");
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > >         err = check_user_mbz(&ext->flags);
> > > >         if (err)
> > > >                 return err;
> > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > >                                 n, ci.engine_class, ci.engine_instance);
> > > >                         return -EINVAL;
> > > >                 }
> > > > -
> > > > -               /*
> > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > -                * a submit fence will always be directed to the one engine.
> > > > -                */
> > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > -                                                              master,
> > > > -                                                              bond);
> > > > -                       if (err)
> > > > -                               return err;
> > > > -               }
> > > >         }
> > > >
> > > >         return 0;
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > >                         err = i915_request_await_execution(eb.request,
> > > >                                                            in_fence,
> > > > -                                                          eb.engine->bond_execute);
> > > > +                                                          NULL);
> > > >                 else
> > > >                         err = i915_request_await_dma_fence(eb.request,
> > > >                                                            in_fence);
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > index 883bafc449024..68cfe5080325c 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > >          */
> > > >         void            (*submit_request)(struct i915_request *rq);
> > > >
> > > > -       /*
> > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > -        * request down to the bonded pairs.
> > > > -        */
> > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > -                                       struct dma_fence *signal);
> > > > -
> > > >         /*
> > > >          * Call when the priority on a request has changed and it and its
> > > >          * dependencies may need rescheduling. Note the request itself may
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > index de124870af44d..b6e2b59f133b7 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > >                 int prio;
> > > >         } nodes[I915_NUM_ENGINES];
> > > >
> > > > -       /*
> > > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > > -        * of physical engines any particular request may be submitted to.
> > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > -        * use one of sibling_mask physical engines.
> > > > -        */
> > > > -       struct ve_bond {
> > > > -               const struct intel_engine_cs *master;
> > > > -               intel_engine_mask_t sibling_mask;
> > > > -       } *bonds;
> > > > -       unsigned int num_bonds;
> > > > -
> > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > >         unsigned int num_siblings;
> > > >         struct intel_engine_cs *siblings[];
> > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > >         intel_engine_free_request_pool(&ve->base);
> > > >
> > > > -       kfree(ve->bonds);
> > > >         kfree(ve);
> > > >  }
> > > >
> > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > >  }
> > > >
> > > > -static struct ve_bond *
> > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > -                 const struct intel_engine_cs *master)
> > > > -{
> > > > -       int i;
> > > > -
> > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > -               if (ve->bonds[i].master == master)
> > > > -                       return &ve->bonds[i];
> > > > -       }
> > > > -
> > > > -       return NULL;
> > > > -}
> > > > -
> > > > -static void
> > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > -{
> > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > -       intel_engine_mask_t allowed, exec;
> > > > -       struct ve_bond *bond;
> > > > -
> > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > -
> > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > -       if (bond)
> > > > -               allowed &= bond->sibling_mask;
> > > > -
> > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > -               ;
> > > > -
> > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > -       to_request(signal)->execution_mask &= ~allowed;
> > >
> > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > much code.  This function in particular has to stay, unfortunately.
> > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > the work onto a different engine than the one it's supposed to run
> > > in parallel with.  This means we can't dead-code this function or
> > > the bond_execute function pointer and related stuff.
> >
> > Uh that's disappointing, since if I understand your point correctly, the
> > sibling engines should all be singletons, not load balancing virtual ones.
> > So there really should not be any need to pick the right one at execution
> > time.
> 
> The media driver itself seems to work fine if I delete all the code.
> It's just an IGT testcase that blows up.  I'll do more digging to see
> if I can better isolate why.
> 

Jumping in here mid-thread. For what it's worth, to make execlists work
with the upcoming parallel submission extension I leveraged some of the
existing bonding code, so I wouldn't be too eager to delete this code
until that lands.

Matt

> --Jason
> 
> > At least my understanding is that we're only limiting the engine set
> > further, so if both signaller and signalled request can only run on
> > singletons (which must be distinct, or the bonded parameter validation is
> > busted) there's really nothing to do here.
> >
> > Also this is the locking code that freaks me out about the current bonded
> > execlist code ...
> >
> > Dazzled and confused.
> > -Daniel
> >
> > >
> > > --Jason
> > >
> > >
> > > > -}
> > > > -
> > > >  struct intel_context *
> > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > >                                unsigned int count)
> > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > >
> > > >         ve->base.schedule = i915_schedule;
> > > >         ve->base.submit_request = virtual_submit_request;
> > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > >
> > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > >         if (IS_ERR(dst))
> > > >                 return dst;
> > > >
> > > > -       if (se->num_bonds) {
> > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > -
> > > > -               de->bonds = kmemdup(se->bonds,
> > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > -                                   GFP_KERNEL);
> > > > -               if (!de->bonds) {
> > > > -                       intel_context_put(dst);
> > > > -                       return ERR_PTR(-ENOMEM);
> > > > -               }
> > > > -
> > > > -               de->num_bonds = se->num_bonds;
> > > > -       }
> > > > -
> > > >         return dst;
> > > >  }
> > > >
> > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > -                                    const struct intel_engine_cs *master,
> > > > -                                    const struct intel_engine_cs *sibling)
> > > > -{
> > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > -       struct ve_bond *bond;
> > > > -       int n;
> > > > -
> > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > -               if (sibling == ve->siblings[n])
> > > > -                       break;
> > > > -       if (n == ve->num_siblings)
> > > > -               return -EINVAL;
> > > > -
> > > > -       bond = virtual_find_bond(ve, master);
> > > > -       if (bond) {
> > > > -               bond->sibling_mask |= sibling->mask;
> > > > -               return 0;
> > > > -       }
> > > > -
> > > > -       bond = krealloc(ve->bonds,
> > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > -                       GFP_KERNEL);
> > > > -       if (!bond)
> > > > -               return -ENOMEM;
> > > > -
> > > > -       bond[ve->num_bonds].master = master;
> > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > -
> > > > -       ve->bonds = bond;
> > > > -       ve->num_bonds++;
> > > > -
> > > > -       return 0;
> > > > -}
> > > > -
> > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > >                                    struct drm_printer *m,
> > > >                                    void (*show_request)(struct drm_printer *m,
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > >  struct intel_context *
> > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > >
> > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > -                                    const struct intel_engine_cs *master,
> > > > -                                    const struct intel_engine_cs *sibling);
> > > > -
> > > >  bool
> > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > >         return 0;
> > > >  }
> > > >
> > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > -                              unsigned int class,
> > > > -                              struct intel_engine_cs **siblings,
> > > > -                              unsigned int nsibling,
> > > > -                              unsigned int flags)
> > > > -#define BOND_SCHEDULE BIT(0)
> > > > -{
> > > > -       struct intel_engine_cs *master;
> > > > -       struct i915_request *rq[16];
> > > > -       enum intel_engine_id id;
> > > > -       struct igt_spinner spin;
> > > > -       unsigned long n;
> > > > -       int err;
> > > > -
> > > > -       /*
> > > > -        * A set of bonded requests is intended to be run concurrently
> > > > -        * across a number of engines. We use one request per-engine
> > > > -        * and a magic fence to schedule each of the bonded requests
> > > > -        * at the same time. A consequence of our current scheduler is that
> > > > -        * we only move requests to the HW ready queue when the request
> > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > -        * there is a delay on all secondary fences as the HW may be
> > > > -        * currently busy. Equally, as all the requests are independent,
> > > > -        * they may have other fences that delay individual request
> > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > -        * immediately submitted to HW at the same time, just that if the
> > > > -        * rules are abided by, they are ready at the same time as the
> > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > -        * to ensure parallel execution of its phases as it requires.
> > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > -        * take care of parallel execution, even across preemption events on
> > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > -        *
> > > > -        * With the submit-fence, we have identified three possible phases
> > > > -        * of synchronisation depending on the master fence: queued (not
> > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > -        * and checked below. However, the signaled master fence handling is
> > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > -        * any information about the previous execution. It may even be freed
> > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > -        * as our expectation is that it should not constrain the secondaries
> > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > -        * userspace requests are meant to be running in parallel). As
> > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > -        * check below as normal execution flows are checked extensively above.
> > > > -        *
> > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > -        * expected behaviour for userpace?
> > > > -        */
> > > > -
> > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > -
> > > > -       if (igt_spinner_init(&spin, gt))
> > > > -               return -ENOMEM;
> > > > -
> > > > -       err = 0;
> > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > -       for_each_engine(master, gt, id) {
> > > > -               struct i915_sw_fence fence = {};
> > > > -               struct intel_context *ce;
> > > > -
> > > > -               if (master->class == class)
> > > > -                       continue;
> > > > -
> > > > -               ce = intel_context_create(master);
> > > > -               if (IS_ERR(ce)) {
> > > > -                       err = PTR_ERR(ce);
> > > > -                       goto out;
> > > > -               }
> > > > -
> > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > -
> > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > -               intel_context_put(ce);
> > > > -               if (IS_ERR(rq[0])) {
> > > > -                       err = PTR_ERR(rq[0]);
> > > > -                       goto out;
> > > > -               }
> > > > -               i915_request_get(rq[0]);
> > > > -
> > > > -               if (flags & BOND_SCHEDULE) {
> > > > -                       onstack_fence_init(&fence);
> > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > -                                                              &fence,
> > > > -                                                              GFP_KERNEL);
> > > > -               }
> > > > -
> > > > -               i915_request_add(rq[0]);
> > > > -               if (err < 0)
> > > > -                       goto out;
> > > > -
> > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > -                       err = -EIO;
> > > > -                       goto out;
> > > > -               }
> > > > -
> > > > -               for (n = 0; n < nsibling; n++) {
> > > > -                       struct intel_context *ve;
> > > > -
> > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > -                       if (IS_ERR(ve)) {
> > > > -                               err = PTR_ERR(ve);
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > -                                                              master,
> > > > -                                                              siblings[n]);
> > > > -                       if (err) {
> > > > -                               intel_context_put(ve);
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       err = intel_context_pin(ve);
> > > > -                       intel_context_put(ve);
> > > > -                       if (err) {
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > -                       intel_context_unpin(ve);
> > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -                       i915_request_get(rq[n + 1]);
> > > > -
> > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > -                                                          &rq[0]->fence,
> > > > -                                                          ve->engine->bond_execute);
> > > > -                       i915_request_add(rq[n + 1]);
> > > > -                       if (err < 0) {
> > > > -                               onstack_fence_fini(&fence);
> > > > -                               goto out;
> > > > -                       }
> > > > -               }
> > > > -               onstack_fence_fini(&fence);
> > > > -               intel_engine_flush_submission(master);
> > > > -               igt_spinner_end(&spin);
> > > > -
> > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > -                              rq[0]->engine->name);
> > > > -                       err = -EIO;
> > > > -                       goto out;
> > > > -               }
> > > > -
> > > > -               for (n = 0; n < nsibling; n++) {
> > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > -                               err = -EIO;
> > > > -                               goto out;
> > > > -                       }
> > > > -
> > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > -                                      siblings[n]->name,
> > > > -                                      rq[n + 1]->engine->name,
> > > > -                                      rq[0]->engine->name);
> > > > -                               err = -EINVAL;
> > > > -                               goto out;
> > > > -                       }
> > > > -               }
> > > > -
> > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > -                       i915_request_put(rq[n]);
> > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > -       }
> > > > -
> > > > -out:
> > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > -               i915_request_put(rq[n]);
> > > > -       if (igt_flush_test(gt->i915))
> > > > -               err = -EIO;
> > > > -
> > > > -       igt_spinner_fini(&spin);
> > > > -       return err;
> > > > -}
> > > > -
> > > > -static int live_virtual_bond(void *arg)
> > > > -{
> > > > -       static const struct phase {
> > > > -               const char *name;
> > > > -               unsigned int flags;
> > > > -       } phases[] = {
> > > > -               { "", 0 },
> > > > -               { "schedule", BOND_SCHEDULE },
> > > > -               { },
> > > > -       };
> > > > -       struct intel_gt *gt = arg;
> > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > -       unsigned int class;
> > > > -       int err;
> > > > -
> > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > -               return 0;
> > > > -
> > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > -               const struct phase *p;
> > > > -               int nsibling;
> > > > -
> > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > -               if (nsibling < 2)
> > > > -                       continue;
> > > > -
> > > > -               for (p = phases; p->name; p++) {
> > > > -                       err = bond_virtual_engine(gt,
> > > > -                                                 class, siblings, nsibling,
> > > > -                                                 p->flags);
> > > > -                       if (err) {
> > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > -                                      __func__, p->name, class, nsibling, err);
> > > > -                               return err;
> > > > -                       }
> > > > -               }
> > > > -       }
> > > > -
> > > > -       return 0;
> > > > -}
> > > > -
> > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > >                                 struct intel_engine_cs **siblings,
> > > >                                 unsigned int nsibling)
> > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > >                 SUBTEST(live_virtual_mask),
> > > >                 SUBTEST(live_virtual_preserved),
> > > >                 SUBTEST(live_virtual_slice),
> > > > -               SUBTEST(live_virtual_bond),
> > > >                 SUBTEST(live_virtual_reset),
> > > >         };
> > > >
> > > > --
> > > > 2.31.1
> > > >
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-28 15:55     ` Tvrtko Ursulin
@ 2021-04-28 17:24       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:24 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > Instead of handling it like a context param, unconditionally set it when
> > intel_contexts are created.  This doesn't fix anything but does simplify
> > the code a bit.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
> >   .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
> >   drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
> >   3 files changed, 6 insertions(+), 44 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 35bcdeddfbf3f..1091cc04a242a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
> >           intel_engine_has_timeslices(ce->engine))
> >               __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> >
> > -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> > +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> > +         ctx->i915->params.request_timeout_ms) {
> > +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> > +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
>
> Blank line between declarations and code please, or just lose the local.
>
> Otherwise looks okay. Slight change that same GEM context can now have a
> mix of different request expirations isn't interesting I think. At least
> the change goes away by the end of the series.

In order for that to happen, I think you'd have to have a race between
CREATE_CONTEXT and someone smashing the request_timeout_ms param via
sysfs.  Or am I missing something?  Given that timeouts are really
per-engine anyway, I don't think we need to care too much about that.

--Jason

> Regards,
>
> Tvrtko
>
> > +     }
> >   }
> >
> >   static void __free_engines(struct i915_gem_engines *e, unsigned int count)
> > @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
> >       context_apply_all(ctx, __apply_timeline, timeline);
> >   }
> >
> > -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
> > -{
> > -     return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
> > -}
> > -
> > -static int
> > -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
> > -{
> > -     int ret;
> > -
> > -     ret = context_apply_all(ctx, __apply_watchdog,
> > -                             (void *)(uintptr_t)timeout_us);
> > -     if (!ret)
> > -             ctx->watchdog.timeout_us = timeout_us;
> > -
> > -     return ret;
> > -}
> > -
> > -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
> > -{
> > -     struct drm_i915_private *i915 = ctx->i915;
> > -     int ret;
> > -
> > -     if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
> > -         !i915->params.request_timeout_ms)
> > -             return;
> > -
> > -     /* Default expiry for user fences. */
> > -     ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
> > -     if (ret)
> > -             drm_notice(&i915->drm,
> > -                        "Failed to configure default fence expiry! (%d)",
> > -                        ret);
> > -}
> > -
> >   static struct i915_gem_context *
> >   i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> >   {
> > @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> >               intel_timeline_put(timeline);
> >       }
> >
> > -     __set_default_fence_expiry(ctx);
> > -
> >       trace_i915_context_create(ctx);
> >
> >       return ctx;
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index 5ae71ec936f7c..676592e27e7d2 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -153,10 +153,6 @@ struct i915_gem_context {
> >        */
> >       atomic_t active_count;
> >
> > -     struct {
> > -             u64 timeout_us;
> > -     } watchdog;
> > -
> >       /**
> >        * @hang_timestamp: The last time(s) this context caused a GPU hang
> >        */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> > index dffedd983693d..0c69cb42d075c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> > @@ -10,11 +10,10 @@
> >
> >   #include "intel_context.h"
> >
> > -static inline int
> > +static inline void
> >   intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
> >   {
> >       ce->watchdog.timeout_us = timeout_us;
> > -     return 0;
> >   }
> >
> >   #endif /* INTEL_CONTEXT_PARAM_H */
> >
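
An aside on the watchdog hunk quoted above: the (u64) cast before the multiply matters because request_timeout_ms is an unsigned int, so timeout_ms * 1000 would otherwise wrap in 32-bit arithmetic. A small illustrative sketch (not the i915 code):

```c
#include <assert.h>
#include <stdint.h>

/* Widen before multiplying, as the patch does: the product cannot wrap. */
static uint64_t watchdog_us(unsigned int timeout_ms)
{
        return (uint64_t)timeout_ms * 1000;
}

/* What a plain 32-bit multiply would do: any timeout above roughly
 * 71.6 minutes (UINT32_MAX / 1000 ms) wraps modulo 2^32. */
static uint64_t watchdog_us_wrapping(unsigned int timeout_ms)
{
        return (uint32_t)(timeout_ms * 1000);
}
```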

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  2021-04-28 15:49     ` Tvrtko Ursulin
@ 2021-04-28 17:26       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:26 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 10:49 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > This API is entirely unnecessary and I'd love to get rid of it.  If
> > userspace wants a single timeline across multiple contexts, they can
> > either use implicit synchronization or a syncobj, both of which existed
> > at the time this feature landed.  The justification given at the time
> > was that it would help GL drivers which are inherently single-timeline.
> > However, neither of our GL drivers actually wanted the feature.  i965
> > was already in maintenance mode at the time and iris uses syncobj for
> > everything.
> >
> > Unfortunately, as much as I'd love to get rid of it, it is used by the
> > media driver so we can't do that.  We can, however, do the next-best
> > thing which is to embed a syncobj in the context and do exactly what
> > we'd expect from userspace internally.  This isn't an entirely identical
> > implementation because it's no longer atomic if userspace races with
> > itself by calling execbuffer2 twice simultaneously from different
> > threads.  It won't crash in that case; it just doesn't guarantee any
> > ordering between those two submits.
>
> 1)
>
> Please also mention the difference in context/timeline name when
> observed via the sync file API.
>
> 2)
>
> I don't remember what we have concluded in terms of observable effects
> in sync_file_merge?

I don't see how either of these are observable since this syncobj is
never exposed to userspace in any way.  Please help me understand what
I'm missing here.

--Jason


> Regards,
>
> Tvrtko
>
> > Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
> > advantages beyond mere annoyance.  One is that intel_timeline is no
> > longer an api-visible object and can remain entirely an implementation
> > detail.  This may be advantageous as we make scheduler changes going
> > forward.  Second is that, together with deleting the CLONE_CONTEXT API,
> > we should now have a 1:1 mapping between intel_context and
> > intel_timeline which may help us reduce locking.
> >
> > v2 (Jason Ekstrand):
> >   - Update the comment on i915_gem_context::syncobj to mention that it's
> >     an emulation and the possible race if userspace calls execbuffer2
> >     twice on the same context concurrently.
> >   - Wrap the checks for eb.gem_context->syncobj in unlikely()
> >   - Drop the dma_fence reference
> >   - Improved commit message
> >
> > v3 (Jason Ekstrand):
> >   - Move the dma_fence_put() to before the error exit
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +++++--------------
> >   .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +++++-
> >   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 16 ++++++
> >   3 files changed, 40 insertions(+), 39 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 2c2fefa912805..a72c9b256723b 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -67,6 +67,8 @@
> >   #include <linux/log2.h>
> >   #include <linux/nospec.h>
> >
> > +#include <drm/drm_syncobj.h>
> > +
> >   #include "gt/gen6_ppgtt.h"
> >   #include "gt/intel_context.h"
> >   #include "gt/intel_context_param.h"
> > @@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context *ce,
> >               ce->vm = vm;
> >       }
> >
> > -     GEM_BUG_ON(ce->timeline);
> > -     if (ctx->timeline)
> > -             ce->timeline = intel_timeline_get(ctx->timeline);
> > -
> >       if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> >           intel_engine_has_timeslices(ce->engine))
> >               __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> > @@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
> >       mutex_destroy(&ctx->engines_mutex);
> >       mutex_destroy(&ctx->lut_mutex);
> >
> > -     if (ctx->timeline)
> > -             intel_timeline_put(ctx->timeline);
> > -
> >       put_pid(ctx->pid);
> >       mutex_destroy(&ctx->mutex);
> >
> > @@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
> >       if (vm)
> >               i915_vm_close(vm);
> >
> > +     if (ctx->syncobj)
> > +             drm_syncobj_put(ctx->syncobj);
> > +
> >       ctx->file_priv = ERR_PTR(-EBADF);
> >
> >       /*
> > @@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
> >               i915_vm_close(vm);
> >   }
> >
> > -static void __set_timeline(struct intel_timeline **dst,
> > -                        struct intel_timeline *src)
> > -{
> > -     struct intel_timeline *old = *dst;
> > -
> > -     *dst = src ? intel_timeline_get(src) : NULL;
> > -
> > -     if (old)
> > -             intel_timeline_put(old);
> > -}
> > -
> > -static void __apply_timeline(struct intel_context *ce, void *timeline)
> > -{
> > -     __set_timeline(&ce->timeline, timeline);
> > -}
> > -
> > -static void __assign_timeline(struct i915_gem_context *ctx,
> > -                           struct intel_timeline *timeline)
> > -{
> > -     __set_timeline(&ctx->timeline, timeline);
> > -     context_apply_all(ctx, __apply_timeline, timeline);
> > -}
> > -
> >   static struct i915_gem_context *
> >   i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> >   {
> >       struct i915_gem_context *ctx;
> > +     int ret;
> >
> >       if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> >           !HAS_EXECLISTS(i915))
> > @@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> >       }
> >
> >       if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> > -             struct intel_timeline *timeline;
> > -
> > -             timeline = intel_timeline_create(&i915->gt);
> > -             if (IS_ERR(timeline)) {
> > +             ret = drm_syncobj_create(&ctx->syncobj,
> > +                                      DRM_SYNCOBJ_CREATE_SIGNALED,
> > +                                      NULL);
> > +             if (ret) {
> >                       context_close(ctx);
> > -                     return ERR_CAST(timeline);
> > +                     return ERR_PTR(ret);
> >               }
> > -
> > -             __assign_timeline(ctx, timeline);
> > -             intel_timeline_put(timeline);
> >       }
> >
> >       trace_i915_context_create(ctx);
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index 676592e27e7d2..df76767f0c41b 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -83,7 +83,19 @@ struct i915_gem_context {
> >       struct i915_gem_engines __rcu *engines;
> >       struct mutex engines_mutex; /* guards writes to engines */
> >
> > -     struct intel_timeline *timeline;
> > +     /**
> > +      * @syncobj: Shared timeline syncobj
> > +      *
> > +      * When the SHARED_TIMELINE flag is set on context creation, we
> > +      * emulate a single timeline across all engines using this syncobj.
> > +      * For every execbuffer2 call, this syncobj is used as both an in-
> > +      * and out-fence.  Unlike the real intel_timeline, this doesn't
> > +      * provide perfect atomic in-order guarantees if the client races
> > +      * with itself by calling execbuffer2 twice concurrently.  However,
> > +      * if userspace races with itself, that's not likely to yield well-
> > +      * defined results anyway so we choose to not care.
> > +      */
> > +     struct drm_syncobj *syncobj;
> >
> >       /**
> >        * @vm: unique address space (GTT)
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index b812f313422a9..d640bba6ad9ab 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >               goto err_vma;
> >       }
> >
> > +     if (unlikely(eb.gem_context->syncobj)) {
> > +             struct dma_fence *fence;
> > +
> > +             fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
> > +             err = i915_request_await_dma_fence(eb.request, fence);
> > +             dma_fence_put(fence);
> > +             if (err)
> > +                     goto err_ext;
> > +     }
> > +
> >       if (in_fence) {
> >               if (args->flags & I915_EXEC_FENCE_SUBMIT)
> >                       err = i915_request_await_execution(eb.request,
> > @@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >                       fput(out_fence->file);
> >               }
> >       }
> > +
> > +     if (unlikely(eb.gem_context->syncobj)) {
> > +             drm_syncobj_replace_fence(eb.gem_context->syncobj,
> > +                                       &eb.request->fence);
> > +     }
> > +
> >       i915_request_put(eb.request);
> >
> >   err_vma:
> >
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel



* Re: [Intel-gfx] [PATCH 05/21] drm/i915: Drop the CONTEXT_CLONE API
  2021-04-27  9:49     ` Daniel Vetter
@ 2021-04-28 17:38       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Tue, Apr 27, 2021 at 4:49 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Apr 23, 2021 at 05:31:15PM -0500, Jason Ekstrand wrote:
> > This API allows one context to grab bits out of another context upon
> > creation.  It can be used as a short-cut for setparam(getparam()) for
> > things like I915_CONTEXT_PARAM_VM.  However, it's never been used by any
> > real userspace.  It's used by a few IGT tests and that's it.  Since it
> > doesn't add any real value (most of the stuff you can CLONE you can copy
> > in other ways), drop it.
> >
> > There is one thing that this API allows you to clone which you cannot
> > clone via getparam/setparam: timelines.  However, timelines are an
> > implementation detail of i915 and not really something that needs to be
> > exposed to userspace.  Also, sharing timelines between contexts isn't
> > obviously useful and supporting it has the potential to complicate i915
> > internally.  It also doesn't add any functionality that the client can't
> > get in other ways.  If a client really wants a shared timeline, they can
> > use a syncobj and set it as an in and out fence on every submit.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c | 199 +-------------------
> >  include/uapi/drm/i915_drm.h                 |  16 +-
> >  2 files changed, 6 insertions(+), 209 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 8a77855123cec..2c2fefa912805 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1958,207 +1958,14 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
> >       return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
> >  }
> >
> > -static int clone_engines(struct i915_gem_context *dst,
> > -                      struct i915_gem_context *src)
> > +static int invalid_ext(struct i915_user_extension __user *ext, void *data)
> >  {
> > -     struct i915_gem_engines *clone, *e;
> > -     bool user_engines;
> > -     unsigned long n;
> > -
> > -     e = __context_engines_await(src, &user_engines);
> > -     if (!e)
> > -             return -ENOENT;
> > -
> > -     clone = alloc_engines(e->num_engines);
> > -     if (!clone)
> > -             goto err_unlock;
> > -
> > -     for (n = 0; n < e->num_engines; n++) {
> > -             struct intel_engine_cs *engine;
> > -
> > -             if (!e->engines[n]) {
> > -                     clone->engines[n] = NULL;
> > -                     continue;
> > -             }
> > -             engine = e->engines[n]->engine;
> > -
> > -             /*
> > -              * Virtual engines are singletons; they can only exist
> > -              * inside a single context, because they embed their
> > -              * HW context... As each virtual context implies a single
> > -              * timeline (each engine can only dequeue a single request
> > -              * at any time), it would be surprising for two contexts
> > -              * to use the same engine. So let's create a copy of
> > -              * the virtual engine instead.
> > -              */
> > -             if (intel_engine_is_virtual(engine))
> > -                     clone->engines[n] =
> > -                             intel_execlists_clone_virtual(engine);
>
> You forgot to gc this function here ^^

Done, with pleasure!

> > -             else
> > -                     clone->engines[n] = intel_context_create(engine);
> > -             if (IS_ERR_OR_NULL(clone->engines[n])) {
> > -                     __free_engines(clone, n);
> > -                     goto err_unlock;
> > -             }
> > -
> > -             intel_context_set_gem(clone->engines[n], dst);
>
> Not peeked ahead, but I'm really hoping intel_context_set_gem gets removed
> eventually too ...

I've not gotten rid of it yet but it's on my list of things to clean
up.  The problem is that there are a pile of parameters we want to set
for user engines which we don't set for internal engines:

 - VM
 - priority
 - hangcheck timeout
 - gem_context back-pointer (I'd love to drop this one!)
 - a bunch more when we start shifting more stuff into intel_context

And there are a bunch of places where we create non-user engines.  The
end result is that we have four ugly options:

 1. Set them after the fact as per intel_context_set_gem
 2. Touch all 79 call sites of intel_context_create() for each new
create param we add
 3. Add a new struct intel_context_create_args which contains all the
extra stuff and make NULL mean "use the defaults"
 4. Add a new struct i915_gem_engine which is used for client-visible
engines.  When we switch to an engine-based uAPI, this is probably
what would be exposed to userspace.

I'm happy to hear opinions on which of those is the best option;
option 2 is clearly a bad idea.


> > -     }
> > -     clone->num_engines = n;
> > -     i915_sw_fence_complete(&e->fence);
> > -
> > -     /* Serialised by constructor */
> > -     engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
> > -     if (user_engines)
> > -             i915_gem_context_set_user_engines(dst);
> > -     else
> > -             i915_gem_context_clear_user_engines(dst);
> > -     return 0;
> > -
> > -err_unlock:
> > -     i915_sw_fence_complete(&e->fence);
> > -     return -ENOMEM;
> > -}
> > -
> > -static int clone_flags(struct i915_gem_context *dst,
> > -                    struct i915_gem_context *src)
> > -{
> > -     dst->user_flags = src->user_flags;
> > -     return 0;
> > -}
> > -
> > -static int clone_schedattr(struct i915_gem_context *dst,
> > -                        struct i915_gem_context *src)
> > -{
> > -     dst->sched = src->sched;
> > -     return 0;
> > -}
> > -
> > -static int clone_sseu(struct i915_gem_context *dst,
> > -                   struct i915_gem_context *src)
> > -{
> > -     struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
> > -     struct i915_gem_engines *clone;
> > -     unsigned long n;
> > -     int err;
> > -
> > -     /* no locking required; sole access under constructor*/
> > -     clone = __context_engines_static(dst);
> > -     if (e->num_engines != clone->num_engines) {
> > -             err = -EINVAL;
> > -             goto unlock;
> > -     }
> > -
> > -     for (n = 0; n < e->num_engines; n++) {
> > -             struct intel_context *ce = e->engines[n];
> > -
> > -             if (clone->engines[n]->engine->class != ce->engine->class) {
> > -                     /* Must have compatible engine maps! */
> > -                     err = -EINVAL;
> > -                     goto unlock;
> > -             }
> > -
> > -             /* serialises with set_sseu */
> > -             err = intel_context_lock_pinned(ce);
> > -             if (err)
> > -                     goto unlock;
> > -
> > -             clone->engines[n]->sseu = ce->sseu;
> > -             intel_context_unlock_pinned(ce);
> > -     }
> > -
> > -     err = 0;
> > -unlock:
> > -     i915_gem_context_unlock_engines(src);
> > -     return err;
> > -}
> > -
> > -static int clone_timeline(struct i915_gem_context *dst,
> > -                       struct i915_gem_context *src)
> > -{
> > -     if (src->timeline)
> > -             __assign_timeline(dst, src->timeline);
> > -
> > -     return 0;
> > -}
> > -
> > -static int clone_vm(struct i915_gem_context *dst,
> > -                 struct i915_gem_context *src)
> > -{
> > -     struct i915_address_space *vm;
> > -     int err = 0;
> > -
> > -     if (!rcu_access_pointer(src->vm))
> > -             return 0;
> > -
> > -     rcu_read_lock();
> > -     vm = context_get_vm_rcu(src);
> > -     rcu_read_unlock();
> > -
> > -     if (!mutex_lock_interruptible(&dst->mutex)) {
> > -             __assign_ppgtt(dst, vm);
> > -             mutex_unlock(&dst->mutex);
> > -     } else {
> > -             err = -EINTR;
> > -     }
> > -
> > -     i915_vm_put(vm);
> > -     return err;
> > -}
> > -
> > -static int create_clone(struct i915_user_extension __user *ext, void *data)
> > -{
> > -     static int (* const fn[])(struct i915_gem_context *dst,
> > -                               struct i915_gem_context *src) = {
> > -#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
> > -             MAP(ENGINES, clone_engines),
> > -             MAP(FLAGS, clone_flags),
> > -             MAP(SCHEDATTR, clone_schedattr),
> > -             MAP(SSEU, clone_sseu),
> > -             MAP(TIMELINE, clone_timeline),
> > -             MAP(VM, clone_vm),
> > -#undef MAP
> > -     };
> > -     struct drm_i915_gem_context_create_ext_clone local;
> > -     const struct create_ext *arg = data;
> > -     struct i915_gem_context *dst = arg->ctx;
> > -     struct i915_gem_context *src;
> > -     int err, bit;
> > -
> > -     if (copy_from_user(&local, ext, sizeof(local)))
> > -             return -EFAULT;
> > -
> > -     BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
> > -                  I915_CONTEXT_CLONE_UNKNOWN);
> > -
> > -     if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
> > -             return -EINVAL;
> > -
> > -     if (local.rsvd)
> > -             return -EINVAL;
> > -
> > -     rcu_read_lock();
> > -     src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
> > -     rcu_read_unlock();
> > -     if (!src)
> > -             return -ENOENT;
> > -
> > -     GEM_BUG_ON(src == dst);
> > -
> > -     for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
> > -             if (!(local.flags & BIT(bit)))
> > -                     continue;
> > -
> > -             err = fn[bit](dst, src);
> > -             if (err)
> > -                     return err;
> > -     }
> > -
> > -     return 0;
> > +     return -EINVAL;
> >  }
> >
> >  static const i915_user_extension_fn create_extensions[] = {
> >       [I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
> > -     [I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
> > +     [I915_CONTEXT_CREATE_EXT_CLONE] = invalid_ext,
> >  };
> >
> >  static bool client_is_banned(struct drm_i915_file_private *file_priv)
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index a0aaa8298f28d..75a71b6756ed8 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -1887,20 +1887,10 @@ struct drm_i915_gem_context_create_ext_setparam {
> >       struct drm_i915_gem_context_param param;
> >  };
> >
> > -struct drm_i915_gem_context_create_ext_clone {
> > +/* This API has been removed.  On the off chance someone somewhere has
> > + * attempted to use it, never re-use this extension number.
> > + */
> >  #define I915_CONTEXT_CREATE_EXT_CLONE 1
>
> I think we need to put these somewhere else now, here it's just plain
> lost. I think in the kerneldoc for
> drm_i915_gem_context_create_ext_setparam would be best, with the #define
> right above and in the kerneldoc an enumeration of all the values and what
> they're for.

I fully agree it's not great.  But I'm not sure create_ext_setparam
makes sense either.  This is its own extension that's unrelated to
ext_setparam.

--Jason


> I think I'll need to sign up Matt B or you for doing some kerneldoc polish
> on these so they're all collected together.
> -Daniel
>
> > -     struct i915_user_extension base;
> > -     __u32 clone_id;
> > -     __u32 flags;
> > -#define I915_CONTEXT_CLONE_ENGINES   (1u << 0)
> > -#define I915_CONTEXT_CLONE_FLAGS     (1u << 1)
> > -#define I915_CONTEXT_CLONE_SCHEDATTR (1u << 2)
> > -#define I915_CONTEXT_CLONE_SSEU              (1u << 3)
> > -#define I915_CONTEXT_CLONE_TIMELINE  (1u << 4)
> > -#define I915_CONTEXT_CLONE_VM                (1u << 5)
> > -#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
> > -     __u64 rsvd;
> > -};
> >
> >  struct drm_i915_gem_context_destroy {
> >       __u32 ctx_id;
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


> > -              * to use the same engine. So let's create a copy of
> > -              * the virtual engine instead.
> > -              */
> > -             if (intel_engine_is_virtual(engine))
> > -                     clone->engines[n] =
> > -                             intel_execlists_clone_virtual(engine);
>
> You forgot to gc this function here ^^

Done, with pleasure!

> > -             else
> > -                     clone->engines[n] = intel_context_create(engine);
> > -             if (IS_ERR_OR_NULL(clone->engines[n])) {
> > -                     __free_engines(clone, n);
> > -                     goto err_unlock;
> > -             }
> > -
> > -             intel_context_set_gem(clone->engines[n], dst);
>
> Not peeked ahead, but I'm really hoping intel_context_set_gem gets removed
> eventually too ...

I've not gotten rid of it yet but it's on my list of things to clean
up.  The problem is that there are a pile of parameters we want to set
for user engines which we don't set for internal engines:

 - VM
 - priority
 - hangcheck timeout
 - gem_context back-pointer (I'd love to drop this one!)
 - a bunch more when we start shifting more stuff into intel_context

And there are a bunch of places where we create non-user engines.  The
end result is that we have four ugly options:

 1. Set them after the fact as per intel_context_set_gem
 2. Touch all 79 instances of intel_context_create() for each new
create param we add
 3. Add a new struct intel_context_create_args which contains all the
extra stuff and make NULL mean "use the defaults"
 4. Add a new struct i915_gem_engine which is used for client-visible
engines.  When we switch to an engine-based uAPI, this is probably
what would be exposed to userspace.

I'm happy to hear opinions on which of those is the best option.
Option 2 is clearly a bad idea.


> > -     }
> > -     clone->num_engines = n;
> > -     i915_sw_fence_complete(&e->fence);
> > -
> > -     /* Serialised by constructor */
> > -     engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
> > -     if (user_engines)
> > -             i915_gem_context_set_user_engines(dst);
> > -     else
> > -             i915_gem_context_clear_user_engines(dst);
> > -     return 0;
> > -
> > -err_unlock:
> > -     i915_sw_fence_complete(&e->fence);
> > -     return -ENOMEM;
> > -}
> > -
> > -static int clone_flags(struct i915_gem_context *dst,
> > -                    struct i915_gem_context *src)
> > -{
> > -     dst->user_flags = src->user_flags;
> > -     return 0;
> > -}
> > -
> > -static int clone_schedattr(struct i915_gem_context *dst,
> > -                        struct i915_gem_context *src)
> > -{
> > -     dst->sched = src->sched;
> > -     return 0;
> > -}
> > -
> > -static int clone_sseu(struct i915_gem_context *dst,
> > -                   struct i915_gem_context *src)
> > -{
> > -     struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
> > -     struct i915_gem_engines *clone;
> > -     unsigned long n;
> > -     int err;
> > -
> > -     /* no locking required; sole access under constructor*/
> > -     clone = __context_engines_static(dst);
> > -     if (e->num_engines != clone->num_engines) {
> > -             err = -EINVAL;
> > -             goto unlock;
> > -     }
> > -
> > -     for (n = 0; n < e->num_engines; n++) {
> > -             struct intel_context *ce = e->engines[n];
> > -
> > -             if (clone->engines[n]->engine->class != ce->engine->class) {
> > -                     /* Must have compatible engine maps! */
> > -                     err = -EINVAL;
> > -                     goto unlock;
> > -             }
> > -
> > -             /* serialises with set_sseu */
> > -             err = intel_context_lock_pinned(ce);
> > -             if (err)
> > -                     goto unlock;
> > -
> > -             clone->engines[n]->sseu = ce->sseu;
> > -             intel_context_unlock_pinned(ce);
> > -     }
> > -
> > -     err = 0;
> > -unlock:
> > -     i915_gem_context_unlock_engines(src);
> > -     return err;
> > -}
> > -
> > -static int clone_timeline(struct i915_gem_context *dst,
> > -                       struct i915_gem_context *src)
> > -{
> > -     if (src->timeline)
> > -             __assign_timeline(dst, src->timeline);
> > -
> > -     return 0;
> > -}
> > -
> > -static int clone_vm(struct i915_gem_context *dst,
> > -                 struct i915_gem_context *src)
> > -{
> > -     struct i915_address_space *vm;
> > -     int err = 0;
> > -
> > -     if (!rcu_access_pointer(src->vm))
> > -             return 0;
> > -
> > -     rcu_read_lock();
> > -     vm = context_get_vm_rcu(src);
> > -     rcu_read_unlock();
> > -
> > -     if (!mutex_lock_interruptible(&dst->mutex)) {
> > -             __assign_ppgtt(dst, vm);
> > -             mutex_unlock(&dst->mutex);
> > -     } else {
> > -             err = -EINTR;
> > -     }
> > -
> > -     i915_vm_put(vm);
> > -     return err;
> > -}
> > -
> > -static int create_clone(struct i915_user_extension __user *ext, void *data)
> > -{
> > -     static int (* const fn[])(struct i915_gem_context *dst,
> > -                               struct i915_gem_context *src) = {
> > -#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
> > -             MAP(ENGINES, clone_engines),
> > -             MAP(FLAGS, clone_flags),
> > -             MAP(SCHEDATTR, clone_schedattr),
> > -             MAP(SSEU, clone_sseu),
> > -             MAP(TIMELINE, clone_timeline),
> > -             MAP(VM, clone_vm),
> > -#undef MAP
> > -     };
> > -     struct drm_i915_gem_context_create_ext_clone local;
> > -     const struct create_ext *arg = data;
> > -     struct i915_gem_context *dst = arg->ctx;
> > -     struct i915_gem_context *src;
> > -     int err, bit;
> > -
> > -     if (copy_from_user(&local, ext, sizeof(local)))
> > -             return -EFAULT;
> > -
> > -     BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
> > -                  I915_CONTEXT_CLONE_UNKNOWN);
> > -
> > -     if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
> > -             return -EINVAL;
> > -
> > -     if (local.rsvd)
> > -             return -EINVAL;
> > -
> > -     rcu_read_lock();
> > -     src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
> > -     rcu_read_unlock();
> > -     if (!src)
> > -             return -ENOENT;
> > -
> > -     GEM_BUG_ON(src == dst);
> > -
> > -     for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
> > -             if (!(local.flags & BIT(bit)))
> > -                     continue;
> > -
> > -             err = fn[bit](dst, src);
> > -             if (err)
> > -                     return err;
> > -     }
> > -
> > -     return 0;
> > +     return -EINVAL;
> >  }
> >
> >  static const i915_user_extension_fn create_extensions[] = {
> >       [I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
> > -     [I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
> > +     [I915_CONTEXT_CREATE_EXT_CLONE] = invalid_ext,
> >  };
> >
> >  static bool client_is_banned(struct drm_i915_file_private *file_priv)
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index a0aaa8298f28d..75a71b6756ed8 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -1887,20 +1887,10 @@ struct drm_i915_gem_context_create_ext_setparam {
> >       struct drm_i915_gem_context_param param;
> >  };
> >
> > -struct drm_i915_gem_context_create_ext_clone {
> > +/* This API has been removed.  On the off chance someone somewhere has
> > + * attempted to use it, never re-use this extension number.
> > + */
> >  #define I915_CONTEXT_CREATE_EXT_CLONE 1
>
> I think we need to put these somewhere else now, here it's just plain
> lost. I think in the kerneldoc for
> drm_i915_gem_context_create_ext_setparam would be best, with the #define
> right above and in the kerneldoc an enumeration of all the values and what
> they're for.

I fully agree it's not great.  But I'm not sure create_ext_setparam
makes sense either.  This is its own extension that's unrelated to
ext_setparam.

--Jason


> I think I'll need to sign up Matt B or you for doing some kerneldoc polish
> on these so they're all collected together.
> -Daniel
>
> > -     struct i915_user_extension base;
> > -     __u32 clone_id;
> > -     __u32 flags;
> > -#define I915_CONTEXT_CLONE_ENGINES   (1u << 0)
> > -#define I915_CONTEXT_CLONE_FLAGS     (1u << 1)
> > -#define I915_CONTEXT_CLONE_SCHEDATTR (1u << 2)
> > -#define I915_CONTEXT_CLONE_SSEU              (1u << 3)
> > -#define I915_CONTEXT_CLONE_TIMELINE  (1u << 4)
> > -#define I915_CONTEXT_CLONE_VM                (1u << 5)
> > -#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
> > -     __u64 rsvd;
> > -};
> >
> >  struct drm_i915_gem_context_destroy {
> >       __u32 ctx_id;
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 17:18           ` Matthew Brost
@ 2021-04-28 17:46             ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:46 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > >
> > > > > This adds a bunch of complexity which the media driver has never
> > > > > actually used.  The media driver does technically bond a balanced engine
> > > > > to another engine but the balanced engine only has one engine in the
> > > > > sibling set.  This doesn't actually result in a virtual engine.
> > > > >
> > > > > Unless some userspace badly wants it, there's no good reason to support
> > > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > > leave the validation code in place in case we ever decide we want to do
> > > > > something interesting with the bonding information.
> > > > >
> > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > >         }
> > > > >         virtual = set->engines->engines[idx]->engine;
> > > > >
> > > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > > +               drm_dbg(&i915->drm,
> > > > > +                       "Bonding with virtual engines not allowed\n");
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > >         err = check_user_mbz(&ext->flags);
> > > > >         if (err)
> > > > >                 return err;
> > > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > >                                 n, ci.engine_class, ci.engine_instance);
> > > > >                         return -EINVAL;
> > > > >                 }
> > > > > -
> > > > > -               /*
> > > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > > -                * a submit fence will always be directed to the one engine.
> > > > > -                */
> > > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > > -                                                              master,
> > > > > -                                                              bond);
> > > > > -                       if (err)
> > > > > -                               return err;
> > > > > -               }
> > > > >         }
> > > > >
> > > > >         return 0;
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > > >                         err = i915_request_await_execution(eb.request,
> > > > >                                                            in_fence,
> > > > > -                                                          eb.engine->bond_execute);
> > > > > +                                                          NULL);
> > > > >                 else
> > > > >                         err = i915_request_await_dma_fence(eb.request,
> > > > >                                                            in_fence);
> > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > index 883bafc449024..68cfe5080325c 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > > >          */
> > > > >         void            (*submit_request)(struct i915_request *rq);
> > > > >
> > > > > -       /*
> > > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > > -        * request down to the bonded pairs.
> > > > > -        */
> > > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > > -                                       struct dma_fence *signal);
> > > > > -
> > > > >         /*
> > > > >          * Call when the priority on a request has changed and it and its
> > > > >          * dependencies may need rescheduling. Note the request itself may
> > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > index de124870af44d..b6e2b59f133b7 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > > >                 int prio;
> > > > >         } nodes[I915_NUM_ENGINES];
> > > > >
> > > > > -       /*
> > > > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > > > -        * of physical engines any particular request may be submitted to.
> > > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > > -        * use one of sibling_mask physical engines.
> > > > > -        */
> > > > > -       struct ve_bond {
> > > > > -               const struct intel_engine_cs *master;
> > > > > -               intel_engine_mask_t sibling_mask;
> > > > > -       } *bonds;
> > > > > -       unsigned int num_bonds;
> > > > > -
> > > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > > >         unsigned int num_siblings;
> > > > >         struct intel_engine_cs *siblings[];
> > > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > > >         intel_engine_free_request_pool(&ve->base);
> > > > >
> > > > > -       kfree(ve->bonds);
> > > > >         kfree(ve);
> > > > >  }
> > > > >
> > > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > > >  }
> > > > >
> > > > > -static struct ve_bond *
> > > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > > -                 const struct intel_engine_cs *master)
> > > > > -{
> > > > > -       int i;
> > > > > -
> > > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > > -               if (ve->bonds[i].master == master)
> > > > > -                       return &ve->bonds[i];
> > > > > -       }
> > > > > -
> > > > > -       return NULL;
> > > > > -}
> > > > > -
> > > > > -static void
> > > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > > -{
> > > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > > -       intel_engine_mask_t allowed, exec;
> > > > > -       struct ve_bond *bond;
> > > > > -
> > > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > > -
> > > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > > -       if (bond)
> > > > > -               allowed &= bond->sibling_mask;
> > > > > -
> > > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > > -               ;
> > > > > -
> > > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > > -       to_request(signal)->execution_mask &= ~allowed;
> > > >
> > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > much code.  This function in particular has to stay, unfortunately.
> > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > the work onto a different engine than the one it's supposed to
> > > > run in parallel with.  This means we can't dead-code this function or
> > > > the bond_execute function pointer and related stuff.
> > >
> > > Uh that's disappointing, since if I understand your point correctly, the
> > > sibling engines should all be singletons, not load balancing virtual ones.
> > > So there really should not be any need to pick the right one at execution
> > > time.
> >
> > The media driver itself seems to work fine if I delete all the code.
> > It's just an IGT testcase that blows up.  I'll do more digging to see
> > if I can better isolate why.
> >
>
> Jumping on here mid-thread. For what it's worth, to make execlists work
> with the upcoming parallel submission extension I leveraged some of the
> existing bonding code so I wouldn't be too eager to delete this code
> until that lands.

Mind being a bit more specific about that?  The motivation for this
patch is that the current bonding handling and uAPI is, well, very odd
and confusing IMO.  It doesn't let you create sets of bonded engines.
Instead you create engines and then bond them together after the fact.
I didn't want to blindly duplicate those oddities with the proto-ctx
stuff unless they were useful.  With parallel submit, I would expect
we want a more explicit API where you specify a set of engine
class/instance pairs to bond together into a single engine, similar to
how the current balancing API works.

Of course, that's all focused on the API and not the internals.  But,
again, I'm not sure how we want things to look internally.  What we've
got now doesn't seem great for the GuC submission model but I'm very
much not the expert there.  I don't want to be working at cross
purposes to you and I'm happy to leave bits if you think they're
useful.  But I thought I was clearing things away so that you can put
in what you actually want for GuC/parallel submit.

--Jason

> Matt
>
> > --Jason
> >
> > > At least my understanding is that we're only limiting the engine set
> > > further, so if both signaller and signalled request can only run on
> > > singletons (which must be distinct, or the bonded parameter validation is
> > > busted) there's really nothing to do here.
> > >
> > > Also this is the locking code that freaks me out about the current bonded
> > > execlist code ...
> > >
> > > Dazzled and confused.
> > > -Daniel
> > >
> > > >
> > > > --Jason
> > > >
> > > >
> > > > > -}
> > > > > -
> > > > >  struct intel_context *
> > > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > >                                unsigned int count)
> > > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > >
> > > > >         ve->base.schedule = i915_schedule;
> > > > >         ve->base.submit_request = virtual_submit_request;
> > > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > > >
> > > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > > >         if (IS_ERR(dst))
> > > > >                 return dst;
> > > > >
> > > > > -       if (se->num_bonds) {
> > > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > > -
> > > > > -               de->bonds = kmemdup(se->bonds,
> > > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > > -                                   GFP_KERNEL);
> > > > > -               if (!de->bonds) {
> > > > > -                       intel_context_put(dst);
> > > > > -                       return ERR_PTR(-ENOMEM);
> > > > > -               }
> > > > > -
> > > > > -               de->num_bonds = se->num_bonds;
> > > > > -       }
> > > > > -
> > > > >         return dst;
> > > > >  }
> > > > >
> > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > -                                    const struct intel_engine_cs *master,
> > > > > -                                    const struct intel_engine_cs *sibling)
> > > > > -{
> > > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > > -       struct ve_bond *bond;
> > > > > -       int n;
> > > > > -
> > > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > > -               if (sibling == ve->siblings[n])
> > > > > -                       break;
> > > > > -       if (n == ve->num_siblings)
> > > > > -               return -EINVAL;
> > > > > -
> > > > > -       bond = virtual_find_bond(ve, master);
> > > > > -       if (bond) {
> > > > > -               bond->sibling_mask |= sibling->mask;
> > > > > -               return 0;
> > > > > -       }
> > > > > -
> > > > > -       bond = krealloc(ve->bonds,
> > > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > > -                       GFP_KERNEL);
> > > > > -       if (!bond)
> > > > > -               return -ENOMEM;
> > > > > -
> > > > > -       bond[ve->num_bonds].master = master;
> > > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > > -
> > > > > -       ve->bonds = bond;
> > > > > -       ve->num_bonds++;
> > > > > -
> > > > > -       return 0;
> > > > > -}
> > > > > -
> > > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > > >                                    struct drm_printer *m,
> > > > >                                    void (*show_request)(struct drm_printer *m,
> > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > >  struct intel_context *
> > > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > > >
> > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > -                                    const struct intel_engine_cs *master,
> > > > > -                                    const struct intel_engine_cs *sibling);
> > > > > -
> > > > >  bool
> > > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > > >         return 0;
> > > > >  }
> > > > >
> > > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > > -                              unsigned int class,
> > > > > -                              struct intel_engine_cs **siblings,
> > > > > -                              unsigned int nsibling,
> > > > > -                              unsigned int flags)
> > > > > -#define BOND_SCHEDULE BIT(0)
> > > > > -{
> > > > > -       struct intel_engine_cs *master;
> > > > > -       struct i915_request *rq[16];
> > > > > -       enum intel_engine_id id;
> > > > > -       struct igt_spinner spin;
> > > > > -       unsigned long n;
> > > > > -       int err;
> > > > > -
> > > > > -       /*
> > > > > -        * A set of bonded requests is intended to be run concurrently
> > > > > -        * across a number of engines. We use one request per-engine
> > > > > -        * and a magic fence to schedule each of the bonded requests
> > > > > -        * at the same time. A consequence of our current scheduler is that
> > > > > -        * we only move requests to the HW ready queue when the request
> > > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > > -        * there is a delay on all secondary fences as the HW may be
> > > > > -        * currently busy. Equally, as all the requests are independent,
> > > > > -        * they may have other fences that delay individual request
> > > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > > -        * immediately submitted to HW at the same time, just that if the
> > > > > -        * rules are abided by, they are ready at the same time as the
> > > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > > -        * to ensure parallel execution of its phases as it requires.
> > > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > > -        * take care of parallel execution, even across preemption events on
> > > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > > -        *
> > > > > -        * With the submit-fence, we have identified three possible phases
> > > > > -        * of synchronisation depending on the master fence: queued (not
> > > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > > -        * and checked below. However, the signaled master fence handling is
> > > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > > -        * any information about the previous execution. It may even be freed
> > > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > > -        * as our expectation is that it should not constrain the secondaries
> > > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > > -        * userspace requests are meant to be running in parallel). As
> > > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > > -        * check below as normal execution flows are checked extensively above.
> > > > > -        *
> > > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > > -        * expected behaviour for userpace?
> > > > > -        */
> > > > > -
> > > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > > -
> > > > > -       if (igt_spinner_init(&spin, gt))
> > > > > -               return -ENOMEM;
> > > > > -
> > > > > -       err = 0;
> > > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > > -       for_each_engine(master, gt, id) {
> > > > > -               struct i915_sw_fence fence = {};
> > > > > -               struct intel_context *ce;
> > > > > -
> > > > > -               if (master->class == class)
> > > > > -                       continue;
> > > > > -
> > > > > -               ce = intel_context_create(master);
> > > > > -               if (IS_ERR(ce)) {
> > > > > -                       err = PTR_ERR(ce);
> > > > > -                       goto out;
> > > > > -               }
> > > > > -
> > > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > > -
> > > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > > -               intel_context_put(ce);
> > > > > -               if (IS_ERR(rq[0])) {
> > > > > -                       err = PTR_ERR(rq[0]);
> > > > > -                       goto out;
> > > > > -               }
> > > > > -               i915_request_get(rq[0]);
> > > > > -
> > > > > -               if (flags & BOND_SCHEDULE) {
> > > > > -                       onstack_fence_init(&fence);
> > > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > > -                                                              &fence,
> > > > > -                                                              GFP_KERNEL);
> > > > > -               }
> > > > > -
> > > > > -               i915_request_add(rq[0]);
> > > > > -               if (err < 0)
> > > > > -                       goto out;
> > > > > -
> > > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > > -                       err = -EIO;
> > > > > -                       goto out;
> > > > > -               }
> > > > > -
> > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > -                       struct intel_context *ve;
> > > > > -
> > > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > > -                       if (IS_ERR(ve)) {
> > > > > -                               err = PTR_ERR(ve);
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > > -                                                              master,
> > > > > -                                                              siblings[n]);
> > > > > -                       if (err) {
> > > > > -                               intel_context_put(ve);
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       err = intel_context_pin(ve);
> > > > > -                       intel_context_put(ve);
> > > > > -                       if (err) {
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > > -                       intel_context_unpin(ve);
> > > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -                       i915_request_get(rq[n + 1]);
> > > > > -
> > > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > > -                                                          &rq[0]->fence,
> > > > > -                                                          ve->engine->bond_execute);
> > > > > -                       i915_request_add(rq[n + 1]);
> > > > > -                       if (err < 0) {
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -               }
> > > > > -               onstack_fence_fini(&fence);
> > > > > -               intel_engine_flush_submission(master);
> > > > > -               igt_spinner_end(&spin);
> > > > > -
> > > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > > -                              rq[0]->engine->name);
> > > > > -                       err = -EIO;
> > > > > -                       goto out;
> > > > > -               }
> > > > > -
> > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > > -                               err = -EIO;
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > > -                                      siblings[n]->name,
> > > > > -                                      rq[n + 1]->engine->name,
> > > > > -                                      rq[0]->engine->name);
> > > > > -                               err = -EINVAL;
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -               }
> > > > > -
> > > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > -                       i915_request_put(rq[n]);
> > > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > > -       }
> > > > > -
> > > > > -out:
> > > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > -               i915_request_put(rq[n]);
> > > > > -       if (igt_flush_test(gt->i915))
> > > > > -               err = -EIO;
> > > > > -
> > > > > -       igt_spinner_fini(&spin);
> > > > > -       return err;
> > > > > -}
> > > > > -
> > > > > -static int live_virtual_bond(void *arg)
> > > > > -{
> > > > > -       static const struct phase {
> > > > > -               const char *name;
> > > > > -               unsigned int flags;
> > > > > -       } phases[] = {
> > > > > -               { "", 0 },
> > > > > -               { "schedule", BOND_SCHEDULE },
> > > > > -               { },
> > > > > -       };
> > > > > -       struct intel_gt *gt = arg;
> > > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > > -       unsigned int class;
> > > > > -       int err;
> > > > > -
> > > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > > -               return 0;
> > > > > -
> > > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > > -               const struct phase *p;
> > > > > -               int nsibling;
> > > > > -
> > > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > > -               if (nsibling < 2)
> > > > > -                       continue;
> > > > > -
> > > > > -               for (p = phases; p->name; p++) {
> > > > > -                       err = bond_virtual_engine(gt,
> > > > > -                                                 class, siblings, nsibling,
> > > > > -                                                 p->flags);
> > > > > -                       if (err) {
> > > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > > -                                      __func__, p->name, class, nsibling, err);
> > > > > -                               return err;
> > > > > -                       }
> > > > > -               }
> > > > > -       }
> > > > > -
> > > > > -       return 0;
> > > > > -}
> > > > > -
> > > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > > >                                 struct intel_engine_cs **siblings,
> > > > >                                 unsigned int nsibling)
> > > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > > >                 SUBTEST(live_virtual_mask),
> > > > >                 SUBTEST(live_virtual_preserved),
> > > > >                 SUBTEST(live_virtual_slice),
> > > > > -               SUBTEST(live_virtual_bond),
> > > > >                 SUBTEST(live_virtual_reset),
> > > > >         };
> > > > >
> > > > > --
> > > > > 2.31.1
> > > > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-28 17:46             ` Jason Ekstrand
  0 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 17:46 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > >
> > > > > This adds a bunch of complexity which the media driver has never
> > > > > actually used.  The media driver does technically bond a balanced engine
> > > > > to another engine but the balanced engine only has one engine in the
> > > > > sibling set.  This doesn't actually result in a virtual engine.
> > > > >
> > > > > Unless some userspace badly wants it, there's no good reason to support
> > > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > > leave the validation code in place in case we ever decide we want to do
> > > > > something interesting with the bonding information.
> > > > >
> > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > >         }
> > > > >         virtual = set->engines->engines[idx]->engine;
> > > > >
> > > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > > +               drm_dbg(&i915->drm,
> > > > > +                       "Bonding with virtual engines not allowed\n");
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > >         err = check_user_mbz(&ext->flags);
> > > > >         if (err)
> > > > >                 return err;
> > > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > >                                 n, ci.engine_class, ci.engine_instance);
> > > > >                         return -EINVAL;
> > > > >                 }
> > > > > -
> > > > > -               /*
> > > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > > -                * a submit fence will always be directed to the one engine.
> > > > > -                */
> > > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > > -                                                              master,
> > > > > -                                                              bond);
> > > > > -                       if (err)
> > > > > -                               return err;
> > > > > -               }
> > > > >         }
> > > > >
> > > > >         return 0;
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > > >                         err = i915_request_await_execution(eb.request,
> > > > >                                                            in_fence,
> > > > > -                                                          eb.engine->bond_execute);
> > > > > +                                                          NULL);
> > > > >                 else
> > > > >                         err = i915_request_await_dma_fence(eb.request,
> > > > >                                                            in_fence);
> > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > index 883bafc449024..68cfe5080325c 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > > >          */
> > > > >         void            (*submit_request)(struct i915_request *rq);
> > > > >
> > > > > -       /*
> > > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > > -        * request down to the bonded pairs.
> > > > > -        */
> > > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > > -                                       struct dma_fence *signal);
> > > > > -
> > > > >         /*
> > > > >          * Call when the priority on a request has changed and it and its
> > > > >          * dependencies may need rescheduling. Note the request itself may
> > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > index de124870af44d..b6e2b59f133b7 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > > >                 int prio;
> > > > >         } nodes[I915_NUM_ENGINES];
> > > > >
> > > > > -       /*
> > > > > -        * Keep track of bonded pairs -- restrictions upon our selection
> > > > > -        * of physical engines any particular request may be submitted to.
> > > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > > -        * use one of sibling_mask physical engines.
> > > > > -        */
> > > > > -       struct ve_bond {
> > > > > -               const struct intel_engine_cs *master;
> > > > > -               intel_engine_mask_t sibling_mask;
> > > > > -       } *bonds;
> > > > > -       unsigned int num_bonds;
> > > > > -
> > > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > > >         unsigned int num_siblings;
> > > > >         struct intel_engine_cs *siblings[];
> > > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > > >         intel_engine_free_request_pool(&ve->base);
> > > > >
> > > > > -       kfree(ve->bonds);
> > > > >         kfree(ve);
> > > > >  }
> > > > >
> > > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > > >  }
> > > > >
> > > > > -static struct ve_bond *
> > > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > > -                 const struct intel_engine_cs *master)
> > > > > -{
> > > > > -       int i;
> > > > > -
> > > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > > -               if (ve->bonds[i].master == master)
> > > > > -                       return &ve->bonds[i];
> > > > > -       }
> > > > > -
> > > > > -       return NULL;
> > > > > -}
> > > > > -
> > > > > -static void
> > > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > > -{
> > > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > > -       intel_engine_mask_t allowed, exec;
> > > > > -       struct ve_bond *bond;
> > > > > -
> > > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > > -
> > > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > > -       if (bond)
> > > > > -               allowed &= bond->sibling_mask;
> > > > > -
> > > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > > -               ;
> > > > > -
> > > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > > -       to_request(signal)->execution_mask &= ~allowed;
> > > >
> > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > much code.  This function in particular has to stay, unfortunately.
> > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > the work onto a different engine than the one it's supposed to
> > > > run in parallel with.  This means we can't dead-code this function or
> > > > the bond_execute function pointer and related stuff.
> > >
> > > Uh that's disappointing, since if I understand your point correctly, the
> > > sibling engines should all be singletons, not load balancing virtual ones.
> > > So there really should not be any need to pick the right one at execution
> > > time.
> >
> > The media driver itself seems to work fine if I delete all the code.
> > It's just an IGT testcase that blows up.  I'll do more digging to see
> > if I can better isolate why.
> >
>
> Jumping on here mid-thread. For what it is worth, to make execlists work
> with the upcoming parallel submission extension I leveraged some of the
> existing bonding code, so I wouldn't be too eager to delete this code
> until that lands.

Mind being a bit more specific about that?  The motivation for this
patch is that the current bonding handling and uAPI is, well, very odd
and confusing IMO.  It doesn't let you create sets of bonded engines.
Instead you create engines and then bond them together after the fact.
I didn't want to blindly duplicate those oddities with the proto-ctx
stuff unless they were useful.  With parallel submit, I would expect
we want a more explicit API where you specify a set of engine
class/instance pairs to bond together into a single engine similar to
how the current balancing API works.

Of course, that's all focused on the API and not the internals.  But,
again, I'm not sure how we want things to look internally.  What we've
got now doesn't seem great for the GuC submission model but I'm very
much not the expert there.  I don't want to be working at cross
purposes to you and I'm happy to leave bits if you think they're
useful.  But I thought I was clearing things away so that you can put
in what you actually want for GuC/parallel submit.

--Jason

> Matt
>
> > --Jason
> >
> > > At least my understanding is that we're only limiting the engine set
> > > further, so if both signaller and signalled request can only run on
> > > singletons (which must be distinct, or the bonded parameter validation is
> > > busted) there's really nothing to do here.
> > >
> > > Also this is the locking code that freaks me out about the current bonded
> > > execlist code ...
> > >
> > > Dazzled and confused.
> > > -Daniel
> > >
> > > >
> > > > --Jason
> > > >
> > > >
> > > > > -}
> > > > > -
> > > > >  struct intel_context *
> > > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > >                                unsigned int count)
> > > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > >
> > > > >         ve->base.schedule = i915_schedule;
> > > > >         ve->base.submit_request = virtual_submit_request;
> > > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > > >
> > > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > > >         if (IS_ERR(dst))
> > > > >                 return dst;
> > > > >
> > > > > -       if (se->num_bonds) {
> > > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > > -
> > > > > -               de->bonds = kmemdup(se->bonds,
> > > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > > -                                   GFP_KERNEL);
> > > > > -               if (!de->bonds) {
> > > > > -                       intel_context_put(dst);
> > > > > -                       return ERR_PTR(-ENOMEM);
> > > > > -               }
> > > > > -
> > > > > -               de->num_bonds = se->num_bonds;
> > > > > -       }
> > > > > -
> > > > >         return dst;
> > > > >  }
> > > > >
> > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > -                                    const struct intel_engine_cs *master,
> > > > > -                                    const struct intel_engine_cs *sibling)
> > > > > -{
> > > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > > -       struct ve_bond *bond;
> > > > > -       int n;
> > > > > -
> > > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > > -               if (sibling == ve->siblings[n])
> > > > > -                       break;
> > > > > -       if (n == ve->num_siblings)
> > > > > -               return -EINVAL;
> > > > > -
> > > > > -       bond = virtual_find_bond(ve, master);
> > > > > -       if (bond) {
> > > > > -               bond->sibling_mask |= sibling->mask;
> > > > > -               return 0;
> > > > > -       }
> > > > > -
> > > > > -       bond = krealloc(ve->bonds,
> > > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > > -                       GFP_KERNEL);
> > > > > -       if (!bond)
> > > > > -               return -ENOMEM;
> > > > > -
> > > > > -       bond[ve->num_bonds].master = master;
> > > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > > -
> > > > > -       ve->bonds = bond;
> > > > > -       ve->num_bonds++;
> > > > > -
> > > > > -       return 0;
> > > > > -}
> > > > > -
> > > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > > >                                    struct drm_printer *m,
> > > > >                                    void (*show_request)(struct drm_printer *m,
> > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > >  struct intel_context *
> > > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > > >
> > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > -                                    const struct intel_engine_cs *master,
> > > > > -                                    const struct intel_engine_cs *sibling);
> > > > > -
> > > > >  bool
> > > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > > >         return 0;
> > > > >  }
> > > > >
> > > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > > -                              unsigned int class,
> > > > > -                              struct intel_engine_cs **siblings,
> > > > > -                              unsigned int nsibling,
> > > > > -                              unsigned int flags)
> > > > > -#define BOND_SCHEDULE BIT(0)
> > > > > -{
> > > > > -       struct intel_engine_cs *master;
> > > > > -       struct i915_request *rq[16];
> > > > > -       enum intel_engine_id id;
> > > > > -       struct igt_spinner spin;
> > > > > -       unsigned long n;
> > > > > -       int err;
> > > > > -
> > > > > -       /*
> > > > > -        * A set of bonded requests is intended to be run concurrently
> > > > > -        * across a number of engines. We use one request per-engine
> > > > > -        * and a magic fence to schedule each of the bonded requests
> > > > > -        * at the same time. A consequence of our current scheduler is that
> > > > > -        * we only move requests to the HW ready queue when the request
> > > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > > -        * there is a delay on all secondary fences as the HW may be
> > > > > -        * currently busy. Equally, as all the requests are independent,
> > > > > -        * they may have other fences that delay individual request
> > > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > > -        * immediately submitted to HW at the same time, just that if the
> > > > > -        * rules are abided by, they are ready at the same time as the
> > > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > > -        * to ensure parallel execution of its phases as it requires.
> > > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > > -        * take care of parallel execution, even across preemption events on
> > > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > > -        *
> > > > > -        * With the submit-fence, we have identified three possible phases
> > > > > -        * of synchronisation depending on the master fence: queued (not
> > > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > > -        * and checked below. However, the signaled master fence handling is
> > > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > > -        * any information about the previous execution. It may even be freed
> > > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > > -        * as our expectation is that it should not constrain the secondaries
> > > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > > -        * userspace requests are meant to be running in parallel). As
> > > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > > -        * check below as normal execution flows are checked extensively above.
> > > > > -        *
> > > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > > -        * expected behaviour for userspace?
> > > > > -        */
> > > > > -
> > > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > > -
> > > > > -       if (igt_spinner_init(&spin, gt))
> > > > > -               return -ENOMEM;
> > > > > -
> > > > > -       err = 0;
> > > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > > -       for_each_engine(master, gt, id) {
> > > > > -               struct i915_sw_fence fence = {};
> > > > > -               struct intel_context *ce;
> > > > > -
> > > > > -               if (master->class == class)
> > > > > -                       continue;
> > > > > -
> > > > > -               ce = intel_context_create(master);
> > > > > -               if (IS_ERR(ce)) {
> > > > > -                       err = PTR_ERR(ce);
> > > > > -                       goto out;
> > > > > -               }
> > > > > -
> > > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > > -
> > > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > > -               intel_context_put(ce);
> > > > > -               if (IS_ERR(rq[0])) {
> > > > > -                       err = PTR_ERR(rq[0]);
> > > > > -                       goto out;
> > > > > -               }
> > > > > -               i915_request_get(rq[0]);
> > > > > -
> > > > > -               if (flags & BOND_SCHEDULE) {
> > > > > -                       onstack_fence_init(&fence);
> > > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > > -                                                              &fence,
> > > > > -                                                              GFP_KERNEL);
> > > > > -               }
> > > > > -
> > > > > -               i915_request_add(rq[0]);
> > > > > -               if (err < 0)
> > > > > -                       goto out;
> > > > > -
> > > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > > -                       err = -EIO;
> > > > > -                       goto out;
> > > > > -               }
> > > > > -
> > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > -                       struct intel_context *ve;
> > > > > -
> > > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > > -                       if (IS_ERR(ve)) {
> > > > > -                               err = PTR_ERR(ve);
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > > -                                                              master,
> > > > > -                                                              siblings[n]);
> > > > > -                       if (err) {
> > > > > -                               intel_context_put(ve);
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       err = intel_context_pin(ve);
> > > > > -                       intel_context_put(ve);
> > > > > -                       if (err) {
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > > -                       intel_context_unpin(ve);
> > > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -                       i915_request_get(rq[n + 1]);
> > > > > -
> > > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > > -                                                          &rq[0]->fence,
> > > > > -                                                          ve->engine->bond_execute);
> > > > > -                       i915_request_add(rq[n + 1]);
> > > > > -                       if (err < 0) {
> > > > > -                               onstack_fence_fini(&fence);
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -               }
> > > > > -               onstack_fence_fini(&fence);
> > > > > -               intel_engine_flush_submission(master);
> > > > > -               igt_spinner_end(&spin);
> > > > > -
> > > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > > -                              rq[0]->engine->name);
> > > > > -                       err = -EIO;
> > > > > -                       goto out;
> > > > > -               }
> > > > > -
> > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > > -                               err = -EIO;
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -
> > > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > > -                                      siblings[n]->name,
> > > > > -                                      rq[n + 1]->engine->name,
> > > > > -                                      rq[0]->engine->name);
> > > > > -                               err = -EINVAL;
> > > > > -                               goto out;
> > > > > -                       }
> > > > > -               }
> > > > > -
> > > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > -                       i915_request_put(rq[n]);
> > > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > > -       }
> > > > > -
> > > > > -out:
> > > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > -               i915_request_put(rq[n]);
> > > > > -       if (igt_flush_test(gt->i915))
> > > > > -               err = -EIO;
> > > > > -
> > > > > -       igt_spinner_fini(&spin);
> > > > > -       return err;
> > > > > -}
> > > > > -
> > > > > -static int live_virtual_bond(void *arg)
> > > > > -{
> > > > > -       static const struct phase {
> > > > > -               const char *name;
> > > > > -               unsigned int flags;
> > > > > -       } phases[] = {
> > > > > -               { "", 0 },
> > > > > -               { "schedule", BOND_SCHEDULE },
> > > > > -               { },
> > > > > -       };
> > > > > -       struct intel_gt *gt = arg;
> > > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > > -       unsigned int class;
> > > > > -       int err;
> > > > > -
> > > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > > -               return 0;
> > > > > -
> > > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > > -               const struct phase *p;
> > > > > -               int nsibling;
> > > > > -
> > > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > > -               if (nsibling < 2)
> > > > > -                       continue;
> > > > > -
> > > > > -               for (p = phases; p->name; p++) {
> > > > > -                       err = bond_virtual_engine(gt,
> > > > > -                                                 class, siblings, nsibling,
> > > > > -                                                 p->flags);
> > > > > -                       if (err) {
> > > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > > -                                      __func__, p->name, class, nsibling, err);
> > > > > -                               return err;
> > > > > -                       }
> > > > > -               }
> > > > > -       }
> > > > > -
> > > > > -       return 0;
> > > > > -}
> > > > > -
> > > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > > >                                 struct intel_engine_cs **siblings,
> > > > >                                 unsigned int nsibling)
> > > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > > >                 SUBTEST(live_virtual_mask),
> > > > >                 SUBTEST(live_virtual_preserved),
> > > > >                 SUBTEST(live_virtual_slice),
> > > > > -               SUBTEST(live_virtual_bond),
> > > > >                 SUBTEST(live_virtual_reset),
> > > > >         };
> > > > >
> > > > > --
> > > > > 2.31.1
> > > > >
> > > > _______________________________________________
> > > > dri-devel mailing list
> > > > dri-devel@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 17:46             ` Jason Ekstrand
@ 2021-04-28 17:55               ` Matthew Brost
  -1 siblings, 0 replies; 226+ messages in thread
From: Matthew Brost @ 2021-04-28 17:55 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> > > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > > >
> > > > > > This adds a bunch of complexity which the media driver has never
> > > > > > actually used.  The media driver does technically bond a balanced engine
> > > > > > to another engine but the balanced engine only has one engine in the
> > > > > > sibling set.  This doesn't actually result in a virtual engine.
> > > > > >
> > > > > > Unless some userspace badly wants it, there's no good reason to support
> > > > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > > > leave the validation code in place in case we ever decide we want to do
> > > > > > something interesting with the bonding information.
> > > > > >
> > > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > >         }
> > > > > >         virtual = set->engines->engines[idx]->engine;
> > > > > >
> > > > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > > > +               drm_dbg(&i915->drm,
> > > > > > +                       "Bonding with virtual engines not allowed\n");
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +
> > > > > >         err = check_user_mbz(&ext->flags);
> > > > > >         if (err)
> > > > > >                 return err;
> > > > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > >                                 n, ci.engine_class, ci.engine_instance);
> > > > > >                         return -EINVAL;
> > > > > >                 }
> > > > > > -
> > > > > > -               /*
> > > > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > > > -                * a submit fence will always be directed to the one engine.
> > > > > > -                */
> > > > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > > > -                                                              master,
> > > > > > -                                                              bond);
> > > > > > -                       if (err)
> > > > > > -                               return err;
> > > > > > -               }
> > > > > >         }
> > > > > >
> > > > > >         return 0;
> > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > > > >                         err = i915_request_await_execution(eb.request,
> > > > > >                                                            in_fence,
> > > > > > -                                                          eb.engine->bond_execute);
> > > > > > +                                                          NULL);
> > > > > >                 else
> > > > > >                         err = i915_request_await_dma_fence(eb.request,
> > > > > >                                                            in_fence);
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > index 883bafc449024..68cfe5080325c 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > > > >          */
> > > > > >         void            (*submit_request)(struct i915_request *rq);
> > > > > >
> > > > > > -       /*
> > > > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > > > -        * request down to the bonded pairs.
> > > > > > -        */
> > > > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > > > -                                       struct dma_fence *signal);
> > > > > > -
> > > > > >         /*
> > > > > >          * Call when the priority on a request has changed and it and its
> > > > > >          * dependencies may need rescheduling. Note the request itself may
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > index de124870af44d..b6e2b59f133b7 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > > > >                 int prio;
> > > > > >         } nodes[I915_NUM_ENGINES];
> > > > > >
> > > > > > -       /*
> > > > > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > > > > -        * of physical engines any particular request may be submitted to.
> > > > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > > > -        * use one of sibling_mask physical engines.
> > > > > > -        */
> > > > > > -       struct ve_bond {
> > > > > > -               const struct intel_engine_cs *master;
> > > > > > -               intel_engine_mask_t sibling_mask;
> > > > > > -       } *bonds;
> > > > > > -       unsigned int num_bonds;
> > > > > > -
> > > > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > > > >         unsigned int num_siblings;
> > > > > >         struct intel_engine_cs *siblings[];
> > > > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > > > >         intel_engine_free_request_pool(&ve->base);
> > > > > >
> > > > > > -       kfree(ve->bonds);
> > > > > >         kfree(ve);
> > > > > >  }
> > > > > >
> > > > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > > > >  }
> > > > > >
> > > > > > -static struct ve_bond *
> > > > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > > > -                 const struct intel_engine_cs *master)
> > > > > > -{
> > > > > > -       int i;
> > > > > > -
> > > > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > > > -               if (ve->bonds[i].master == master)
> > > > > > -                       return &ve->bonds[i];
> > > > > > -       }
> > > > > > -
> > > > > > -       return NULL;
> > > > > > -}
> > > > > > -
> > > > > > -static void
> > > > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > > > -{
> > > > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > > > -       intel_engine_mask_t allowed, exec;
> > > > > > -       struct ve_bond *bond;
> > > > > > -
> > > > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > > > -
> > > > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > > > -       if (bond)
> > > > > > -               allowed &= bond->sibling_mask;
> > > > > > -
> > > > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > > > -               ;
> > > > > > -
> > > > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > > > -       to_request(signal)->execution_mask &= ~allowed;
> > > > >
> > > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > > much code.  This function in particular, has to stay, unfortunately.
> > > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > > the work onto a different engine than the one it's supposed to
> > > > > run in parallel with.  This means we can't dead-code this function or
> > > > > the bond_execution function pointer and related stuff.
> > > >
> > > > Uh that's disappointing, since if I understand your point correctly, the
> > > > sibling engines should all be singletons, not load balancing virtual ones.
> > > > So there really should not be any need to pick the right one at execution
> > > > time.
> > >
> > > The media driver itself seems to work fine if I delete all the code.
> > > It's just an IGT testcase that blows up.  I'll do more digging to see
> > > if I can better isolate why.
> > >
> >
> > Jumping on here mid-thread. For what it's worth, to make execlists work
> > with the upcoming parallel submission extension I leveraged some of the
> > existing bonding code, so I wouldn't be too eager to delete this code
> > until that lands.
> 
> Mind being a bit more specific about that?  The motivation for this
> patch is that the current bonding handling and uAPI is, well, very odd
> and confusing IMO.  It doesn't let you create sets of bonded engines.
> Instead you create engines and then bond them together after the fact.
> I didn't want to blindly duplicate those oddities with the proto-ctx
> stuff unless they were useful.  With parallel submit, I would expect
> we want a more explicit API where you specify a set of engine
> class/instance pairs to bond together into a single engine similar to
> how the current balancing API works.
> 
> Of course, that's all focused on the API and not the internals.  But,
> again, I'm not sure how we want things to look internally.  What we've
> got now doesn't seem great for the GuC submission model but I'm very
> much not the expert there.  I don't want to be working at cross
> purposes to you and I'm happy to leave bits if you think they're
> useful.  But I thought I was clearing things away so that you can put
> in what you actually want for GuC/parallel submit.
> 

Removing all the uAPI things is fine, but I wouldn't delete some of the
internal stuff (e.g. intel_virtual_engine_attach_bond, the bond
intel_context_ops, the hook for a submit fence, etc.) as that will
still likely be used for the new parallel submission interface with
execlists. As you say, the new uAPI won't allow crazy configurations,
only simple ones.

Matt

> --Jason
> 
> > Matt
> >
> > > --Jason
> > >
> > > > At least my understanding is that we're only limiting the engine set
> > > > further, so if both signaller and signalled request can only run on
> > > > singletons (which must be distinct, or the bonded parameter validation is
> > > > busted) there's really nothing to do here.
> > > >
> > > > Also this is the locking code that freaks me out about the current bonded
> > > > execlist code ...
> > > >
> > > > Dazzled and confused.
> > > > -Daniel
> > > >
> > > > >
> > > > > --Jason
> > > > >
> > > > >
> > > > > > -}
> > > > > > -
> > > > > >  struct intel_context *
> > > > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > >                                unsigned int count)
> > > > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > >
> > > > > >         ve->base.schedule = i915_schedule;
> > > > > >         ve->base.submit_request = virtual_submit_request;
> > > > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > > > >
> > > > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > > > >         if (IS_ERR(dst))
> > > > > >                 return dst;
> > > > > >
> > > > > > -       if (se->num_bonds) {
> > > > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > > > -
> > > > > > -               de->bonds = kmemdup(se->bonds,
> > > > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > > > -                                   GFP_KERNEL);
> > > > > > -               if (!de->bonds) {
> > > > > > -                       intel_context_put(dst);
> > > > > > -                       return ERR_PTR(-ENOMEM);
> > > > > > -               }
> > > > > > -
> > > > > > -               de->num_bonds = se->num_bonds;
> > > > > > -       }
> > > > > > -
> > > > > >         return dst;
> > > > > >  }
> > > > > >
> > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > -                                    const struct intel_engine_cs *sibling)
> > > > > > -{
> > > > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > > > -       struct ve_bond *bond;
> > > > > > -       int n;
> > > > > > -
> > > > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > > > -               if (sibling == ve->siblings[n])
> > > > > > -                       break;
> > > > > > -       if (n == ve->num_siblings)
> > > > > > -               return -EINVAL;
> > > > > > -
> > > > > > -       bond = virtual_find_bond(ve, master);
> > > > > > -       if (bond) {
> > > > > > -               bond->sibling_mask |= sibling->mask;
> > > > > > -               return 0;
> > > > > > -       }
> > > > > > -
> > > > > > -       bond = krealloc(ve->bonds,
> > > > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > > > -                       GFP_KERNEL);
> > > > > > -       if (!bond)
> > > > > > -               return -ENOMEM;
> > > > > > -
> > > > > > -       bond[ve->num_bonds].master = master;
> > > > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > > > -
> > > > > > -       ve->bonds = bond;
> > > > > > -       ve->num_bonds++;
> > > > > > -
> > > > > > -       return 0;
> > > > > > -}
> > > > > > -
> > > > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > > > >                                    struct drm_printer *m,
> > > > > >                                    void (*show_request)(struct drm_printer *m,
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > >  struct intel_context *
> > > > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > > > >
> > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > -                                    const struct intel_engine_cs *sibling);
> > > > > > -
> > > > > >  bool
> > > > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > > > >         return 0;
> > > > > >  }
> > > > > >
> > > > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > > > -                              unsigned int class,
> > > > > > -                              struct intel_engine_cs **siblings,
> > > > > > -                              unsigned int nsibling,
> > > > > > -                              unsigned int flags)
> > > > > > -#define BOND_SCHEDULE BIT(0)
> > > > > > -{
> > > > > > -       struct intel_engine_cs *master;
> > > > > > -       struct i915_request *rq[16];
> > > > > > -       enum intel_engine_id id;
> > > > > > -       struct igt_spinner spin;
> > > > > > -       unsigned long n;
> > > > > > -       int err;
> > > > > > -
> > > > > > -       /*
> > > > > > -        * A set of bonded requests is intended to be run concurrently
> > > > > > -        * across a number of engines. We use one request per-engine
> > > > > > -        * and a magic fence to schedule each of the bonded requests
> > > > > > -        * at the same time. A consequence of our current scheduler is that
> > > > > > -        * we only move requests to the HW ready queue when the request
> > > > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > > > -        * there is a delay on all secondary fences as the HW may be
> > > > > > -        * currently busy. Equally, as all the requests are independent,
> > > > > > -        * they may have other fences that delay individual request
> > > > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > > > -        * immediately submitted to HW at the same time, just that if the
> > > > > > -        * rules are abided by, they are ready at the same time as the
> > > > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > > > -        * to ensure parallel execution of its phases as it requires.
> > > > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > > > -        * take care of parallel execution, even across preemption events on
> > > > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > > > -        *
> > > > > > -        * With the submit-fence, we have identified three possible phases
> > > > > > -        * of synchronisation depending on the master fence: queued (not
> > > > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > > > -        * and checked below. However, the signaled master fence handling is
> > > > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > > > -        * any information about the previous execution. It may even be freed
> > > > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > > > -        * as our expectation is that it should not constrain the secondaries
> > > > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > > > -        * userspace requests are meant to be running in parallel). As
> > > > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > > > -        * check below as normal execution flows are checked extensively above.
> > > > > > -        *
> > > > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > > > -        * expected behaviour for userpace?
> > > > > > -        */
> > > > > > -
> > > > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > > > -
> > > > > > -       if (igt_spinner_init(&spin, gt))
> > > > > > -               return -ENOMEM;
> > > > > > -
> > > > > > -       err = 0;
> > > > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > > > -       for_each_engine(master, gt, id) {
> > > > > > -               struct i915_sw_fence fence = {};
> > > > > > -               struct intel_context *ce;
> > > > > > -
> > > > > > -               if (master->class == class)
> > > > > > -                       continue;
> > > > > > -
> > > > > > -               ce = intel_context_create(master);
> > > > > > -               if (IS_ERR(ce)) {
> > > > > > -                       err = PTR_ERR(ce);
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -
> > > > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > > > -
> > > > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > > > -               intel_context_put(ce);
> > > > > > -               if (IS_ERR(rq[0])) {
> > > > > > -                       err = PTR_ERR(rq[0]);
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -               i915_request_get(rq[0]);
> > > > > > -
> > > > > > -               if (flags & BOND_SCHEDULE) {
> > > > > > -                       onstack_fence_init(&fence);
> > > > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > > > -                                                              &fence,
> > > > > > -                                                              GFP_KERNEL);
> > > > > > -               }
> > > > > > -
> > > > > > -               i915_request_add(rq[0]);
> > > > > > -               if (err < 0)
> > > > > > -                       goto out;
> > > > > > -
> > > > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > > > -                       err = -EIO;
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -
> > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > -                       struct intel_context *ve;
> > > > > > -
> > > > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > > > -                       if (IS_ERR(ve)) {
> > > > > > -                               err = PTR_ERR(ve);
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > > > -                                                              master,
> > > > > > -                                                              siblings[n]);
> > > > > > -                       if (err) {
> > > > > > -                               intel_context_put(ve);
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       err = intel_context_pin(ve);
> > > > > > -                       intel_context_put(ve);
> > > > > > -                       if (err) {
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > > > -                       intel_context_unpin(ve);
> > > > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -                       i915_request_get(rq[n + 1]);
> > > > > > -
> > > > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > > > -                                                          &rq[0]->fence,
> > > > > > -                                                          ve->engine->bond_execute);
> > > > > > -                       i915_request_add(rq[n + 1]);
> > > > > > -                       if (err < 0) {
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -               }
> > > > > > -               onstack_fence_fini(&fence);
> > > > > > -               intel_engine_flush_submission(master);
> > > > > > -               igt_spinner_end(&spin);
> > > > > > -
> > > > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > > > -                              rq[0]->engine->name);
> > > > > > -                       err = -EIO;
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -
> > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > > > -                               err = -EIO;
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > > > -                                      siblings[n]->name,
> > > > > > -                                      rq[n + 1]->engine->name,
> > > > > > -                                      rq[0]->engine->name);
> > > > > > -                               err = -EINVAL;
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -               }
> > > > > > -
> > > > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > -                       i915_request_put(rq[n]);
> > > > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > > > -       }
> > > > > > -
> > > > > > -out:
> > > > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > -               i915_request_put(rq[n]);
> > > > > > -       if (igt_flush_test(gt->i915))
> > > > > > -               err = -EIO;
> > > > > > -
> > > > > > -       igt_spinner_fini(&spin);
> > > > > > -       return err;
> > > > > > -}
> > > > > > -
> > > > > > -static int live_virtual_bond(void *arg)
> > > > > > -{
> > > > > > -       static const struct phase {
> > > > > > -               const char *name;
> > > > > > -               unsigned int flags;
> > > > > > -       } phases[] = {
> > > > > > -               { "", 0 },
> > > > > > -               { "schedule", BOND_SCHEDULE },
> > > > > > -               { },
> > > > > > -       };
> > > > > > -       struct intel_gt *gt = arg;
> > > > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > > > -       unsigned int class;
> > > > > > -       int err;
> > > > > > -
> > > > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > > > -               return 0;
> > > > > > -
> > > > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > > > -               const struct phase *p;
> > > > > > -               int nsibling;
> > > > > > -
> > > > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > > > -               if (nsibling < 2)
> > > > > > -                       continue;
> > > > > > -
> > > > > > -               for (p = phases; p->name; p++) {
> > > > > > -                       err = bond_virtual_engine(gt,
> > > > > > -                                                 class, siblings, nsibling,
> > > > > > -                                                 p->flags);
> > > > > > -                       if (err) {
> > > > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > > > -                                      __func__, p->name, class, nsibling, err);
> > > > > > -                               return err;
> > > > > > -                       }
> > > > > > -               }
> > > > > > -       }
> > > > > > -
> > > > > > -       return 0;
> > > > > > -}
> > > > > > -
> > > > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > > > >                                 struct intel_engine_cs **siblings,
> > > > > >                                 unsigned int nsibling)
> > > > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > > > >                 SUBTEST(live_virtual_mask),
> > > > > >                 SUBTEST(live_virtual_preserved),
> > > > > >                 SUBTEST(live_virtual_slice),
> > > > > > -               SUBTEST(live_virtual_bond),
> > > > > >                 SUBTEST(live_virtual_reset),
> > > > > >         };
> > > > > >
> > > > > > --
> > > > > > 2.31.1
> > > > > >
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-28 17:55               ` Matthew Brost
  0 siblings, 0 replies; 226+ messages in thread
From: Matthew Brost @ 2021-04-28 17:55 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> > > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > > >
> > > > > > This adds a bunch of complexity which the media driver has never
> > > > > > actually used.  The media driver does technically bond a balanced engine
> > > > > > to another engine but the balanced engine only has one engine in the
> > > > > > sibling set.  This doesn't actually result in a virtual engine.
> > > > > >
> > > > > > Unless some userspace badly wants it, there's no good reason to support
> > > > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > > > leave the validation code in place in case we ever decide we want to do
> > > > > > something interesting with the bonding information.
> > > > > >
> > > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > >         }
> > > > > >         virtual = set->engines->engines[idx]->engine;
> > > > > >
> > > > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > > > +               drm_dbg(&i915->drm,
> > > > > > +                       "Bonding with virtual engines not allowed\n");
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +
> > > > > >         err = check_user_mbz(&ext->flags);
> > > > > >         if (err)
> > > > > >                 return err;
> > > > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > >                                 n, ci.engine_class, ci.engine_instance);
> > > > > >                         return -EINVAL;
> > > > > >                 }
> > > > > > -
> > > > > > -               /*
> > > > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > > > -                * a submit fence will always be directed to the one engine.
> > > > > > -                */
> > > > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > > > -                                                              master,
> > > > > > -                                                              bond);
> > > > > > -                       if (err)
> > > > > > -                               return err;
> > > > > > -               }
> > > > > >         }
> > > > > >
> > > > > >         return 0;
> > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > > > >                         err = i915_request_await_execution(eb.request,
> > > > > >                                                            in_fence,
> > > > > > -                                                          eb.engine->bond_execute);
> > > > > > +                                                          NULL);
> > > > > >                 else
> > > > > >                         err = i915_request_await_dma_fence(eb.request,
> > > > > >                                                            in_fence);
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > index 883bafc449024..68cfe5080325c 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > > > >          */
> > > > > >         void            (*submit_request)(struct i915_request *rq);
> > > > > >
> > > > > > -       /*
> > > > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > > > -        * request down to the bonded pairs.
> > > > > > -        */
> > > > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > > > -                                       struct dma_fence *signal);
> > > > > > -
> > > > > >         /*
> > > > > >          * Call when the priority on a request has changed and it and its
> > > > > >          * dependencies may need rescheduling. Note the request itself may
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > index de124870af44d..b6e2b59f133b7 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > > > >                 int prio;
> > > > > >         } nodes[I915_NUM_ENGINES];
> > > > > >
> > > > > > -       /*
> > > > > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > > > > -        * of physical engines any particular request may be submitted to.
> > > > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > > > -        * use one of sibling_mask physical engines.
> > > > > > -        */
> > > > > > -       struct ve_bond {
> > > > > > -               const struct intel_engine_cs *master;
> > > > > > -               intel_engine_mask_t sibling_mask;
> > > > > > -       } *bonds;
> > > > > > -       unsigned int num_bonds;
> > > > > > -
> > > > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > > > >         unsigned int num_siblings;
> > > > > >         struct intel_engine_cs *siblings[];
> > > > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > > > >         intel_engine_free_request_pool(&ve->base);
> > > > > >
> > > > > > -       kfree(ve->bonds);
> > > > > >         kfree(ve);
> > > > > >  }
> > > > > >
> > > > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > > > >  }
> > > > > >
> > > > > > -static struct ve_bond *
> > > > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > > > -                 const struct intel_engine_cs *master)
> > > > > > -{
> > > > > > -       int i;
> > > > > > -
> > > > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > > > -               if (ve->bonds[i].master == master)
> > > > > > -                       return &ve->bonds[i];
> > > > > > -       }
> > > > > > -
> > > > > > -       return NULL;
> > > > > > -}
> > > > > > -
> > > > > > -static void
> > > > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > > > -{
> > > > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > > > -       intel_engine_mask_t allowed, exec;
> > > > > > -       struct ve_bond *bond;
> > > > > > -
> > > > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > > > -
> > > > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > > > -       if (bond)
> > > > > > -               allowed &= bond->sibling_mask;
> > > > > > -
> > > > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > > > -               ;
> > > > > > -
> > > > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > > > -       to_request(signal)->execution_mask &= ~allowed;
> > > > >
> > > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > > much code.  This function in particular has to stay, unfortunately.
> > > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > > the work onto a different engine than the one it's supposed to
> > > > > run in parallel with.  This means we can't dead-code this function or
> > > > > the bond_execute function pointer and related stuff.
> > > >
> > > > Uh that's disappointing, since if I understand your point correctly, the
> > > > sibling engines should all be singletons, not load balancing virtual ones.
> > > > So there really should not be any need to pick the right one at execution
> > > > time.
> > >
> > > The media driver itself seems to work fine if I delete all the code.
> > > It's just an IGT testcase that blows up.  I'll do more digging to see
> > > if I can better isolate why.
> > >
> >
> > Jumping on here mid-thread. For what it's worth, to make execlists work
> > with the upcoming parallel submission extension I leveraged some of the
> > existing bonding code, so I wouldn't be too eager to delete this code
> > until that lands.
> 
> Mind being a bit more specific about that?  The motivation for this
> patch is that the current bonding handling and uAPI is, well, very odd
> and confusing IMO.  It doesn't let you create sets of bonded engines.
> Instead you create engines and then bond them together after the fact.
> I didn't want to blindly duplicate those oddities with the proto-ctx
> stuff unless they were useful.  With parallel submit, I would expect
> we want a more explicit API where you specify a set of engine
> class/instance pairs to bond together into a single engine similar to
> how the current balancing API works.
> 
> Of course, that's all focused on the API and not the internals.  But,
> again, I'm not sure how we want things to look internally.  What we've
> got now doesn't seem great for the GuC submission model but I'm very
> much not the expert there.  I don't want to be working at cross
> purposes to you and I'm happy to leave bits if you think they're
> useful.  But I thought I was clearing things away so that you can put
> in what you actually want for GuC/parallel submit.
> 

Removing all the UAPI things is fine, but I wouldn't delete some of the
internal stuff (e.g. intel_virtual_engine_attach_bond, the bond
intel_context_ops, the hook for a submit fence, etc.) as that will
still likely be used for the new parallel submission interface with
execlists. As you say, the new UAPI won't allow crazy configurations,
only simple ones.

Matt

> --Jason
> 
> > Matt
> >
> > > --Jason
> > >
> > > > At least my understanding is that we're only limiting the engine set
> > > > further, so if both signaller and signalled request can only run on
> > > > singletons (which must be distinct, or the bonded parameter validation is
> > > > busted) there's really nothing to do here.
> > > >
> > > > Also this is the locking code that freaks me out about the current bonded
> > > > execlist code ...
> > > >
> > > > Dazzled and confused.
> > > > -Daniel
> > > >
> > > > >
> > > > > --Jason
> > > > >
> > > > >
> > > > > > -}
> > > > > > -
> > > > > >  struct intel_context *
> > > > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > >                                unsigned int count)
> > > > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > >
> > > > > >         ve->base.schedule = i915_schedule;
> > > > > >         ve->base.submit_request = virtual_submit_request;
> > > > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > > > >
> > > > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > > > >         if (IS_ERR(dst))
> > > > > >                 return dst;
> > > > > >
> > > > > > -       if (se->num_bonds) {
> > > > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > > > -
> > > > > > -               de->bonds = kmemdup(se->bonds,
> > > > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > > > -                                   GFP_KERNEL);
> > > > > > -               if (!de->bonds) {
> > > > > > -                       intel_context_put(dst);
> > > > > > -                       return ERR_PTR(-ENOMEM);
> > > > > > -               }
> > > > > > -
> > > > > > -               de->num_bonds = se->num_bonds;
> > > > > > -       }
> > > > > > -
> > > > > >         return dst;
> > > > > >  }
> > > > > >
> > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > -                                    const struct intel_engine_cs *sibling)
> > > > > > -{
> > > > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > > > -       struct ve_bond *bond;
> > > > > > -       int n;
> > > > > > -
> > > > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > > > -               if (sibling == ve->siblings[n])
> > > > > > -                       break;
> > > > > > -       if (n == ve->num_siblings)
> > > > > > -               return -EINVAL;
> > > > > > -
> > > > > > -       bond = virtual_find_bond(ve, master);
> > > > > > -       if (bond) {
> > > > > > -               bond->sibling_mask |= sibling->mask;
> > > > > > -               return 0;
> > > > > > -       }
> > > > > > -
> > > > > > -       bond = krealloc(ve->bonds,
> > > > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > > > -                       GFP_KERNEL);
> > > > > > -       if (!bond)
> > > > > > -               return -ENOMEM;
> > > > > > -
> > > > > > -       bond[ve->num_bonds].master = master;
> > > > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > > > -
> > > > > > -       ve->bonds = bond;
> > > > > > -       ve->num_bonds++;
> > > > > > -
> > > > > > -       return 0;
> > > > > > -}
> > > > > > -
> > > > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > > > >                                    struct drm_printer *m,
> > > > > >                                    void (*show_request)(struct drm_printer *m,
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > >  struct intel_context *
> > > > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > > > >
> > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > -                                    const struct intel_engine_cs *sibling);
> > > > > > -
> > > > > >  bool
> > > > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > > > >         return 0;
> > > > > >  }
> > > > > >
> > > > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > > > -                              unsigned int class,
> > > > > > -                              struct intel_engine_cs **siblings,
> > > > > > -                              unsigned int nsibling,
> > > > > > -                              unsigned int flags)
> > > > > > -#define BOND_SCHEDULE BIT(0)
> > > > > > -{
> > > > > > -       struct intel_engine_cs *master;
> > > > > > -       struct i915_request *rq[16];
> > > > > > -       enum intel_engine_id id;
> > > > > > -       struct igt_spinner spin;
> > > > > > -       unsigned long n;
> > > > > > -       int err;
> > > > > > -
> > > > > > -       /*
> > > > > > -        * A set of bonded requests is intended to be run concurrently
> > > > > > -        * across a number of engines. We use one request per-engine
> > > > > > -        * and a magic fence to schedule each of the bonded requests
> > > > > > -        * at the same time. A consequence of our current scheduler is that
> > > > > > -        * we only move requests to the HW ready queue when the request
> > > > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > > > -        * there is a delay on all secondary fences as the HW may be
> > > > > > -        * currently busy. Equally, as all the requests are independent,
> > > > > > -        * they may have other fences that delay individual request
> > > > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > > > -        * immediately submitted to HW at the same time, just that if the
> > > > > > -        * rules are abided by, they are ready at the same time as the
> > > > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > > > -        * to ensure parallel execution of its phases as it requires.
> > > > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > > > -        * take care of parallel execution, even across preemption events on
> > > > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > > > -        *
> > > > > > -        * With the submit-fence, we have identified three possible phases
> > > > > > -        * of synchronisation depending on the master fence: queued (not
> > > > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > > > -        * and checked below. However, the signaled master fence handling is
> > > > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > > > -        * any information about the previous execution. It may even be freed
> > > > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > > > -        * as our expectation is that it should not constrain the secondaries
> > > > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > > > -        * userspace requests are meant to be running in parallel). As
> > > > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > > > -        * check below as normal execution flows are checked extensively above.
> > > > > > -        *
> > > > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > > > -        * expected behaviour for userpace?
> > > > > > -        */
> > > > > > -
> > > > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > > > -
> > > > > > -       if (igt_spinner_init(&spin, gt))
> > > > > > -               return -ENOMEM;
> > > > > > -
> > > > > > -       err = 0;
> > > > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > > > -       for_each_engine(master, gt, id) {
> > > > > > -               struct i915_sw_fence fence = {};
> > > > > > -               struct intel_context *ce;
> > > > > > -
> > > > > > -               if (master->class == class)
> > > > > > -                       continue;
> > > > > > -
> > > > > > -               ce = intel_context_create(master);
> > > > > > -               if (IS_ERR(ce)) {
> > > > > > -                       err = PTR_ERR(ce);
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -
> > > > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > > > -
> > > > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > > > -               intel_context_put(ce);
> > > > > > -               if (IS_ERR(rq[0])) {
> > > > > > -                       err = PTR_ERR(rq[0]);
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -               i915_request_get(rq[0]);
> > > > > > -
> > > > > > -               if (flags & BOND_SCHEDULE) {
> > > > > > -                       onstack_fence_init(&fence);
> > > > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > > > -                                                              &fence,
> > > > > > -                                                              GFP_KERNEL);
> > > > > > -               }
> > > > > > -
> > > > > > -               i915_request_add(rq[0]);
> > > > > > -               if (err < 0)
> > > > > > -                       goto out;
> > > > > > -
> > > > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > > > -                       err = -EIO;
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -
> > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > -                       struct intel_context *ve;
> > > > > > -
> > > > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > > > -                       if (IS_ERR(ve)) {
> > > > > > -                               err = PTR_ERR(ve);
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > > > -                                                              master,
> > > > > > -                                                              siblings[n]);
> > > > > > -                       if (err) {
> > > > > > -                               intel_context_put(ve);
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       err = intel_context_pin(ve);
> > > > > > -                       intel_context_put(ve);
> > > > > > -                       if (err) {
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > > > -                       intel_context_unpin(ve);
> > > > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -                       i915_request_get(rq[n + 1]);
> > > > > > -
> > > > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > > > -                                                          &rq[0]->fence,
> > > > > > -                                                          ve->engine->bond_execute);
> > > > > > -                       i915_request_add(rq[n + 1]);
> > > > > > -                       if (err < 0) {
> > > > > > -                               onstack_fence_fini(&fence);
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -               }
> > > > > > -               onstack_fence_fini(&fence);
> > > > > > -               intel_engine_flush_submission(master);
> > > > > > -               igt_spinner_end(&spin);
> > > > > > -
> > > > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > > > -                              rq[0]->engine->name);
> > > > > > -                       err = -EIO;
> > > > > > -                       goto out;
> > > > > > -               }
> > > > > > -
> > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > > > -                               err = -EIO;
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -
> > > > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > > > -                                      siblings[n]->name,
> > > > > > -                                      rq[n + 1]->engine->name,
> > > > > > -                                      rq[0]->engine->name);
> > > > > > -                               err = -EINVAL;
> > > > > > -                               goto out;
> > > > > > -                       }
> > > > > > -               }
> > > > > > -
> > > > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > -                       i915_request_put(rq[n]);
> > > > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > > > -       }
> > > > > > -
> > > > > > -out:
> > > > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > -               i915_request_put(rq[n]);
> > > > > > -       if (igt_flush_test(gt->i915))
> > > > > > -               err = -EIO;
> > > > > > -
> > > > > > -       igt_spinner_fini(&spin);
> > > > > > -       return err;
> > > > > > -}
> > > > > > -
> > > > > > -static int live_virtual_bond(void *arg)
> > > > > > -{
> > > > > > -       static const struct phase {
> > > > > > -               const char *name;
> > > > > > -               unsigned int flags;
> > > > > > -       } phases[] = {
> > > > > > -               { "", 0 },
> > > > > > -               { "schedule", BOND_SCHEDULE },
> > > > > > -               { },
> > > > > > -       };
> > > > > > -       struct intel_gt *gt = arg;
> > > > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > > > -       unsigned int class;
> > > > > > -       int err;
> > > > > > -
> > > > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > > > -               return 0;
> > > > > > -
> > > > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > > > -               const struct phase *p;
> > > > > > -               int nsibling;
> > > > > > -
> > > > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > > > -               if (nsibling < 2)
> > > > > > -                       continue;
> > > > > > -
> > > > > > -               for (p = phases; p->name; p++) {
> > > > > > -                       err = bond_virtual_engine(gt,
> > > > > > -                                                 class, siblings, nsibling,
> > > > > > -                                                 p->flags);
> > > > > > -                       if (err) {
> > > > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > > > -                                      __func__, p->name, class, nsibling, err);
> > > > > > -                               return err;
> > > > > > -                       }
> > > > > > -               }
> > > > > > -       }
> > > > > > -
> > > > > > -       return 0;
> > > > > > -}
> > > > > > -
> > > > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > > > >                                 struct intel_engine_cs **siblings,
> > > > > >                                 unsigned int nsibling)
> > > > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > > > >                 SUBTEST(live_virtual_mask),
> > > > > >                 SUBTEST(live_virtual_preserved),
> > > > > >                 SUBTEST(live_virtual_slice),
> > > > > > -               SUBTEST(live_virtual_bond),
> > > > > >                 SUBTEST(live_virtual_reset),
> > > > > >         };
> > > > > >
> > > > > > --
> > > > > > 2.31.1
> > > > > >
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 17:55               ` Matthew Brost
@ 2021-04-28 18:17                 ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 18:17 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> > > > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > > > >
> > > > > > > This adds a bunch of complexity which the media driver has never
> > > > > > > actually used.  The media driver does technically bond a balanced engine
> > > > > > > to another engine but the balanced engine only has one engine in the
> > > > > > > sibling set.  This doesn't actually result in a virtual engine.
> > > > > > >
> > > > > > > Unless some userspace badly wants it, there's no good reason to support
> > > > > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > > > > leave the validation code in place in case we ever decide we want to do
> > > > > > > something interesting with the bonding information.
> > > > > > >
> > > > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > > > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > > > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > > > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > > > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > > > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > > > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > > >         }
> > > > > > >         virtual = set->engines->engines[idx]->engine;
> > > > > > >
> > > > > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > > > > +               drm_dbg(&i915->drm,
> > > > > > > +                       "Bonding with virtual engines not allowed\n");
> > > > > > > +               return -EINVAL;
> > > > > > > +       }
> > > > > > > +
> > > > > > >         err = check_user_mbz(&ext->flags);
> > > > > > >         if (err)
> > > > > > >                 return err;
> > > > > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > > >                                 n, ci.engine_class, ci.engine_instance);
> > > > > > >                         return -EINVAL;
> > > > > > >                 }
> > > > > > > -
> > > > > > > -               /*
> > > > > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > > > > -                * a submit fence will always be directed to the one engine.
> > > > > > > -                */
> > > > > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > > > > -                                                              master,
> > > > > > > -                                                              bond);
> > > > > > > -                       if (err)
> > > > > > > -                               return err;
> > > > > > > -               }
> > > > > > >         }
> > > > > > >
> > > > > > >         return 0;
> > > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > > > > >                         err = i915_request_await_execution(eb.request,
> > > > > > >                                                            in_fence,
> > > > > > > -                                                          eb.engine->bond_execute);
> > > > > > > +                                                          NULL);
> > > > > > >                 else
> > > > > > >                         err = i915_request_await_dma_fence(eb.request,
> > > > > > >                                                            in_fence);
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > > index 883bafc449024..68cfe5080325c 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > > > > >          */
> > > > > > >         void            (*submit_request)(struct i915_request *rq);
> > > > > > >
> > > > > > > -       /*
> > > > > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > > > > -        * request down to the bonded pairs.
> > > > > > > -        */
> > > > > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > > > > -                                       struct dma_fence *signal);
> > > > > > > -
> > > > > > >         /*
> > > > > > >          * Call when the priority on a request has changed and it and its
> > > > > > >          * dependencies may need rescheduling. Note the request itself may
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > > index de124870af44d..b6e2b59f133b7 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > > > > >                 int prio;
> > > > > > >         } nodes[I915_NUM_ENGINES];
> > > > > > >
> > > > > > > -       /*
> > > > > > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > > > > > -        * of physical engines any particular request may be submitted to.
> > > > > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > > > > -        * use one of sibling_mask physical engines.
> > > > > > > -        */
> > > > > > > -       struct ve_bond {
> > > > > > > -               const struct intel_engine_cs *master;
> > > > > > > -               intel_engine_mask_t sibling_mask;
> > > > > > > -       } *bonds;
> > > > > > > -       unsigned int num_bonds;
> > > > > > > -
> > > > > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > > > > >         unsigned int num_siblings;
> > > > > > >         struct intel_engine_cs *siblings[];
> > > > > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > > > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > > > > >         intel_engine_free_request_pool(&ve->base);
> > > > > > >
> > > > > > > -       kfree(ve->bonds);
> > > > > > >         kfree(ve);
> > > > > > >  }
> > > > > > >
> > > > > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > > > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > > > > >  }
> > > > > > >
> > > > > > > -static struct ve_bond *
> > > > > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > > > > -                 const struct intel_engine_cs *master)
> > > > > > > -{
> > > > > > > -       int i;
> > > > > > > -
> > > > > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > > > > -               if (ve->bonds[i].master == master)
> > > > > > > -                       return &ve->bonds[i];
> > > > > > > -       }
> > > > > > > -
> > > > > > > -       return NULL;
> > > > > > > -}
> > > > > > > -
> > > > > > > -static void
> > > > > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > > > > -{
> > > > > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > > > > -       intel_engine_mask_t allowed, exec;
> > > > > > > -       struct ve_bond *bond;
> > > > > > > -
> > > > > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > > > > -
> > > > > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > > > > -       if (bond)
> > > > > > > -               allowed &= bond->sibling_mask;
> > > > > > > -
> > > > > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > > > > -               ;
> > > > > > > -
> > > > > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > > > > -       to_request(signal)->execution_mask &= ~allowed;
> > > > > >
> > > > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > > > much code.  This function in particular, has to stay, unfortunately.
> > > > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > > > the work onto a different engine than the one it's supposed to
> > > > > > run in parallel with.  This means we can't dead-code this function or
> > > > > > the bond_execution function pointer and related stuff.
> > > > >
> > > > > Uh that's disappointing, since if I understand your point correctly, the
> > > > > sibling engines should all be singletons, not load balancing virtual ones.
> > > > > So there really should not be any need to pick the right one at execution
> > > > > time.
> > > >
> > > > The media driver itself seems to work fine if I delete all the code.
> > > > It's just an IGT testcase that blows up.  I'll do more digging to see
> > > > if I can better isolate why.
> > > >
> > >
> > > Jumping on here mid-thread. For what it's worth, to make execlists work
> > > with the upcoming parallel submission extension I leveraged some of the
> > > existing bonding code, so I wouldn't be too eager to delete this code
> > > until that lands.
> >
> > Mind being a bit more specific about that?  The motivation for this
> > patch is that the current bonding handling and uAPI is, well, very odd
> > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > Instead you create engines and then bond them together after the fact.
> > I didn't want to blindly duplicate those oddities with the proto-ctx
> > stuff unless they were useful.  With parallel submit, I would expect
> > we want a more explicit API where you specify a set of engine
> > class/instance pairs to bond together into a single engine similar to
> > how the current balancing API works.
> >
> > Of course, that's all focused on the API and not the internals.  But,
> > again, I'm not sure how we want things to look internally.  What we've
> > got now doesn't seem great for the GuC submission model but I'm very
> > much not the expert there.  I don't want to be working at cross
> > purposes to you and I'm happy to leave bits if you think they're
> > useful.  But I thought I was clearing things away so that you can put
> > in what you actually want for GuC/parallel submit.
> >
>
> Removing all the UAPI things is fine, but I wouldn't delete some of the
> internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> intel_context_ops, the hook for a submit fence, etc...) as that will
> still likely be used for the new parallel submission interface with
> execlists. As you say, the new UAPI won't allow crazy configurations,
> only simple ones.

I'm fine with leaving some of the internal bits for a little while if
it makes pulling the GuC scheduler in easier.  I'm just a bit
skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
thoughts?

--Jason

> Matt
>
> > --Jason
> >
> > > Matt
> > >
> > > > --Jason
> > > >
> > > > > At least my understanding is that we're only limiting the engine set
> > > > > further, so if both signaller and signalled request can only run on
> > > > > singletons (which must be distinct, or the bonded parameter validation is
> > > > > busted) there's really nothing to do here.
> > > > >
> > > > > Also this is the locking code that freaks me out about the current bonded
> > > > > execlist code ...
> > > > >
> > > > > Dazzled and confused.
> > > > > -Daniel
> > > > >
> > > > > >
> > > > > > --Jason
> > > > > >
> > > > > >
> > > > > > > -}
> > > > > > > -
> > > > > > >  struct intel_context *
> > > > > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > > >                                unsigned int count)
> > > > > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > > >
> > > > > > >         ve->base.schedule = i915_schedule;
> > > > > > >         ve->base.submit_request = virtual_submit_request;
> > > > > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > > > > >
> > > > > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > > > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > > > > >         if (IS_ERR(dst))
> > > > > > >                 return dst;
> > > > > > >
> > > > > > > -       if (se->num_bonds) {
> > > > > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > > > > -
> > > > > > > -               de->bonds = kmemdup(se->bonds,
> > > > > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > > > > -                                   GFP_KERNEL);
> > > > > > > -               if (!de->bonds) {
> > > > > > > -                       intel_context_put(dst);
> > > > > > > -                       return ERR_PTR(-ENOMEM);
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               de->num_bonds = se->num_bonds;
> > > > > > > -       }
> > > > > > > -
> > > > > > >         return dst;
> > > > > > >  }
> > > > > > >
> > > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > > -                                    const struct intel_engine_cs *sibling)
> > > > > > > -{
> > > > > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > > > > -       struct ve_bond *bond;
> > > > > > > -       int n;
> > > > > > > -
> > > > > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > > > > -               if (sibling == ve->siblings[n])
> > > > > > > -                       break;
> > > > > > > -       if (n == ve->num_siblings)
> > > > > > > -               return -EINVAL;
> > > > > > > -
> > > > > > > -       bond = virtual_find_bond(ve, master);
> > > > > > > -       if (bond) {
> > > > > > > -               bond->sibling_mask |= sibling->mask;
> > > > > > > -               return 0;
> > > > > > > -       }
> > > > > > > -
> > > > > > > -       bond = krealloc(ve->bonds,
> > > > > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > > > > -                       GFP_KERNEL);
> > > > > > > -       if (!bond)
> > > > > > > -               return -ENOMEM;
> > > > > > > -
> > > > > > > -       bond[ve->num_bonds].master = master;
> > > > > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > > > > -
> > > > > > > -       ve->bonds = bond;
> > > > > > > -       ve->num_bonds++;
> > > > > > > -
> > > > > > > -       return 0;
> > > > > > > -}
> > > > > > > -
> > > > > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > > > > >                                    struct drm_printer *m,
> > > > > > >                                    void (*show_request)(struct drm_printer *m,
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > > >  struct intel_context *
> > > > > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > > > > >
> > > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > > -                                    const struct intel_engine_cs *sibling);
> > > > > > > -
> > > > > > >  bool
> > > > > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > > > > >         return 0;
> > > > > > >  }
> > > > > > >
> > > > > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > > > > -                              unsigned int class,
> > > > > > > -                              struct intel_engine_cs **siblings,
> > > > > > > -                              unsigned int nsibling,
> > > > > > > -                              unsigned int flags)
> > > > > > > -#define BOND_SCHEDULE BIT(0)
> > > > > > > -{
> > > > > > > -       struct intel_engine_cs *master;
> > > > > > > -       struct i915_request *rq[16];
> > > > > > > -       enum intel_engine_id id;
> > > > > > > -       struct igt_spinner spin;
> > > > > > > -       unsigned long n;
> > > > > > > -       int err;
> > > > > > > -
> > > > > > > -       /*
> > > > > > > -        * A set of bonded requests is intended to be run concurrently
> > > > > > > -        * across a number of engines. We use one request per-engine
> > > > > > > -        * and a magic fence to schedule each of the bonded requests
> > > > > > > -        * at the same time. A consequence of our current scheduler is that
> > > > > > > -        * we only move requests to the HW ready queue when the request
> > > > > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > > > > -        * there is a delay on all secondary fences as the HW may be
> > > > > > > -        * currently busy. Equally, as all the requests are independent,
> > > > > > > -        * they may have other fences that delay individual request
> > > > > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > > > > -        * immediately submitted to HW at the same time, just that if the
> > > > > > > -        * rules are abided by, they are ready at the same time as the
> > > > > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > > > > -        * to ensure parallel execution of its phases as it requires.
> > > > > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > > > > -        * take care of parallel execution, even across preemption events on
> > > > > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > > > > -        *
> > > > > > > -        * With the submit-fence, we have identified three possible phases
> > > > > > > -        * of synchronisation depending on the master fence: queued (not
> > > > > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > > > > -        * and checked below. However, the signaled master fence handling is
> > > > > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > > > > -        * any information about the previous execution. It may even be freed
> > > > > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > > > > -        * as our expectation is that it should not constrain the secondaries
> > > > > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > > > > -        * userspace requests are meant to be running in parallel). As
> > > > > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > > > > -        * check below as normal execution flows are checked extensively above.
> > > > > > > -        *
> > > > > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > > > > -        * expected behaviour for userpace?
> > > > > > > -        */
> > > > > > > -
> > > > > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > > > > -
> > > > > > > -       if (igt_spinner_init(&spin, gt))
> > > > > > > -               return -ENOMEM;
> > > > > > > -
> > > > > > > -       err = 0;
> > > > > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > > > > -       for_each_engine(master, gt, id) {
> > > > > > > -               struct i915_sw_fence fence = {};
> > > > > > > -               struct intel_context *ce;
> > > > > > > -
> > > > > > > -               if (master->class == class)
> > > > > > > -                       continue;
> > > > > > > -
> > > > > > > -               ce = intel_context_create(master);
> > > > > > > -               if (IS_ERR(ce)) {
> > > > > > > -                       err = PTR_ERR(ce);
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > > > > -
> > > > > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > > > > -               intel_context_put(ce);
> > > > > > > -               if (IS_ERR(rq[0])) {
> > > > > > > -                       err = PTR_ERR(rq[0]);
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -               i915_request_get(rq[0]);
> > > > > > > -
> > > > > > > -               if (flags & BOND_SCHEDULE) {
> > > > > > > -                       onstack_fence_init(&fence);
> > > > > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > > > > -                                                              &fence,
> > > > > > > -                                                              GFP_KERNEL);
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               i915_request_add(rq[0]);
> > > > > > > -               if (err < 0)
> > > > > > > -                       goto out;
> > > > > > > -
> > > > > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > > > > -                       err = -EIO;
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > > -                       struct intel_context *ve;
> > > > > > > -
> > > > > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > > > > -                       if (IS_ERR(ve)) {
> > > > > > > -                               err = PTR_ERR(ve);
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > > > > -                                                              master,
> > > > > > > -                                                              siblings[n]);
> > > > > > > -                       if (err) {
> > > > > > > -                               intel_context_put(ve);
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       err = intel_context_pin(ve);
> > > > > > > -                       intel_context_put(ve);
> > > > > > > -                       if (err) {
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > > > > -                       intel_context_unpin(ve);
> > > > > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -                       i915_request_get(rq[n + 1]);
> > > > > > > -
> > > > > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > > > > -                                                          &rq[0]->fence,
> > > > > > > -                                                          ve->engine->bond_execute);
> > > > > > > -                       i915_request_add(rq[n + 1]);
> > > > > > > -                       if (err < 0) {
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -               }
> > > > > > > -               onstack_fence_fini(&fence);
> > > > > > > -               intel_engine_flush_submission(master);
> > > > > > > -               igt_spinner_end(&spin);
> > > > > > > -
> > > > > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > > > > -                              rq[0]->engine->name);
> > > > > > > -                       err = -EIO;
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > > > > -                               err = -EIO;
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > > > > -                                      siblings[n]->name,
> > > > > > > -                                      rq[n + 1]->engine->name,
> > > > > > > -                                      rq[0]->engine->name);
> > > > > > > -                               err = -EINVAL;
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > > -                       i915_request_put(rq[n]);
> > > > > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > > > > -       }
> > > > > > > -
> > > > > > > -out:
> > > > > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > > -               i915_request_put(rq[n]);
> > > > > > > -       if (igt_flush_test(gt->i915))
> > > > > > > -               err = -EIO;
> > > > > > > -
> > > > > > > -       igt_spinner_fini(&spin);
> > > > > > > -       return err;
> > > > > > > -}
> > > > > > > -
> > > > > > > -static int live_virtual_bond(void *arg)
> > > > > > > -{
> > > > > > > -       static const struct phase {
> > > > > > > -               const char *name;
> > > > > > > -               unsigned int flags;
> > > > > > > -       } phases[] = {
> > > > > > > -               { "", 0 },
> > > > > > > -               { "schedule", BOND_SCHEDULE },
> > > > > > > -               { },
> > > > > > > -       };
> > > > > > > -       struct intel_gt *gt = arg;
> > > > > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > > > > -       unsigned int class;
> > > > > > > -       int err;
> > > > > > > -
> > > > > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > > > > -               return 0;
> > > > > > > -
> > > > > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > > > > -               const struct phase *p;
> > > > > > > -               int nsibling;
> > > > > > > -
> > > > > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > > > > -               if (nsibling < 2)
> > > > > > > -                       continue;
> > > > > > > -
> > > > > > > -               for (p = phases; p->name; p++) {
> > > > > > > -                       err = bond_virtual_engine(gt,
> > > > > > > -                                                 class, siblings, nsibling,
> > > > > > > -                                                 p->flags);
> > > > > > > -                       if (err) {
> > > > > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > > > > -                                      __func__, p->name, class, nsibling, err);
> > > > > > > -                               return err;
> > > > > > > -                       }
> > > > > > > -               }
> > > > > > > -       }
> > > > > > > -
> > > > > > > -       return 0;
> > > > > > > -}
> > > > > > > -
> > > > > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > > > > >                                 struct intel_engine_cs **siblings,
> > > > > > >                                 unsigned int nsibling)
> > > > > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > > > > >                 SUBTEST(live_virtual_mask),
> > > > > > >                 SUBTEST(live_virtual_preserved),
> > > > > > >                 SUBTEST(live_virtual_slice),
> > > > > > > -               SUBTEST(live_virtual_bond),
> > > > > > >                 SUBTEST(live_virtual_reset),
> > > > > > >         };
> > > > > > >
> > > > > > > --
> > > > > > > 2.31.1
> > > > > > >
> > > > > > _______________________________________________
> > > > > > dri-devel mailing list
> > > > > > dri-devel@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-28 18:17                 ` Jason Ekstrand
  0 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 18:17 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
>
> On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 12:18:29PM -0500, Jason Ekstrand wrote:
> > > > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > > > On Fri, Apr 23, 2021 at 5:31 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > > > >
> > > > > > > This adds a bunch of complexity which the media driver has never
> > > > > > > actually used.  The media driver does technically bond a balanced engine
> > > > > > > to another engine but the balanced engine only has one engine in the
> > > > > > > sibling set.  This doesn't actually result in a virtual engine.
> > > > > > >
> > > > > > > Unless some userspace badly wants it, there's no good reason to support
> > > > > > > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > > > > > > leave the validation code in place in case we ever decide we want to do
> > > > > > > something interesting with the bonding information.
> > > > > > >
> > > > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> > > > > > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> > > > > > >  drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> > > > > > >  .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> > > > > > >  .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> > > > > > >  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> > > > > > >  6 files changed, 7 insertions(+), 353 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > > index e8179918fa306..5f8d0faf783aa 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > > > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > > >         }
> > > > > > >         virtual = set->engines->engines[idx]->engine;
> > > > > > >
> > > > > > > +       if (intel_engine_is_virtual(virtual)) {
> > > > > > > +               drm_dbg(&i915->drm,
> > > > > > > +                       "Bonding with virtual engines not allowed\n");
> > > > > > > +               return -EINVAL;
> > > > > > > +       }
> > > > > > > +
> > > > > > >         err = check_user_mbz(&ext->flags);
> > > > > > >         if (err)
> > > > > > >                 return err;
> > > > > > > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> > > > > > >                                 n, ci.engine_class, ci.engine_instance);
> > > > > > >                         return -EINVAL;
> > > > > > >                 }
> > > > > > > -
> > > > > > > -               /*
> > > > > > > -                * A non-virtual engine has no siblings to choose between; and
> > > > > > > -                * a submit fence will always be directed to the one engine.
> > > > > > > -                */
> > > > > > > -               if (intel_engine_is_virtual(virtual)) {
> > > > > > > -                       err = intel_virtual_engine_attach_bond(virtual,
> > > > > > > -                                                              master,
> > > > > > > -                                                              bond);
> > > > > > > -                       if (err)
> > > > > > > -                               return err;
> > > > > > > -               }
> > > > > > >         }
> > > > > > >
> > > > > > >         return 0;
> > > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > > index d640bba6ad9ab..efb2fa3522a42 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > > > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > > > >                 if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > > > > >                         err = i915_request_await_execution(eb.request,
> > > > > > >                                                            in_fence,
> > > > > > > -                                                          eb.engine->bond_execute);
> > > > > > > +                                                          NULL);
> > > > > > >                 else
> > > > > > >                         err = i915_request_await_dma_fence(eb.request,
> > > > > > >                                                            in_fence);
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > > index 883bafc449024..68cfe5080325c 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > > > > > > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> > > > > > >          */
> > > > > > >         void            (*submit_request)(struct i915_request *rq);
> > > > > > >
> > > > > > > -       /*
> > > > > > > -        * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > > > > > > -        * request down to the bonded pairs.
> > > > > > > -        */
> > > > > > > -       void            (*bond_execute)(struct i915_request *rq,
> > > > > > > -                                       struct dma_fence *signal);
> > > > > > > -
> > > > > > >         /*
> > > > > > >          * Call when the priority on a request has changed and it and its
> > > > > > >          * dependencies may need rescheduling. Note the request itself may
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > > index de124870af44d..b6e2b59f133b7 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > > > > @@ -181,18 +181,6 @@ struct virtual_engine {
> > > > > > >                 int prio;
> > > > > > >         } nodes[I915_NUM_ENGINES];
> > > > > > >
> > > > > > > -       /*
> > > > > > > -        * Keep track of bonded pairs -- restrictions upon on our selection
> > > > > > > -        * of physical engines any particular request may be submitted to.
> > > > > > > -        * If we receive a submit-fence from a master engine, we will only
> > > > > > > -        * use one of sibling_mask physical engines.
> > > > > > > -        */
> > > > > > > -       struct ve_bond {
> > > > > > > -               const struct intel_engine_cs *master;
> > > > > > > -               intel_engine_mask_t sibling_mask;
> > > > > > > -       } *bonds;
> > > > > > > -       unsigned int num_bonds;
> > > > > > > -
> > > > > > >         /* And finally, which physical engines this virtual engine maps onto. */
> > > > > > >         unsigned int num_siblings;
> > > > > > >         struct intel_engine_cs *siblings[];
> > > > > > > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> > > > > > >         intel_breadcrumbs_free(ve->base.breadcrumbs);
> > > > > > >         intel_engine_free_request_pool(&ve->base);
> > > > > > >
> > > > > > > -       kfree(ve->bonds);
> > > > > > >         kfree(ve);
> > > > > > >  }
> > > > > > >
> > > > > > > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> > > > > > >         spin_unlock_irqrestore(&ve->base.active.lock, flags);
> > > > > > >  }
> > > > > > >
> > > > > > > -static struct ve_bond *
> > > > > > > -virtual_find_bond(struct virtual_engine *ve,
> > > > > > > -                 const struct intel_engine_cs *master)
> > > > > > > -{
> > > > > > > -       int i;
> > > > > > > -
> > > > > > > -       for (i = 0; i < ve->num_bonds; i++) {
> > > > > > > -               if (ve->bonds[i].master == master)
> > > > > > > -                       return &ve->bonds[i];
> > > > > > > -       }
> > > > > > > -
> > > > > > > -       return NULL;
> > > > > > > -}
> > > > > > > -
> > > > > > > -static void
> > > > > > > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > > > > > > -{
> > > > > > > -       struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > > > > > > -       intel_engine_mask_t allowed, exec;
> > > > > > > -       struct ve_bond *bond;
> > > > > > > -
> > > > > > > -       allowed = ~to_request(signal)->engine->mask;
> > > > > > > -
> > > > > > > -       bond = virtual_find_bond(ve, to_request(signal)->engine);
> > > > > > > -       if (bond)
> > > > > > > -               allowed &= bond->sibling_mask;
> > > > > > > -
> > > > > > > -       /* Restrict the bonded request to run on only the available engines */
> > > > > > > -       exec = READ_ONCE(rq->execution_mask);
> > > > > > > -       while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > > > > > > -               ;
> > > > > > > -
> > > > > > > -       /* Prevent the master from being re-run on the bonded engines */
> > > > > > > -       to_request(signal)->execution_mask &= ~allowed;
> > > > > >
> > > > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > > > much code.  This function in particular, has to stay, unfortunately.
> > > > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > > > the work onto a different engine than the one it's supposed to
> > > > > > run in parallel with.  This means we can't dead-code this function or
> > > > > > the bond_execution function pointer and related stuff.
> > > > >
> > > > > Uh that's disappointing, since if I understand your point correctly, the
> > > > > sibling engines should all be singletons, not load balancing virtual ones.
> > > > > So there really should not be any need to pick the right one at execution
> > > > > time.
> > > >
> > > > The media driver itself seems to work fine if I delete all the code.
> > > > It's just an IGT testcase that blows up.  I'll do more digging to see
> > > > if I can better isolate why.
> > > >
> > >
> > > Jumping on here mid-thread. For what it is worth, to make execlists work
> > > with the upcoming parallel submission extension I leveraged some of the
> > > existing bonding code so I wouldn't be too eager to delete this code
> > > until that lands.
> >
> > Mind being a bit more specific about that?  The motivation for this
> > patch is that the current bonding handling and uAPI is, well, very odd
> > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > Instead you create engines and then bond them together after the fact.
> > I didn't want to blindly duplicate those oddities with the proto-ctx
> > stuff unless they were useful.  With parallel submit, I would expect
> > we want a more explicit API where you specify a set of engine
> > class/instance pairs to bond together into a single engine similar to
> > how the current balancing API works.
> >
> > Of course, that's all focused on the API and not the internals.  But,
> > again, I'm not sure how we want things to look internally.  What we've
> > got now doesn't seem great for the GuC submission model but I'm very
> > much not the expert there.  I don't want to be working at cross
> > purposes to you and I'm happy to leave bits if you think they're
> > useful.  But I thought I was clearing things away so that you can put
> > in what you actually want for GuC/parallel submit.
> >
>
> Removing all the UAPI things is fine, but I wouldn't delete some of the
> internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> intel_context_ops, the hook for a submit fence, etc...) as that will
> still likely be used for the new parallel submission interface with
> execlists. As you say, the new UAPI won't allow crazy configurations,
> only simple ones.

I'm fine with leaving some of the internal bits for a little while if
it makes pulling the GuC scheduler in easier.  I'm just a bit
skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
thoughts?

--Jason

> Matt
>
> > --Jason
> >
> > > Matt
> > >
> > > > --Jason
> > > >
> > > > > At least my understanding is that we're only limiting the engine set
> > > > > further, so if both signaller and signalled request can only run on
> > > > > singletons (which must be distinct, or the bonded parameter validation is
> > > > > busted) there's really nothing to do here.
> > > > >
> > > > > Also this is the locking code that freaks me out about the current bonded
> > > > > execlist code ...
> > > > >
> > > > > Dazzled and confused.
> > > > > -Daniel
> > > > >
> > > > > >
> > > > > > --Jason
> > > > > >
> > > > > >
> > > > > > > -}
> > > > > > > -
> > > > > > >  struct intel_context *
> > > > > > >  intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > > >                                unsigned int count)
> > > > > > > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > > >
> > > > > > >         ve->base.schedule = i915_schedule;
> > > > > > >         ve->base.submit_request = virtual_submit_request;
> > > > > > > -       ve->base.bond_execute = virtual_bond_execute;
> > > > > > >
> > > > > > >         INIT_LIST_HEAD(virtual_queue(ve));
> > > > > > >         ve->base.execlists.queue_priority_hint = INT_MIN;
> > > > > > > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> > > > > > >         if (IS_ERR(dst))
> > > > > > >                 return dst;
> > > > > > >
> > > > > > > -       if (se->num_bonds) {
> > > > > > > -               struct virtual_engine *de = to_virtual_engine(dst->engine);
> > > > > > > -
> > > > > > > -               de->bonds = kmemdup(se->bonds,
> > > > > > > -                                   sizeof(*se->bonds) * se->num_bonds,
> > > > > > > -                                   GFP_KERNEL);
> > > > > > > -               if (!de->bonds) {
> > > > > > > -                       intel_context_put(dst);
> > > > > > > -                       return ERR_PTR(-ENOMEM);
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               de->num_bonds = se->num_bonds;
> > > > > > > -       }
> > > > > > > -
> > > > > > >         return dst;
> > > > > > >  }
> > > > > > >
> > > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > > -                                    const struct intel_engine_cs *sibling)
> > > > > > > -{
> > > > > > > -       struct virtual_engine *ve = to_virtual_engine(engine);
> > > > > > > -       struct ve_bond *bond;
> > > > > > > -       int n;
> > > > > > > -
> > > > > > > -       /* Sanity check the sibling is part of the virtual engine */
> > > > > > > -       for (n = 0; n < ve->num_siblings; n++)
> > > > > > > -               if (sibling == ve->siblings[n])
> > > > > > > -                       break;
> > > > > > > -       if (n == ve->num_siblings)
> > > > > > > -               return -EINVAL;
> > > > > > > -
> > > > > > > -       bond = virtual_find_bond(ve, master);
> > > > > > > -       if (bond) {
> > > > > > > -               bond->sibling_mask |= sibling->mask;
> > > > > > > -               return 0;
> > > > > > > -       }
> > > > > > > -
> > > > > > > -       bond = krealloc(ve->bonds,
> > > > > > > -                       sizeof(*bond) * (ve->num_bonds + 1),
> > > > > > > -                       GFP_KERNEL);
> > > > > > > -       if (!bond)
> > > > > > > -               return -ENOMEM;
> > > > > > > -
> > > > > > > -       bond[ve->num_bonds].master = master;
> > > > > > > -       bond[ve->num_bonds].sibling_mask = sibling->mask;
> > > > > > > -
> > > > > > > -       ve->bonds = bond;
> > > > > > > -       ve->num_bonds++;
> > > > > > > -
> > > > > > > -       return 0;
> > > > > > > -}
> > > > > > > -
> > > > > > >  void intel_execlists_show_requests(struct intel_engine_cs *engine,
> > > > > > >                                    struct drm_printer *m,
> > > > > > >                                    void (*show_request)(struct drm_printer *m,
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > > index fd61dae820e9e..80cec37a56ba9 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > > > > > > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> > > > > > >  struct intel_context *
> > > > > > >  intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > > > > > >
> > > > > > > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > > > > > > -                                    const struct intel_engine_cs *master,
> > > > > > > -                                    const struct intel_engine_cs *sibling);
> > > > > > > -
> > > > > > >  bool
> > > > > > >  intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > > index 1081cd36a2bd3..f03446d587160 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > > > > > > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> > > > > > >         return 0;
> > > > > > >  }
> > > > > > >
> > > > > > > -static int bond_virtual_engine(struct intel_gt *gt,
> > > > > > > -                              unsigned int class,
> > > > > > > -                              struct intel_engine_cs **siblings,
> > > > > > > -                              unsigned int nsibling,
> > > > > > > -                              unsigned int flags)
> > > > > > > -#define BOND_SCHEDULE BIT(0)
> > > > > > > -{
> > > > > > > -       struct intel_engine_cs *master;
> > > > > > > -       struct i915_request *rq[16];
> > > > > > > -       enum intel_engine_id id;
> > > > > > > -       struct igt_spinner spin;
> > > > > > > -       unsigned long n;
> > > > > > > -       int err;
> > > > > > > -
> > > > > > > -       /*
> > > > > > > -        * A set of bonded requests is intended to be run concurrently
> > > > > > > -        * across a number of engines. We use one request per-engine
> > > > > > > -        * and a magic fence to schedule each of the bonded requests
> > > > > > > -        * at the same time. A consequence of our current scheduler is that
> > > > > > > -        * we only move requests to the HW ready queue when the request
> > > > > > > -        * becomes ready, that is when all of its prerequisite fences have
> > > > > > > -        * been signaled. As one of those fences is the master submit fence,
> > > > > > > -        * there is a delay on all secondary fences as the HW may be
> > > > > > > -        * currently busy. Equally, as all the requests are independent,
> > > > > > > -        * they may have other fences that delay individual request
> > > > > > > -        * submission to HW. Ergo, we do not guarantee that all requests are
> > > > > > > -        * immediately submitted to HW at the same time, just that if the
> > > > > > > -        * rules are abided by, they are ready at the same time as the
> > > > > > > -        * first is submitted. Userspace can embed semaphores in its batch
> > > > > > > -        * to ensure parallel execution of its phases as it requires.
> > > > > > > -        * Though naturally it gets requested that perhaps the scheduler should
> > > > > > > -        * take care of parallel execution, even across preemption events on
> > > > > > > -        * different HW. (The proper answer is of course "lalalala".)
> > > > > > > -        *
> > > > > > > -        * With the submit-fence, we have identified three possible phases
> > > > > > > -        * of synchronisation depending on the master fence: queued (not
> > > > > > > -        * ready), executing, and signaled. The first two are quite simple
> > > > > > > -        * and checked below. However, the signaled master fence handling is
> > > > > > > -        * contentious. Currently we do not distinguish between a signaled
> > > > > > > -        * fence and an expired fence, as once signaled it does not convey
> > > > > > > -        * any information about the previous execution. It may even be freed
> > > > > > > -        * and hence checking later it may not exist at all. Ergo we currently
> > > > > > > -        * do not apply the bonding constraint for an already signaled fence,
> > > > > > > -        * as our expectation is that it should not constrain the secondaries
> > > > > > > -        * and is outside of the scope of the bonded request API (i.e. all
> > > > > > > -        * userspace requests are meant to be running in parallel). As
> > > > > > > -        * it imposes no constraint, and is effectively a no-op, we do not
> > > > > > > -        * check below as normal execution flows are checked extensively above.
> > > > > > > -        *
> > > > > > > -        * XXX Is the degenerate handling of signaled submit fences the
> > > > > > > -        * expected behaviour for userpace?
> > > > > > > -        */
> > > > > > > -
> > > > > > > -       GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > > > > > > -
> > > > > > > -       if (igt_spinner_init(&spin, gt))
> > > > > > > -               return -ENOMEM;
> > > > > > > -
> > > > > > > -       err = 0;
> > > > > > > -       rq[0] = ERR_PTR(-ENOMEM);
> > > > > > > -       for_each_engine(master, gt, id) {
> > > > > > > -               struct i915_sw_fence fence = {};
> > > > > > > -               struct intel_context *ce;
> > > > > > > -
> > > > > > > -               if (master->class == class)
> > > > > > > -                       continue;
> > > > > > > -
> > > > > > > -               ce = intel_context_create(master);
> > > > > > > -               if (IS_ERR(ce)) {
> > > > > > > -                       err = PTR_ERR(ce);
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > > > > > > -
> > > > > > > -               rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > > > > > > -               intel_context_put(ce);
> > > > > > > -               if (IS_ERR(rq[0])) {
> > > > > > > -                       err = PTR_ERR(rq[0]);
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -               i915_request_get(rq[0]);
> > > > > > > -
> > > > > > > -               if (flags & BOND_SCHEDULE) {
> > > > > > > -                       onstack_fence_init(&fence);
> > > > > > > -                       err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > > > > > > -                                                              &fence,
> > > > > > > -                                                              GFP_KERNEL);
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               i915_request_add(rq[0]);
> > > > > > > -               if (err < 0)
> > > > > > > -                       goto out;
> > > > > > > -
> > > > > > > -               if (!(flags & BOND_SCHEDULE) &&
> > > > > > > -                   !igt_wait_for_spinner(&spin, rq[0])) {
> > > > > > > -                       err = -EIO;
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > > -                       struct intel_context *ve;
> > > > > > > -
> > > > > > > -                       ve = intel_execlists_create_virtual(siblings, nsibling);
> > > > > > > -                       if (IS_ERR(ve)) {
> > > > > > > -                               err = PTR_ERR(ve);
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       err = intel_virtual_engine_attach_bond(ve->engine,
> > > > > > > -                                                              master,
> > > > > > > -                                                              siblings[n]);
> > > > > > > -                       if (err) {
> > > > > > > -                               intel_context_put(ve);
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       err = intel_context_pin(ve);
> > > > > > > -                       intel_context_put(ve);
> > > > > > > -                       if (err) {
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       rq[n + 1] = i915_request_create(ve);
> > > > > > > -                       intel_context_unpin(ve);
> > > > > > > -                       if (IS_ERR(rq[n + 1])) {
> > > > > > > -                               err = PTR_ERR(rq[n + 1]);
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -                       i915_request_get(rq[n + 1]);
> > > > > > > -
> > > > > > > -                       err = i915_request_await_execution(rq[n + 1],
> > > > > > > -                                                          &rq[0]->fence,
> > > > > > > -                                                          ve->engine->bond_execute);
> > > > > > > -                       i915_request_add(rq[n + 1]);
> > > > > > > -                       if (err < 0) {
> > > > > > > -                               onstack_fence_fini(&fence);
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -               }
> > > > > > > -               onstack_fence_fini(&fence);
> > > > > > > -               intel_engine_flush_submission(master);
> > > > > > > -               igt_spinner_end(&spin);
> > > > > > > -
> > > > > > > -               if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > > > > > > -                       pr_err("Master request did not execute (on %s)!\n",
> > > > > > > -                              rq[0]->engine->name);
> > > > > > > -                       err = -EIO;
> > > > > > > -                       goto out;
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               for (n = 0; n < nsibling; n++) {
> > > > > > > -                       if (i915_request_wait(rq[n + 1], 0,
> > > > > > > -                                             MAX_SCHEDULE_TIMEOUT) < 0) {
> > > > > > > -                               err = -EIO;
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -
> > > > > > > -                       if (rq[n + 1]->engine != siblings[n]) {
> > > > > > > -                               pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > > > > > > -                                      siblings[n]->name,
> > > > > > > -                                      rq[n + 1]->engine->name,
> > > > > > > -                                      rq[0]->engine->name);
> > > > > > > -                               err = -EINVAL;
> > > > > > > -                               goto out;
> > > > > > > -                       }
> > > > > > > -               }
> > > > > > > -
> > > > > > > -               for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > > -                       i915_request_put(rq[n]);
> > > > > > > -               rq[0] = ERR_PTR(-ENOMEM);
> > > > > > > -       }
> > > > > > > -
> > > > > > > -out:
> > > > > > > -       for (n = 0; !IS_ERR(rq[n]); n++)
> > > > > > > -               i915_request_put(rq[n]);
> > > > > > > -       if (igt_flush_test(gt->i915))
> > > > > > > -               err = -EIO;
> > > > > > > -
> > > > > > > -       igt_spinner_fini(&spin);
> > > > > > > -       return err;
> > > > > > > -}
> > > > > > > -
> > > > > > > -static int live_virtual_bond(void *arg)
> > > > > > > -{
> > > > > > > -       static const struct phase {
> > > > > > > -               const char *name;
> > > > > > > -               unsigned int flags;
> > > > > > > -       } phases[] = {
> > > > > > > -               { "", 0 },
> > > > > > > -               { "schedule", BOND_SCHEDULE },
> > > > > > > -               { },
> > > > > > > -       };
> > > > > > > -       struct intel_gt *gt = arg;
> > > > > > > -       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > > > > > > -       unsigned int class;
> > > > > > > -       int err;
> > > > > > > -
> > > > > > > -       if (intel_uc_uses_guc_submission(&gt->uc))
> > > > > > > -               return 0;
> > > > > > > -
> > > > > > > -       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > > > > > > -               const struct phase *p;
> > > > > > > -               int nsibling;
> > > > > > > -
> > > > > > > -               nsibling = select_siblings(gt, class, siblings);
> > > > > > > -               if (nsibling < 2)
> > > > > > > -                       continue;
> > > > > > > -
> > > > > > > -               for (p = phases; p->name; p++) {
> > > > > > > -                       err = bond_virtual_engine(gt,
> > > > > > > -                                                 class, siblings, nsibling,
> > > > > > > -                                                 p->flags);
> > > > > > > -                       if (err) {
> > > > > > > -                               pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > > > > > > -                                      __func__, p->name, class, nsibling, err);
> > > > > > > -                               return err;
> > > > > > > -                       }
> > > > > > > -               }
> > > > > > > -       }
> > > > > > > -
> > > > > > > -       return 0;
> > > > > > > -}
> > > > > > > -
> > > > > > >  static int reset_virtual_engine(struct intel_gt *gt,
> > > > > > >                                 struct intel_engine_cs **siblings,
> > > > > > >                                 unsigned int nsibling)
> > > > > > > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> > > > > > >                 SUBTEST(live_virtual_mask),
> > > > > > >                 SUBTEST(live_virtual_preserved),
> > > > > > >                 SUBTEST(live_virtual_slice),
> > > > > > > -               SUBTEST(live_virtual_bond),
> > > > > > >                 SUBTEST(live_virtual_reset),
> > > > > > >         };
> > > > > > >
> > > > > > > --
> > > > > > > 2.31.1
> > > > > > >
> > > > > > _______________________________________________
> > > > > > dri-devel mailing list
> > > > > > dri-devel@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 11/21] drm/i915: Stop manually RCU banging in reset_stats_ioctl
  2021-04-28 10:27     ` [Intel-gfx] " Daniel Vetter
@ 2021-04-28 18:22       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 18:22 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Mailing list - DRI developers

On Wed, Apr 28, 2021 at 5:27 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Apr 23, 2021 at 05:31:21PM -0500, Jason Ekstrand wrote:
> > As far as I can tell, the only real reason for this is to avoid taking a
> > reference to the i915_gem_context.  The cost of those two atomics
> > probably pales in comparison to the cost of the ioctl itself so we're
> > really not buying ourselves anything here.  We're about to make context
> > lookup a tiny bit more complicated, so let's get rid of the one hand-
> > rolled case.
>
> I think the historical reason here is that i965_brw checks this before
> every execbuf call, at least for arb_robustness contexts with the right
> flag. But we've fixed that hotpath problem by adding non-recoverable
> contexts. The kernel will tell you now automatically, for proper userspace
> at least (I checked iris and anv, assuming I got it correct), and
> reset_stats ioctl isn't a hot path worth micro-optimizing anymore.

I'm not sure I agree with that bit.  I don't think it was ever worth
micro-optimizing like this.  What does it gain us?  Two fewer atomics?
It's not like the bad old days when it took a lock.

ANV still calls reset_stats before every set of execbuf calls (sometimes
more than one) but I've never once seen it show up in a perf trace.
execbuf, on the other hand, does show up, and pretty heavily
sometimes.

> With that bit of more context added to the commit message:

I'd like to agree on what to add before adding something.

--Jason

> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++---------
> >  drivers/gpu/drm/i915/i915_drv.h             |  8 +-------
> >  2 files changed, 5 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index ecb3bf5369857..941fbf78267b4 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
> >       struct drm_i915_private *i915 = to_i915(dev);
> >       struct drm_i915_reset_stats *args = data;
> >       struct i915_gem_context *ctx;
> > -     int ret;
> >
> >       if (args->flags || args->pad)
> >               return -EINVAL;
> >
> > -     ret = -ENOENT;
> > -     rcu_read_lock();
> > -     ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id);
> > +     ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
> >       if (!ctx)
> > -             goto out;
> > +             return -ENOENT;
> >
> >       /*
> >        * We opt for unserialised reads here. This may result in tearing
> > @@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
> >       args->batch_active = atomic_read(&ctx->guilty_count);
> >       args->batch_pending = atomic_read(&ctx->active_count);
> >
> > -     ret = 0;
> > -out:
> > -     rcu_read_unlock();
> > -     return ret;
> > +     i915_gem_context_put(ctx);
> > +     return 0;
> >  }
> >
> >  /* GEM context-engines iterator: for_each_gem_engine() */
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 0b44333eb7033..8571c5c1509a7 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> >
> >  struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
> >
> > -static inline struct i915_gem_context *
> > -__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id)
> > -{
> > -     return xa_load(&file_priv->context_xa, id);
> > -}
> > -
> >  static inline struct i915_gem_context *
> >  i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> >  {
> >       struct i915_gem_context *ctx;
> >
> >       rcu_read_lock();
> > -     ctx = __i915_gem_context_lookup_rcu(file_priv, id);
> > +     ctx = xa_load(&file_priv->context_xa, id);
> >       if (ctx && !kref_get_unless_zero(&ctx->ref))
> >               ctx = NULL;
> >       rcu_read_unlock();
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 17:18         ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-28 18:58           ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-28 18:58 UTC (permalink / raw)
  To: Daniel Vetter, Matthew Brost; +Cc: Intel GFX, Mailing list - DRI developers

On Wed, Apr 28, 2021 at 12:18 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > much code.  This function in particular has to stay, unfortunately.
> > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > the work onto a different engine than the one it's supposed to
> > > run in parallel with.  This means we can't dead-code this function or
> > > the bond_execution function pointer and related stuff.
> >
> > Uh that's disappointing, since if I understand your point correctly, the
> > sibling engines should all be singletons, not load balancing virtual ones.
> > So there really should not be any need to pick the right one at execution
> > time.
>
> The media driver itself seems to work fine if I delete all the code.
> It's just an IGT testcase that blows up.  I'll do more digging to see
> if I can better isolate why.

I did more digging and I figured out why this test hangs.  The test
looks for an engine class with multiple engines (currently only vcs)
and creates a context where engine[0] is all of the engines of that
class bonded together and engine[1-N] is each of
those engines individually.  It then tests that you can submit a batch
to one of the individual engines and then submit with
EXEC_FENCE_SUBMIT to the balanced engine and the kernel will sort it
out.  This doesn't seem like a use-case we care about.

If we cared about anything, I would expect it to be submitting to two
balanced contexts and expecting "pick any two" behavior.  But that's
not what the test is testing for.

--Jason

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-28 17:09             ` Jason Ekstrand
@ 2021-04-29  8:01               ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-29  8:01 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Mailing list - DRI developers


On 28/04/2021 18:09, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 9:26 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> On 28/04/2021 15:02, Daniel Vetter wrote:
>>> On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 28/04/2021 11:16, Daniel Vetter wrote:
>>>>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
>>>>>> There's no sense in allowing userspace to create more engines than it
>>>>>> can possibly access via execbuf.
>>>>>>
>>>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
>>>>>>     1 file changed, 3 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>>> index 5f8d0faf783aa..ecb3bf5369857 100644
>>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
>>>>>>                     return -EINVAL;
>>>>>>             }
>>>>>> -  /*
>>>>>> -   * Note that I915_EXEC_RING_MASK limits execbuf to only using the
>>>>>> -   * first 64 engines defined here.
>>>>>> -   */
>>>>>>             num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
>>>>>
>>>>> Maybe add a comment like /* RING_MASK has no shift, so it can be used
>>>>> directly here */ since I had to check that :-)
>>>>>
>>>>> Same story about igt testcases needed, just to be sure.
>>>>>
>>>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>
>>>> I am not sure about the churn vs benefit ratio here. There are also patches
>>>> which extend the engine selection field in execbuf2 over the unused
>>>> constants bits (with an explicit flag). So churn upstream and churn in
>>>> internal (if interesting) for not much benefit.
>>>
>>> This isn't churn.
>>>
>>> This is "lock done uapi properly".
> 
> Pretty much.

Still haven't heard what concrete problems it solves.

>> IMO it is a "meh" patch. Doesn't fix any problems and will create work
>> for other people and man hours spent which no one will ever properly
>> account against.
>>
>> Number of contexts in the engine map should not really be tied to
>> execbuf2. As is demonstrated by the incoming work to address more than
>> 63 engines, either as an extension to execbuf2 or future execbuf3.
> 
> Which userspace driver has requested more than 64 engines in a single context?

No need to artificially limit hardware capabilities in the uapi by 
implementing a policy in the kernel, a policy which will need to be 
removed or changed shortly anyway. This particular patch is work and 
creates more work (which the people who get to fix the fallout will 
spend man hours on, figuring out what broke and why) for no benefit. 
And you have yet to explain what the benefit is in concrete terms.

Why don't you limit it to the number of physical engines then? Why don't 
you filter out duplicates? Why not limit the number of buffer objects per 
client, or globally, based on available RAM + swap relative to minimum 
object size? Reductio ad absurdum, yes, but it illustrates the, in this 
case, thin line between "locking down uapi" and adding too much policy 
where it is not appropriate.

> Also, for execbuf3, I'd like to get rid of contexts entirely and have
> engines be their own userspace-visible object.  If we go this
> direction, you can have UINT32_MAX of them.  Problem solved.

Not the problem I am pointing at though.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-28 17:24       ` Jason Ekstrand
@ 2021-04-29  8:04         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-29  8:04 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Mailing list - DRI developers


On 28/04/2021 18:24, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> On 23/04/2021 23:31, Jason Ekstrand wrote:
>>> Instead of handling it like a context param, unconditionally set it when
>>> intel_contexts are created.  This doesn't fix anything but does simplify
>>> the code a bit.
>>>
>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>> ---
>>>    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
>>>    .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
>>>    drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
>>>    3 files changed, 6 insertions(+), 44 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> index 35bcdeddfbf3f..1091cc04a242a 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
>>>            intel_engine_has_timeslices(ce->engine))
>>>                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
>>>
>>> -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
>>> +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
>>> +         ctx->i915->params.request_timeout_ms) {
>>> +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
>>> +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
>>
>> Blank line between declarations and code please, or just lose the local.
>>
>> Otherwise looks okay. Slight change that same GEM context can now have a
>> mix of different request expirations isn't interesting I think. At least
>> the change goes away by the end of the series.
> 
> In order for that to happen, I think you'd have to have a race between
> CREATE_CONTEXT and someone smashing the request_timeout_ms param via
> sysfs.  Or am I missing something?  Given that timeouts are really
> per-engine anyway, I don't think we need to care too much about that.

We don't care, no.

For completeness only - by the end of the series it is what you say. But 
at _this_ point in the series though it is if modparam changes at any 
point between context create and replacing engines. Which is a change 
compared to before this patch, since modparam was cached in the GEM 
context so far. So one GEM context was a single request_timeout_ms.

Regards,

Tvrtko

> --Jason
> 
>> Regards,
>>
>> Tvrtko
>>
>>> +     }
>>>    }
>>>
>>>    static void __free_engines(struct i915_gem_engines *e, unsigned int count)
>>> @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
>>>        context_apply_all(ctx, __apply_timeline, timeline);
>>>    }
>>>
>>> -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
>>> -{
>>> -     return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
>>> -}
>>> -
>>> -static int
>>> -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
>>> -{
>>> -     int ret;
>>> -
>>> -     ret = context_apply_all(ctx, __apply_watchdog,
>>> -                             (void *)(uintptr_t)timeout_us);
>>> -     if (!ret)
>>> -             ctx->watchdog.timeout_us = timeout_us;
>>> -
>>> -     return ret;
>>> -}
>>> -
>>> -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
>>> -{
>>> -     struct drm_i915_private *i915 = ctx->i915;
>>> -     int ret;
>>> -
>>> -     if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
>>> -         !i915->params.request_timeout_ms)
>>> -             return;
>>> -
>>> -     /* Default expiry for user fences. */
>>> -     ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
>>> -     if (ret)
>>> -             drm_notice(&i915->drm,
>>> -                        "Failed to configure default fence expiry! (%d)",
>>> -                        ret);
>>> -}
>>> -
>>>    static struct i915_gem_context *
>>>    i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>>>    {
>>> @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>>>                intel_timeline_put(timeline);
>>>        }
>>>
>>> -     __set_default_fence_expiry(ctx);
>>> -
>>>        trace_i915_context_create(ctx);
>>>
>>>        return ctx;
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
>>> index 5ae71ec936f7c..676592e27e7d2 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
>>> @@ -153,10 +153,6 @@ struct i915_gem_context {
>>>         */
>>>        atomic_t active_count;
>>>
>>> -     struct {
>>> -             u64 timeout_us;
>>> -     } watchdog;
>>> -
>>>        /**
>>>         * @hang_timestamp: The last time(s) this context caused a GPU hang
>>>         */
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
>>> index dffedd983693d..0c69cb42d075c 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
>>> @@ -10,11 +10,10 @@
>>>
>>>    #include "intel_context.h"
>>>
>>> -static inline int
>>> +static inline void
>>>    intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
>>>    {
>>>        ce->watchdog.timeout_us = timeout_us;
>>> -     return 0;
>>>    }
>>>
>>>    #endif /* INTEL_CONTEXT_PARAM_H */
>>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  2021-04-28 17:26       ` Jason Ekstrand
@ 2021-04-29  8:06         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-29  8:06 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers


On 28/04/2021 18:26, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 10:49 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 23/04/2021 23:31, Jason Ekstrand wrote:
>>> This API is entirely unnecessary and I'd love to get rid of it.  If
>>> userspace wants a single timeline across multiple contexts, they can
>>> either use implicit synchronization or a syncobj, both of which existed
>>> at the time this feature landed.  The justification given at the time
>>> was that it would help GL drivers which are inherently single-timeline.
>>> However, neither of our GL drivers actually wanted the feature.  i965
>>> was already in maintenance mode at the time and iris uses syncobj for
>>> everything.
>>>
>>> Unfortunately, as much as I'd love to get rid of it, it is used by the
>>> media driver so we can't do that.  We can, however, do the next-best
>>> thing which is to embed a syncobj in the context and do exactly what
>>> we'd expect from userspace internally.  This isn't an entirely identical
>>> implementation because it's no longer atomic if userspace races with
>>> itself by calling execbuffer2 twice simultaneously from different
>>> threads.  It won't crash in that case; it just doesn't guarantee any
>>> ordering between those two submits.
>>
>> 1)
>>
>> Please also mention the difference in context/timeline name when
>> observed via the sync file API.
>>
>> 2)
>>
>> I don't remember what we have concluded in terms of observable effects
>> in sync_file_merge?
> 
> I don't see how either of these are observable since this syncobj is
> never exposed to userspace in any way.  Please help me understand what
> I'm missing here.

Single timeline context - two execbufs - return two out fences.

Before the patch those two had the same fence context; with the patch 
they have different ones.

The fence context is visible to userspace via the sync file info API 
(the timeline name at least) and via the fence merging rules in 
sync_file_merge().

Regards,

Tvrtko

> 
> --Jason
> 
> 
>> Regards,
>>
>> Tvrtko
>>
>>> Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
>>> advantages beyond mere annoyance.  One is that intel_timeline is no
>>> longer an api-visible object and can remain entirely an implementation
>>> detail.  This may be advantageous as we make scheduler changes going
>>> forward.  Second is that, together with deleting the CLONE_CONTEXT API,
>>> we should now have a 1:1 mapping between intel_context and
>>> intel_timeline which may help us reduce locking.
>>>
>>> v2 (Jason Ekstrand):
>>>    - Update the comment on i915_gem_context::syncobj to mention that it's
>>>      an emulation and the possible race if userspace calls execbuffer2
>>>      twice on the same context concurrently.
>>>    - Wrap the checks for eb.gem_context->syncobj in unlikely()
>>>    - Drop the dma_fence reference
>>>    - Improved commit message
>>>
>>> v3 (Jason Ekstrand):
>>>    - Move the dma_fence_put() to before the error exit
>>>
>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +++++--------------
>>>    .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +++++-
>>>    .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 16 ++++++
>>>    3 files changed, 40 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> index 2c2fefa912805..a72c9b256723b 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> @@ -67,6 +67,8 @@
>>>    #include <linux/log2.h>
>>>    #include <linux/nospec.h>
>>>
>>> +#include <drm/drm_syncobj.h>
>>> +
>>>    #include "gt/gen6_ppgtt.h"
>>>    #include "gt/intel_context.h"
>>>    #include "gt/intel_context_param.h"
>>> @@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context *ce,
>>>                ce->vm = vm;
>>>        }
>>>
>>> -     GEM_BUG_ON(ce->timeline);
>>> -     if (ctx->timeline)
>>> -             ce->timeline = intel_timeline_get(ctx->timeline);
>>> -
>>>        if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
>>>            intel_engine_has_timeslices(ce->engine))
>>>                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
>>> @@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
>>>        mutex_destroy(&ctx->engines_mutex);
>>>        mutex_destroy(&ctx->lut_mutex);
>>>
>>> -     if (ctx->timeline)
>>> -             intel_timeline_put(ctx->timeline);
>>> -
>>>        put_pid(ctx->pid);
>>>        mutex_destroy(&ctx->mutex);
>>>
>>> @@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
>>>        if (vm)
>>>                i915_vm_close(vm);
>>>
>>> +     if (ctx->syncobj)
>>> +             drm_syncobj_put(ctx->syncobj);
>>> +
>>>        ctx->file_priv = ERR_PTR(-EBADF);
>>>
>>>        /*
>>> @@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
>>>                i915_vm_close(vm);
>>>    }
>>>
>>> -static void __set_timeline(struct intel_timeline **dst,
>>> -                        struct intel_timeline *src)
>>> -{
>>> -     struct intel_timeline *old = *dst;
>>> -
>>> -     *dst = src ? intel_timeline_get(src) : NULL;
>>> -
>>> -     if (old)
>>> -             intel_timeline_put(old);
>>> -}
>>> -
>>> -static void __apply_timeline(struct intel_context *ce, void *timeline)
>>> -{
>>> -     __set_timeline(&ce->timeline, timeline);
>>> -}
>>> -
>>> -static void __assign_timeline(struct i915_gem_context *ctx,
>>> -                           struct intel_timeline *timeline)
>>> -{
>>> -     __set_timeline(&ctx->timeline, timeline);
>>> -     context_apply_all(ctx, __apply_timeline, timeline);
>>> -}
>>> -
>>>    static struct i915_gem_context *
>>>    i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>>>    {
>>>        struct i915_gem_context *ctx;
>>> +     int ret;
>>>
>>>        if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
>>>            !HAS_EXECLISTS(i915))
>>> @@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>>>        }
>>>
>>>        if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
>>> -             struct intel_timeline *timeline;
>>> -
>>> -             timeline = intel_timeline_create(&i915->gt);
>>> -             if (IS_ERR(timeline)) {
>>> +             ret = drm_syncobj_create(&ctx->syncobj,
>>> +                                      DRM_SYNCOBJ_CREATE_SIGNALED,
>>> +                                      NULL);
>>> +             if (ret) {
>>>                        context_close(ctx);
>>> -                     return ERR_CAST(timeline);
>>> +                     return ERR_PTR(ret);
>>>                }
>>> -
>>> -             __assign_timeline(ctx, timeline);
>>> -             intel_timeline_put(timeline);
>>>        }
>>>
>>>        trace_i915_context_create(ctx);
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
>>> index 676592e27e7d2..df76767f0c41b 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
>>> @@ -83,7 +83,19 @@ struct i915_gem_context {
>>>        struct i915_gem_engines __rcu *engines;
>>>        struct mutex engines_mutex; /* guards writes to engines */
>>>
>>> -     struct intel_timeline *timeline;
>>> +     /**
>>> +      * @syncobj: Shared timeline syncobj
>>> +      *
>>> +      * When the SHARED_TIMELINE flag is set on context creation, we
>>> +      * emulate a single timeline across all engines using this syncobj.
>>> +      * For every execbuffer2 call, this syncobj is used as both an in-
>>> +      * and out-fence.  Unlike the real intel_timeline, this doesn't
>>> +      * provide perfect atomic in-order guarantees if the client races
>>> +      * with itself by calling execbuffer2 twice concurrently.  However,
>>> +      * if userspace races with itself, that's not likely to yield well-
>>> +      * defined results anyway so we choose to not care.
>>> +      */
>>> +     struct drm_syncobj *syncobj;
>>>
>>>        /**
>>>         * @vm: unique address space (GTT)
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>> index b812f313422a9..d640bba6ad9ab 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>>> @@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>>>                goto err_vma;
>>>        }
>>>
>>> +     if (unlikely(eb.gem_context->syncobj)) {
>>> +             struct dma_fence *fence;
>>> +
>>> +             fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
>>> +             err = i915_request_await_dma_fence(eb.request, fence);
>>> +             dma_fence_put(fence);
>>> +             if (err)
>>> +                     goto err_ext;
>>> +     }
>>> +
>>>        if (in_fence) {
>>>                if (args->flags & I915_EXEC_FENCE_SUBMIT)
>>>                        err = i915_request_await_execution(eb.request,
>>> @@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>>>                        fput(out_fence->file);
>>>                }
>>>        }
>>> +
>>> +     if (unlikely(eb.gem_context->syncobj)) {
>>> +             drm_syncobj_replace_fence(eb.gem_context->syncobj,
>>> +                                       &eb.request->fence);
>>> +     }
>>> +
>>>        i915_request_put(eb.request);
>>>
>>>    err_vma:
>>>

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  2021-04-29  8:06         ` Tvrtko Ursulin
@ 2021-04-29 12:08           ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:08 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Mailing list - DRI developers, Jason Ekstrand

On Thu, Apr 29, 2021 at 09:06:47AM +0100, Tvrtko Ursulin wrote:
> 
> On 28/04/2021 18:26, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 10:49 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> > > 
> > > 
> > > On 23/04/2021 23:31, Jason Ekstrand wrote:
> > > > This API is entirely unnecessary and I'd love to get rid of it.  If
> > > > userspace wants a single timeline across multiple contexts, they can
> > > > either use implicit synchronization or a syncobj, both of which existed
> > > > at the time this feature landed.  The justification given at the time
> > > > was that it would help GL drivers which are inherently single-timeline.
> > > > However, neither of our GL drivers actually wanted the feature.  i965
> > > > was already in maintenance mode at the time and iris uses syncobj for
> > > > everything.
> > > > 
> > > > Unfortunately, as much as I'd love to get rid of it, it is used by the
> > > > media driver so we can't do that.  We can, however, do the next-best
> > > > thing which is to embed a syncobj in the context and do exactly what
> > > > we'd expect from userspace internally.  This isn't an entirely identical
> > > > implementation because it's no longer atomic if userspace races with
> > > > itself by calling execbuffer2 twice simultaneously from different
> > > > threads.  It won't crash in that case; it just doesn't guarantee any
> > > > ordering between those two submits.
> > > 
> > > 1)
> > > 
> > > Please also mention the difference in context/timeline name when
> > > observed via the sync file API.
> > > 
> > > 2)
> > > 
> > > I don't remember what we have concluded in terms of observable effects
> > > in sync_file_merge?
> > 
> > I don't see how either of these are observable since this syncobj is
> > never exposed to userspace in any way.  Please help me understand what
> > I'm missing here.
> 
> Single timeline context - two execbufs - return two out fences.
> 
> Before the patch those two had the same fence context, with the patch they
> have different ones.
> 
> Fence context is visible to userspace via sync file info (timeline name at
> least) and rules in sync_file_merge.

Good point worth mentioning in the commit message.

media-driver doesn't use any of this in combination with single_timeline,
so we just don't care.
-Daniel

> 
> Regards,
> 
> Tvrtko
> 
> > 
> > --Jason
> > 
> > 
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
> > > > advantages beyond mere annoyance.  One is that intel_timeline is no
> > > > longer an api-visible object and can remain entirely an implementation
> > > > detail.  This may be advantageous as we make scheduler changes going
> > > > forward.  Second is that, together with deleting the CLONE_CONTEXT API,
> > > > we should now have a 1:1 mapping between intel_context and
> > > > intel_timeline which may help us reduce locking.
> > > > 
> > > > v2 (Jason Ekstrand):
> > > >    - Update the comment on i915_gem_context::syncobj to mention that it's
> > > >      an emulation and the possible race if userspace calls execbuffer2
> > > >      twice on the same context concurrently.
> > > >    - Wrap the checks for eb.gem_context->syncobj in unlikely()
> > > >    - Drop the dma_fence reference
> > > >    - Improved commit message
> > > > 
> > > > v3 (Jason Ekstrand):
> > > >    - Move the dma_fence_put() to before the error exit
> > > > 
> > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > > ---
> > > >    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +++++--------------
> > > >    .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +++++-
> > > >    .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 16 ++++++
> > > >    3 files changed, 40 insertions(+), 39 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > index 2c2fefa912805..a72c9b256723b 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > @@ -67,6 +67,8 @@
> > > >    #include <linux/log2.h>
> > > >    #include <linux/nospec.h>
> > > > 
> > > > +#include <drm/drm_syncobj.h>
> > > > +
> > > >    #include "gt/gen6_ppgtt.h"
> > > >    #include "gt/intel_context.h"
> > > >    #include "gt/intel_context_param.h"
> > > > @@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context *ce,
> > > >                ce->vm = vm;
> > > >        }
> > > > 
> > > > -     GEM_BUG_ON(ce->timeline);
> > > > -     if (ctx->timeline)
> > > > -             ce->timeline = intel_timeline_get(ctx->timeline);
> > > > -
> > > >        if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> > > >            intel_engine_has_timeslices(ce->engine))
> > > >                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> > > > @@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
> > > >        mutex_destroy(&ctx->engines_mutex);
> > > >        mutex_destroy(&ctx->lut_mutex);
> > > > 
> > > > -     if (ctx->timeline)
> > > > -             intel_timeline_put(ctx->timeline);
> > > > -
> > > >        put_pid(ctx->pid);
> > > >        mutex_destroy(&ctx->mutex);
> > > > 
> > > > @@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
> > > >        if (vm)
> > > >                i915_vm_close(vm);
> > > > 
> > > > +     if (ctx->syncobj)
> > > > +             drm_syncobj_put(ctx->syncobj);
> > > > +
> > > >        ctx->file_priv = ERR_PTR(-EBADF);
> > > > 
> > > >        /*
> > > > @@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
> > > >                i915_vm_close(vm);
> > > >    }
> > > > 
> > > > -static void __set_timeline(struct intel_timeline **dst,
> > > > -                        struct intel_timeline *src)
> > > > -{
> > > > -     struct intel_timeline *old = *dst;
> > > > -
> > > > -     *dst = src ? intel_timeline_get(src) : NULL;
> > > > -
> > > > -     if (old)
> > > > -             intel_timeline_put(old);
> > > > -}
> > > > -
> > > > -static void __apply_timeline(struct intel_context *ce, void *timeline)
> > > > -{
> > > > -     __set_timeline(&ce->timeline, timeline);
> > > > -}
> > > > -
> > > > -static void __assign_timeline(struct i915_gem_context *ctx,
> > > > -                           struct intel_timeline *timeline)
> > > > -{
> > > > -     __set_timeline(&ctx->timeline, timeline);
> > > > -     context_apply_all(ctx, __apply_timeline, timeline);
> > > > -}
> > > > -
> > > >    static struct i915_gem_context *
> > > >    i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > > >    {
> > > >        struct i915_gem_context *ctx;
> > > > +     int ret;
> > > > 
> > > >        if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> > > >            !HAS_EXECLISTS(i915))
> > > > @@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > > >        }
> > > > 
> > > >        if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> > > > -             struct intel_timeline *timeline;
> > > > -
> > > > -             timeline = intel_timeline_create(&i915->gt);
> > > > -             if (IS_ERR(timeline)) {
> > > > +             ret = drm_syncobj_create(&ctx->syncobj,
> > > > +                                      DRM_SYNCOBJ_CREATE_SIGNALED,
> > > > +                                      NULL);
> > > > +             if (ret) {
> > > >                        context_close(ctx);
> > > > -                     return ERR_CAST(timeline);
> > > > +                     return ERR_PTR(ret);
> > > >                }
> > > > -
> > > > -             __assign_timeline(ctx, timeline);
> > > > -             intel_timeline_put(timeline);
> > > >        }
> > > > 
> > > >        trace_i915_context_create(ctx);
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > > > index 676592e27e7d2..df76767f0c41b 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > > > @@ -83,7 +83,19 @@ struct i915_gem_context {
> > > >        struct i915_gem_engines __rcu *engines;
> > > >        struct mutex engines_mutex; /* guards writes to engines */
> > > > 
> > > > -     struct intel_timeline *timeline;
> > > > +     /**
> > > > +      * @syncobj: Shared timeline syncobj
> > > > +      *
> > > > +      * When the SHARED_TIMELINE flag is set on context creation, we
> > > > +      * emulate a single timeline across all engines using this syncobj.
> > > > +      * For every execbuffer2 call, this syncobj is used as both an in-
> > > > +      * and out-fence.  Unlike the real intel_timeline, this doesn't
> > > > +      * provide perfect atomic in-order guarantees if the client races
> > > > +      * with itself by calling execbuffer2 twice concurrently.  However,
> > > > +      * if userspace races with itself, that's not likely to yield well-
> > > > +      * defined results anyway so we choose to not care.
> > > > +      */
> > > > +     struct drm_syncobj *syncobj;
> > > > 
> > > >        /**
> > > >         * @vm: unique address space (GTT)
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > index b812f313422a9..d640bba6ad9ab 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > @@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > >                goto err_vma;
> > > >        }
> > > > 
> > > > +     if (unlikely(eb.gem_context->syncobj)) {
> > > > +             struct dma_fence *fence;
> > > > +
> > > > +             fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
> > > > +             err = i915_request_await_dma_fence(eb.request, fence);
> > > > +             dma_fence_put(fence);
> > > > +             if (err)
> > > > +                     goto err_ext;
> > > > +     }
> > > > +
> > > >        if (in_fence) {
> > > >                if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > >                        err = i915_request_await_execution(eb.request,
> > > > @@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > >                        fput(out_fence->file);
> > > >                }
> > > >        }
> > > > +
> > > > +     if (unlikely(eb.gem_context->syncobj)) {
> > > > +             drm_syncobj_replace_fence(eb.gem_context->syncobj,
> > > > +                                       &eb.request->fence);
> > > > +     }
> > > > +
> > > >        i915_request_put(eb.request);
> > > > 
> > > >    err_vma:
> > > > 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread
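The syncobj emulation in the patch above boils down to a get/await/replace pattern around each execbuffer call. A toy Python sketch of those semantics (illustrative names only, not i915 code) shows why serial submits stay ordered while the caveat about concurrent submits holds:

```python
class SyncobjEmu:
    """Toy model of the SINGLE_TIMELINE syncobj emulation: the context
    holds one 'current fence' slot that each execbuf awaits and then
    replaces with its own request's fence."""

    def __init__(self):
        self.fence = None              # syncobj payload: last request's fence

    def execbuffer(self, name):
        prev = self.fence              # drm_syncobj_fence_get()
        deps = [prev] if prev else []  # i915_request_await_dma_fence()
        req = (name, deps)
        self.fence = req               # drm_syncobj_replace_fence()
        return req

ctx = SyncobjEmu()
a = ctx.execbuffer("batch-A")
b = ctx.execbuffer("batch-B")
# Serial submits chain just like a real single timeline: B awaits A.
assert b[1] == [a]
# But get-then-replace is not atomic: two threads racing on the same
# context can both read the same payload, so neither request would order
# against the other -- exactly the race the commit message documents.
```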

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 18:17                 ` Jason Ekstrand
@ 2021-04-29 12:14                   ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:14 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Matthew Brost, Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 01:17:27PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
> >
> > On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > Jumping on here mid-thread. For what it is worth, to make execlists work
> > > > with the upcoming parallel submission extension I leveraged some of the
> > > > existing bonding code so I wouldn't be too eager to delete this code
> > > > until that lands.
> > >
> > > Mind being a bit more specific about that?  The motivation for this
> > > patch is that the current bonding handling and uAPI is, well, very odd
> > > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > > Instead you create engines and then bond them together after the fact.
> > > I didn't want to blindly duplicate those oddities with the proto-ctx
> > > stuff unless they were useful.  With parallel submit, I would expect
> > > we want a more explicit API where you specify a set of engine
> > > class/instance pairs to bond together into a single engine similar to
> > > how the current balancing API works.
> > >
> > > Of course, that's all focused on the API and not the internals.  But,
> > > again, I'm not sure how we want things to look internally.  What we've
> > > got now doesn't seem great for the GuC submission model but I'm very
> > > much not the expert there.  I don't want to be working at cross
> > > purposes to you and I'm happy to leave bits if you think they're
> > > useful.  But I thought I was clearing things away so that you can put
> > > in what you actually want for GuC/parallel submit.
> > >
> >
> > Removing all the UAPI things is fine, but I wouldn't delete some of the
> > internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> > intel_context_ops, the hook for a submit fence, etc...) as that will
> > still likely be used for the new parallel submission interface with
> > execlists. As you say the new UAPI wont allow crazy configurations,
> > only simple ones.
> 
> I'm fine with leaving some of the internal bits for a little while if
> it makes pulling the GuC scheduler in easier.  I'm just a bit
> skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
> thoughts?

Yeah I'm also wondering why we need this. Essentially your insight (which
Tony Ye from the media team confirmed) is that the media umd never uses
bonding on virtual engines.

So the only thing we need is the await_fence submit_fence logic to stall
the subsequent patches just long enough. I think that stays.

All the additional logic with the cmpxchg lockless trickery and all that
isn't needed, because we _never_ have to select an engine for bonded
submission: It's always the single one available.

This would mean that for execlist parallel submit we can apply a
limitation (beyond what GuC supports perhaps) and it's all ok. With that
everything except the submit fence await logic itself can go I think.

Also one for Matt: We decided to ZBB implementing parallel submit on
execlist, it's going to be just for GuC. At least until someone starts
screaming really loudly.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread
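The await-fence vs. submit-fence distinction Daniel wants to keep can be modeled loosely: an ordinary in-fence delays a request until its dependency *completes*, while a submit fence only delays it until the dependency is *submitted*, so the two can run in parallel. A toy sketch (names are illustrative, not i915 internals):

```python
class Request:
    """Minimal stand-in for an i915 request with two lifecycle flags."""
    def __init__(self, name):
        self.name = name
        self.submitted = False   # placed on the hardware
        self.completed = False   # finished executing

def can_submit(req, dep=None, submit_fence=False):
    """A request gated by a submit fence may go as soon as its dependency
    is submitted; an ordinary fence await needs completion."""
    if dep is None:
        return True
    return dep.submitted if submit_fence else dep.completed

a, b = Request("A"), Request("B")
a.submitted = True                               # A is running on the hardware
assert can_submit(b, a, submit_fence=True)       # parallel submit may proceed
assert not can_submit(b, a, submit_fence=False)  # ordinary await must wait
```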

* Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 18:58           ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 12:16             ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:16 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Matthew Brost, Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 01:58:17PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 12:18 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> >
> > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > much code.  This function in particular, has to stay, unfortunately.
> > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > the work onto a different engine than the one it's supposed to
> > > > run in parallel with.  This means we can't dead-code this function or
> > > > the bond_execution function pointer and related stuff.
> > >
> > > Uh that's disappointing, since if I understand your point correctly, the
> > > sibling engines should all be singletons, not load balancing virtual ones.
> > > So there really should not be any need to pick the right one at execution
> > > time.
> >
> > The media driver itself seems to work fine if I delete all the code.
> > It's just an IGT testcase that blows up.  I'll do more digging to see
> > if I can better isolate why.
> 
> I did more digging and I figured out why this test hangs.  The test
> looks at an engine class where there's more than one of that class
> (currently only vcs) and creates a context where engine[0] is all of
> the engines of that class bonded together and engine[1-N] is each of
> those engines individually.  It then tests that you can submit a batch
> to one of the individual engines and then submit with
> EXEC_FENCE_SUBMIT to the balanced engine and the kernel will sort it
> out.  This doesn't seem like a use-case we care about.
> 
> If we cared about anything, I would expect it to be submitting to two
> balanced contexts and expecting "pick any two" behavior.  But that's
> not what the test is testing for.

Yeah ditch it.

Instead make sure that the bonded setparam/ctx validation makes sure that
1) no virtual engines are used
2) no engine used twice
3) anything else stupid you can come up with that we should make sure is
blocked.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread
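Daniel's checklist above amounts to a small validation pass over a requested bond. A hedged sketch of rules 1 and 2 (hypothetical helper, not the actual i915 setparam code):

```python
def validate_bond(master, siblings, is_virtual):
    """Reject bonding configurations per the rules above:
    no virtual engines, and no engine listed twice.
    Returns 0 on success or -22 (-EINVAL) on rejection."""
    engines = [master, *siblings]
    if any(is_virtual(e) for e in engines):
        return -22   # rule 1: virtual engines may not be bonded
    if len(set(engines)) != len(engines):
        return -22   # rule 2: an engine may appear only once
    return 0

# (class, instance) tuples standing in for engine identities
ok = validate_bond(("rcs", 0), [("vcs", 0), ("vcs", 1)],
                   is_virtual=lambda e: False)
assert ok == 0
dup = validate_bond(("vcs", 0), [("vcs", 0)],
                    is_virtual=lambda e: False)
assert dup == -22
```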

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-29 12:16             ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:16 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 01:58:17PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 12:18 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> >
> > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > much code.  This function in particular, has to stay, unfortunately.
> > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > the work onto a different engine than than the one it's supposed to
> > > > run in parallel with.  This means we can't dead-code this function or
> > > > the bond_execution function pointer and related stuff.
> > >
> > > Uh that's disappointing, since if I understand your point correctly, the
> > > sibling engines should all be singletons, not load balancing virtual ones.
> > > So there really should not be any need to pick the right one at execution
> > > time.
> >
> > The media driver itself seems to work fine if I delete all the code.
> > It's just an IGT testcase that blows up.  I'll do more digging to see
> > if I can better isolate why.
> 
> I did more digging and I figured out why this test hangs.  The test
> looks at an engine class where there's more than one of that class
> (currently only vcs) and creates a context where engine[0] is all of
> the engines of that class bonded together and engine[1-N] is each of
> those engines individually.  It then tests that you can submit a batch
> to one of the individual engines and then submit with
> EXEC_FENCE_SUBMIT to the balanced engine and the kernel will sort it
> out.  This doesn't seem like a use-case we care about.
> 
> If we cared about anything, I would expect it to be submitting to two
> balanced contexts and expecting "pick any two" behavior.  But that's
> not what the test is testing for.

Yeah, ditch it.

Instead, make sure that the bonded setparam/ctx validation ensures that:
1) no virtual engines are used,
2) no engine is used twice, and
3) anything else stupid you can come up with is blocked.
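A minimal sketch of that validation, using a toy engine model (the struct and function names here are illustrative, not the actual i915 code, which operates on `intel_engine_cs` and `intel_engine_is_virtual()`):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an engine: a one-hot mask plus a virtual-engine flag.
 * Real i915 engines carry far more state; this only captures what the
 * bond validation needs. */
struct engine {
	unsigned int mask;   /* one bit per physical engine */
	bool is_virtual;
};

/* Sketch of the two checks above: reject any virtual engine in the
 * bonded set, and reject any physical engine listed twice (detected by
 * accumulating the one-hot masks). */
static int validate_bond(const struct engine *engines, size_t count)
{
	unsigned int seen = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (engines[i].is_virtual)
			return -EINVAL;        /* 1) no virtual engines */
		if (seen & engines[i].mask)
			return -EINVAL;        /* 2) no engine used twice */
		seen |= engines[i].mask;
	}
	return 0;
}
```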

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 11/21] drm/i915: Stop manually RCU banging in reset_stats_ioctl
  2021-04-28 18:22       ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 12:22         ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:22 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Wed, Apr 28, 2021 at 01:22:14PM -0500, Jason Ekstrand wrote:
> On Wed, Apr 28, 2021 at 5:27 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Fri, Apr 23, 2021 at 05:31:21PM -0500, Jason Ekstrand wrote:
> > > As far as I can tell, the only real reason for this is to avoid taking a
> > > reference to the i915_gem_context.  The cost of those two atomics
> > > probably pales in comparison to the cost of the ioctl itself so we're
> > > really not buying ourselves anything here.  We're about to make context
> > > lookup a tiny bit more complicated, so let's get rid of the one hand-
> > > rolled case.
> >
> > I think the historical reason here is that i965_brw checks this before
> > every execbuf call, at least for arb_robustness contexts with the right
> > flag. But we've fixed that hotpath problem by adding non-recoverable
> > contexts. The kernel will tell you now automatically, for proper userspace
> > at least (I checked iris and anv, assuming I got it correct), and
> > reset_stats ioctl isn't a hot path worth micro-optimizing anymore.
> 
> I'm not sure I agree with that bit.  I don't think it was ever worth
> micro-optimizing like this.  What does it gain us?  Two fewer atomics?
>  It's not like the bad old days when it took a lock.
> 
> ANV still calls reset_stats before every set of execbuf (sometimes
> more than one) but I've never once seen it show up on a perf trace.
> execbuf, on the other hand, that does show up and pretty heavy
> sometimes.

Huh, I thought I checked, but I guess it got lost.

> > With that bit of more context added to the commit message:
> 
> I'd like to agree on what to add before adding something

Yeah in this case maybe just mention that with non-recoverable ctx there's
no need for userspace to check before every execbuf, so if this ever shows
up there's a proper fix which avoids the ioctl entirely. Like iris does.

Or something like that. I just want to make it clear that if this ever
does show up (once we've made execbuf faster with vm_bind and all that)
then the correct fix isn't to make this ioctl faster. But to just not
call it :-)
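The argument above can be modeled in a few lines: once a context is non-recoverable and gets banned after a hang, every later submission fails, so userspace learns about the loss from execbuf's return value instead of polling a reset_stats-style query before each submit. This is a userspace toy model, not the i915 implementation; all names here are illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of a non-recoverable context. */
struct toy_context {
	bool banned;
};

/* Stand-in for execbuf: a banned (lost) context refuses submission,
 * which is the signal robust userspace acts on. */
static int toy_execbuf(struct toy_context *ctx)
{
	if (ctx->banned)
		return -EIO;   /* non-recoverable: submission refused */
	return 0;
}

/* Stand-in for the kernel marking the context lost after a GPU hang. */
static void toy_gpu_hang(struct toy_context *ctx)
{
	ctx->banned = true;
}
```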

Cheers, Daniel

> 
> --Jason
> 
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > >
> > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 ++++---------
> > >  drivers/gpu/drm/i915/i915_drv.h             |  8 +-------
> > >  2 files changed, 5 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index ecb3bf5369857..941fbf78267b4 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
> > >       struct drm_i915_private *i915 = to_i915(dev);
> > >       struct drm_i915_reset_stats *args = data;
> > >       struct i915_gem_context *ctx;
> > > -     int ret;
> > >
> > >       if (args->flags || args->pad)
> > >               return -EINVAL;
> > >
> > > -     ret = -ENOENT;
> > > -     rcu_read_lock();
> > > -     ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id);
> > > +     ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
> > >       if (!ctx)
> > > -             goto out;
> > > +             return -ENOENT;
> > >
> > >       /*
> > >        * We opt for unserialised reads here. This may result in tearing
> > > @@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
> > >       args->batch_active = atomic_read(&ctx->guilty_count);
> > >       args->batch_pending = atomic_read(&ctx->active_count);
> > >
> > > -     ret = 0;
> > > -out:
> > > -     rcu_read_unlock();
> > > -     return ret;
> > > +     i915_gem_context_put(ctx);
> > > +     return 0;
> > >  }
> > >
> > >  /* GEM context-engines iterator: for_each_gem_engine() */
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 0b44333eb7033..8571c5c1509a7 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1840,19 +1840,13 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> > >
> > >  struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
> > >
> > > -static inline struct i915_gem_context *
> > > -__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id)
> > > -{
> > > -     return xa_load(&file_priv->context_xa, id);
> > > -}
> > > -
> > >  static inline struct i915_gem_context *
> > >  i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > >  {
> > >       struct i915_gem_context *ctx;
> > >
> > >       rcu_read_lock();
> > > -     ctx = __i915_gem_context_lookup_rcu(file_priv, id);
> > > +     ctx = xa_load(&file_priv->context_xa, id);
> > >       if (ctx && !kref_get_unless_zero(&ctx->ref))
> > >               ctx = NULL;
> > >       rcu_read_unlock();
> > > --
> > > 2.31.1
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
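The patch quoted above folds the RCU lookup into `i915_gem_context_lookup()`, which relies on the kref_get_unless_zero() idiom: a lookup under RCU may race with the final put, so a reference is taken only if the count is still non-zero. A userspace sketch of that idiom with C11 atomics (not kernel code; names are illustrative):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Refcounted object, standing in for i915_gem_context's kref. */
struct obj {
	atomic_uint refcount;
};

/* Bump the refcount only if it is still non-zero; a zero count means
 * the object is already being torn down and must not be revived. */
static bool get_unless_zero(struct obj *o)
{
	unsigned int val = atomic_load(&o->refcount);

	while (val != 0) {
		/* on failure, val is reloaded and the loop retries */
		if (atomic_compare_exchange_weak(&o->refcount, &val, val + 1))
			return true;
	}
	return false;
}

/* Sketch of the lookup: in the kernel this runs under rcu_read_lock();
 * here only the refcount dance is modeled. */
static struct obj *lookup(struct obj *candidate)
{
	if (candidate && !get_unless_zero(candidate))
		candidate = NULL;
	return candidate;
}
```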

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-28 15:51     ` Tvrtko Ursulin
@ 2021-04-29 12:24       ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:24 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel, Jason Ekstrand

On Wed, Apr 28, 2021 at 04:51:19PM +0100, Tvrtko Ursulin wrote:
> 
> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > This adds a bunch of complexity which the media driver has never
> > actually used.  The media driver does technically bond a balanced engine
> > to another engine but the balanced engine only has one engine in the
> > sibling set.  This doesn't actually result in a virtual engine.
> 
> For historical reference, this is not because uapi was over-engineered but
> because certain SKUs never materialized.

Jason said that for SKUs with lots of media engines, the media driver sets
up a set of contexts in userspace with all the pairings (and I guess then
load-balances in userspace or something like that). Tony Ye also seems to
have confirmed that. So I'm not clear on which SKU this is.

Or maybe the real deal is only future platforms, and there we have the
GuC scheduler backend.

Not against adding a bit more context to the commit message, but we need
to make sure what we put there is actually correct. Maybe best to ask
Tony/Carl as part of getting an ack from them.
-Daniel
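For reference, the mask arithmetic the removed `virtual_bond_execute()` (quoted in the diff below) performed can be sketched as follows: once the master request's engine is known, the bonded request is restricted to the other siblings in the bond's mask, and the master is prevented from re-running on those engines. Plain masks stand in for the kernel's atomic `try_cmpxchg` loop on `execution_mask`; names are illustrative.

```c
#include <assert.h>

/* Toy request carrying only the engine-eligibility mask. */
struct toy_request {
	unsigned int execution_mask;  /* one bit per physical engine */
};

static void toy_bond_execute(struct toy_request *bonded,
			     struct toy_request *master,
			     unsigned int master_engine_bit,
			     unsigned int sibling_mask)
{
	/* engines the bonded request may still use: the bond's siblings
	 * minus the engine the master is running on */
	unsigned int allowed = ~master_engine_bit & sibling_mask;

	/* restrict the bonded request to the remaining siblings */
	bonded->execution_mask &= allowed;

	/* prevent the master from being re-run on the bonded engines */
	master->execution_mask &= ~allowed;
}
```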

> 
> Regards,
> 
> Tvrtko
> 
> > Unless some userspace badly wants it, there's no good reason to support
> > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > leave the validation code in place in case we ever decide we want to do
> > something interesting with the bonding information.
> > 
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> >   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> >   .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> >   .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> >   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> >   6 files changed, 7 insertions(+), 353 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index e8179918fa306..5f8d0faf783aa 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> >   	}
> >   	virtual = set->engines->engines[idx]->engine;
> > +	if (intel_engine_is_virtual(virtual)) {
> > +		drm_dbg(&i915->drm,
> > +			"Bonding with virtual engines not allowed\n");
> > +		return -EINVAL;
> > +	}
> > +
> >   	err = check_user_mbz(&ext->flags);
> >   	if (err)
> >   		return err;
> > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> >   				n, ci.engine_class, ci.engine_instance);
> >   			return -EINVAL;
> >   		}
> > -
> > -		/*
> > -		 * A non-virtual engine has no siblings to choose between; and
> > -		 * a submit fence will always be directed to the one engine.
> > -		 */
> > -		if (intel_engine_is_virtual(virtual)) {
> > -			err = intel_virtual_engine_attach_bond(virtual,
> > -							       master,
> > -							       bond);
> > -			if (err)
> > -				return err;
> > -		}
> >   	}
> >   	return 0;
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index d640bba6ad9ab..efb2fa3522a42 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >   		if (args->flags & I915_EXEC_FENCE_SUBMIT)
> >   			err = i915_request_await_execution(eb.request,
> >   							   in_fence,
> > -							   eb.engine->bond_execute);
> > +							   NULL);
> >   		else
> >   			err = i915_request_await_dma_fence(eb.request,
> >   							   in_fence);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index 883bafc449024..68cfe5080325c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> >   	 */
> >   	void		(*submit_request)(struct i915_request *rq);
> > -	/*
> > -	 * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > -	 * request down to the bonded pairs.
> > -	 */
> > -	void            (*bond_execute)(struct i915_request *rq,
> > -					struct dma_fence *signal);
> > -
> >   	/*
> >   	 * Call when the priority on a request has changed and it and its
> >   	 * dependencies may need rescheduling. Note the request itself may
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index de124870af44d..b6e2b59f133b7 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -181,18 +181,6 @@ struct virtual_engine {
> >   		int prio;
> >   	} nodes[I915_NUM_ENGINES];
> > -	/*
> > -	 * Keep track of bonded pairs -- restrictions upon on our selection
> > -	 * of physical engines any particular request may be submitted to.
> > -	 * If we receive a submit-fence from a master engine, we will only
> > -	 * use one of sibling_mask physical engines.
> > -	 */
> > -	struct ve_bond {
> > -		const struct intel_engine_cs *master;
> > -		intel_engine_mask_t sibling_mask;
> > -	} *bonds;
> > -	unsigned int num_bonds;
> > -
> >   	/* And finally, which physical engines this virtual engine maps onto. */
> >   	unsigned int num_siblings;
> >   	struct intel_engine_cs *siblings[];
> > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> >   	intel_breadcrumbs_free(ve->base.breadcrumbs);
> >   	intel_engine_free_request_pool(&ve->base);
> > -	kfree(ve->bonds);
> >   	kfree(ve);
> >   }
> > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> >   	spin_unlock_irqrestore(&ve->base.active.lock, flags);
> >   }
> > -static struct ve_bond *
> > -virtual_find_bond(struct virtual_engine *ve,
> > -		  const struct intel_engine_cs *master)
> > -{
> > -	int i;
> > -
> > -	for (i = 0; i < ve->num_bonds; i++) {
> > -		if (ve->bonds[i].master == master)
> > -			return &ve->bonds[i];
> > -	}
> > -
> > -	return NULL;
> > -}
> > -
> > -static void
> > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > -{
> > -	struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > -	intel_engine_mask_t allowed, exec;
> > -	struct ve_bond *bond;
> > -
> > -	allowed = ~to_request(signal)->engine->mask;
> > -
> > -	bond = virtual_find_bond(ve, to_request(signal)->engine);
> > -	if (bond)
> > -		allowed &= bond->sibling_mask;
> > -
> > -	/* Restrict the bonded request to run on only the available engines */
> > -	exec = READ_ONCE(rq->execution_mask);
> > -	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > -		;
> > -
> > -	/* Prevent the master from being re-run on the bonded engines */
> > -	to_request(signal)->execution_mask &= ~allowed;
> > -}
> > -
> >   struct intel_context *
> >   intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >   			       unsigned int count)
> > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >   	ve->base.schedule = i915_schedule;
> >   	ve->base.submit_request = virtual_submit_request;
> > -	ve->base.bond_execute = virtual_bond_execute;
> >   	INIT_LIST_HEAD(virtual_queue(ve));
> >   	ve->base.execlists.queue_priority_hint = INT_MIN;
> > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> >   	if (IS_ERR(dst))
> >   		return dst;
> > -	if (se->num_bonds) {
> > -		struct virtual_engine *de = to_virtual_engine(dst->engine);
> > -
> > -		de->bonds = kmemdup(se->bonds,
> > -				    sizeof(*se->bonds) * se->num_bonds,
> > -				    GFP_KERNEL);
> > -		if (!de->bonds) {
> > -			intel_context_put(dst);
> > -			return ERR_PTR(-ENOMEM);
> > -		}
> > -
> > -		de->num_bonds = se->num_bonds;
> > -	}
> > -
> >   	return dst;
> >   }
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -				     const struct intel_engine_cs *master,
> > -				     const struct intel_engine_cs *sibling)
> > -{
> > -	struct virtual_engine *ve = to_virtual_engine(engine);
> > -	struct ve_bond *bond;
> > -	int n;
> > -
> > -	/* Sanity check the sibling is part of the virtual engine */
> > -	for (n = 0; n < ve->num_siblings; n++)
> > -		if (sibling == ve->siblings[n])
> > -			break;
> > -	if (n == ve->num_siblings)
> > -		return -EINVAL;
> > -
> > -	bond = virtual_find_bond(ve, master);
> > -	if (bond) {
> > -		bond->sibling_mask |= sibling->mask;
> > -		return 0;
> > -	}
> > -
> > -	bond = krealloc(ve->bonds,
> > -			sizeof(*bond) * (ve->num_bonds + 1),
> > -			GFP_KERNEL);
> > -	if (!bond)
> > -		return -ENOMEM;
> > -
> > -	bond[ve->num_bonds].master = master;
> > -	bond[ve->num_bonds].sibling_mask = sibling->mask;
> > -
> > -	ve->bonds = bond;
> > -	ve->num_bonds++;
> > -
> > -	return 0;
> > -}
> > -
> >   void intel_execlists_show_requests(struct intel_engine_cs *engine,
> >   				   struct drm_printer *m,
> >   				   void (*show_request)(struct drm_printer *m,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > index fd61dae820e9e..80cec37a56ba9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >   struct intel_context *
> >   intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -				     const struct intel_engine_cs *master,
> > -				     const struct intel_engine_cs *sibling);
> > -
> >   bool
> >   intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > index 1081cd36a2bd3..f03446d587160 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> >   	return 0;
> >   }
> > -static int bond_virtual_engine(struct intel_gt *gt,
> > -			       unsigned int class,
> > -			       struct intel_engine_cs **siblings,
> > -			       unsigned int nsibling,
> > -			       unsigned int flags)
> > -#define BOND_SCHEDULE BIT(0)
> > -{
> > -	struct intel_engine_cs *master;
> > -	struct i915_request *rq[16];
> > -	enum intel_engine_id id;
> > -	struct igt_spinner spin;
> > -	unsigned long n;
> > -	int err;
> > -
> > -	/*
> > -	 * A set of bonded requests is intended to be run concurrently
> > -	 * across a number of engines. We use one request per-engine
> > -	 * and a magic fence to schedule each of the bonded requests
> > -	 * at the same time. A consequence of our current scheduler is that
> > -	 * we only move requests to the HW ready queue when the request
> > -	 * becomes ready, that is when all of its prerequisite fences have
> > -	 * been signaled. As one of those fences is the master submit fence,
> > -	 * there is a delay on all secondary fences as the HW may be
> > -	 * currently busy. Equally, as all the requests are independent,
> > -	 * they may have other fences that delay individual request
> > -	 * submission to HW. Ergo, we do not guarantee that all requests are
> > -	 * immediately submitted to HW at the same time, just that if the
> > -	 * rules are abided by, they are ready at the same time as the
> > -	 * first is submitted. Userspace can embed semaphores in its batch
> > -	 * to ensure parallel execution of its phases as it requires.
> > -	 * Though naturally it gets requested that perhaps the scheduler should
> > -	 * take care of parallel execution, even across preemption events on
> > -	 * different HW. (The proper answer is of course "lalalala".)
> > -	 *
> > -	 * With the submit-fence, we have identified three possible phases
> > -	 * of synchronisation depending on the master fence: queued (not
> > -	 * ready), executing, and signaled. The first two are quite simple
> > -	 * and checked below. However, the signaled master fence handling is
> > -	 * contentious. Currently we do not distinguish between a signaled
> > -	 * fence and an expired fence, as once signaled it does not convey
> > -	 * any information about the previous execution. It may even be freed
> > -	 * and hence checking later it may not exist at all. Ergo we currently
> > -	 * do not apply the bonding constraint for an already signaled fence,
> > -	 * as our expectation is that it should not constrain the secondaries
> > -	 * and is outside of the scope of the bonded request API (i.e. all
> > -	 * userspace requests are meant to be running in parallel). As
> > -	 * it imposes no constraint, and is effectively a no-op, we do not
> > -	 * check below as normal execution flows are checked extensively above.
> > -	 *
> > -	 * XXX Is the degenerate handling of signaled submit fences the
> > -	 * expected behaviour for userpace?
> > -	 */
> > -
> > -	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > -
> > -	if (igt_spinner_init(&spin, gt))
> > -		return -ENOMEM;
> > -
> > -	err = 0;
> > -	rq[0] = ERR_PTR(-ENOMEM);
> > -	for_each_engine(master, gt, id) {
> > -		struct i915_sw_fence fence = {};
> > -		struct intel_context *ce;
> > -
> > -		if (master->class == class)
> > -			continue;
> > -
> > -		ce = intel_context_create(master);
> > -		if (IS_ERR(ce)) {
> > -			err = PTR_ERR(ce);
> > -			goto out;
> > -		}
> > -
> > -		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > -
> > -		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > -		intel_context_put(ce);
> > -		if (IS_ERR(rq[0])) {
> > -			err = PTR_ERR(rq[0]);
> > -			goto out;
> > -		}
> > -		i915_request_get(rq[0]);
> > -
> > -		if (flags & BOND_SCHEDULE) {
> > -			onstack_fence_init(&fence);
> > -			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > -							       &fence,
> > -							       GFP_KERNEL);
> > -		}
> > -
> > -		i915_request_add(rq[0]);
> > -		if (err < 0)
> > -			goto out;
> > -
> > -		if (!(flags & BOND_SCHEDULE) &&
> > -		    !igt_wait_for_spinner(&spin, rq[0])) {
> > -			err = -EIO;
> > -			goto out;
> > -		}
> > -
> > -		for (n = 0; n < nsibling; n++) {
> > -			struct intel_context *ve;
> > -
> > -			ve = intel_execlists_create_virtual(siblings, nsibling);
> > -			if (IS_ERR(ve)) {
> > -				err = PTR_ERR(ve);
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -
> > -			err = intel_virtual_engine_attach_bond(ve->engine,
> > -							       master,
> > -							       siblings[n]);
> > -			if (err) {
> > -				intel_context_put(ve);
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -
> > -			err = intel_context_pin(ve);
> > -			intel_context_put(ve);
> > -			if (err) {
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -
> > -			rq[n + 1] = i915_request_create(ve);
> > -			intel_context_unpin(ve);
> > -			if (IS_ERR(rq[n + 1])) {
> > -				err = PTR_ERR(rq[n + 1]);
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -			i915_request_get(rq[n + 1]);
> > -
> > -			err = i915_request_await_execution(rq[n + 1],
> > -							   &rq[0]->fence,
> > -							   ve->engine->bond_execute);
> > -			i915_request_add(rq[n + 1]);
> > -			if (err < 0) {
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -		}
> > -		onstack_fence_fini(&fence);
> > -		intel_engine_flush_submission(master);
> > -		igt_spinner_end(&spin);
> > -
> > -		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > -			pr_err("Master request did not execute (on %s)!\n",
> > -			       rq[0]->engine->name);
> > -			err = -EIO;
> > -			goto out;
> > -		}
> > -
> > -		for (n = 0; n < nsibling; n++) {
> > -			if (i915_request_wait(rq[n + 1], 0,
> > -					      MAX_SCHEDULE_TIMEOUT) < 0) {
> > -				err = -EIO;
> > -				goto out;
> > -			}
> > -
> > -			if (rq[n + 1]->engine != siblings[n]) {
> > -				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > -				       siblings[n]->name,
> > -				       rq[n + 1]->engine->name,
> > -				       rq[0]->engine->name);
> > -				err = -EINVAL;
> > -				goto out;
> > -			}
> > -		}
> > -
> > -		for (n = 0; !IS_ERR(rq[n]); n++)
> > -			i915_request_put(rq[n]);
> > -		rq[0] = ERR_PTR(-ENOMEM);
> > -	}
> > -
> > -out:
> > -	for (n = 0; !IS_ERR(rq[n]); n++)
> > -		i915_request_put(rq[n]);
> > -	if (igt_flush_test(gt->i915))
> > -		err = -EIO;
> > -
> > -	igt_spinner_fini(&spin);
> > -	return err;
> > -}
> > -
> > -static int live_virtual_bond(void *arg)
> > -{
> > -	static const struct phase {
> > -		const char *name;
> > -		unsigned int flags;
> > -	} phases[] = {
> > -		{ "", 0 },
> > -		{ "schedule", BOND_SCHEDULE },
> > -		{ },
> > -	};
> > -	struct intel_gt *gt = arg;
> > -	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > -	unsigned int class;
> > -	int err;
> > -
> > -	if (intel_uc_uses_guc_submission(&gt->uc))
> > -		return 0;
> > -
> > -	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > -		const struct phase *p;
> > -		int nsibling;
> > -
> > -		nsibling = select_siblings(gt, class, siblings);
> > -		if (nsibling < 2)
> > -			continue;
> > -
> > -		for (p = phases; p->name; p++) {
> > -			err = bond_virtual_engine(gt,
> > -						  class, siblings, nsibling,
> > -						  p->flags);
> > -			if (err) {
> > -				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > -				       __func__, p->name, class, nsibling, err);
> > -				return err;
> > -			}
> > -		}
> > -	}
> > -
> > -	return 0;
> > -}
> > -
> >   static int reset_virtual_engine(struct intel_gt *gt,
> >   				struct intel_engine_cs **siblings,
> >   				unsigned int nsibling)
> > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> >   		SUBTEST(live_virtual_mask),
> >   		SUBTEST(live_virtual_preserved),
> >   		SUBTEST(live_virtual_slice),
> > -		SUBTEST(live_virtual_bond),
> >   		SUBTEST(live_virtual_reset),
> >   	};
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-04-29 12:24       ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:24 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel

On Wed, Apr 28, 2021 at 04:51:19PM +0100, Tvrtko Ursulin wrote:
> 
> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > This adds a bunch of complexity which the media driver has never
> > actually used.  The media driver does technically bond a balanced engine
> > to another engine but the balanced engine only has one engine in the
> > sibling set.  This doesn't actually result in a virtual engine.
> 
> For historical reference, this is not because uapi was over-engineered but
> because certain SKUs never materialized.

Jason said that for SKUs with lots of media engines the media driver sets up
a set of contexts in userspace with all the pairings (and I guess then load
balances in userspace or something like that). Tony Ye also seems to have
confirmed that. So I'm not clear on which SKU this is?

Or maybe the real deal is only future platforms, and there we have the GuC
scheduler backend.

Not against adding a bit more context to the commit message, but we need
to make sure what we put there is actually correct. Maybe best to ask
Tony/Carl as part of getting an ack from them.
-Daniel

> 
> Regards,
> 
> Tvrtko
> 
> > Unless some userspace badly wants it, there's no good reason to support
> > this case.  This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We
> > leave the validation code in place in case we ever decide we want to do
> > something interesting with the bonding information.
> > 
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
> >   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 -
> >   .../drm/i915/gt/intel_execlists_submission.c  | 100 --------
> >   .../drm/i915/gt/intel_execlists_submission.h  |   4 -
> >   drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 ------------------
> >   6 files changed, 7 insertions(+), 353 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index e8179918fa306..5f8d0faf783aa 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1553,6 +1553,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> >   	}
> >   	virtual = set->engines->engines[idx]->engine;
> > +	if (intel_engine_is_virtual(virtual)) {
> > +		drm_dbg(&i915->drm,
> > +			"Bonding with virtual engines not allowed\n");
> > +		return -EINVAL;
> > +	}
> > +
> >   	err = check_user_mbz(&ext->flags);
> >   	if (err)
> >   		return err;
> > @@ -1593,18 +1599,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data)
> >   				n, ci.engine_class, ci.engine_instance);
> >   			return -EINVAL;
> >   		}
> > -
> > -		/*
> > -		 * A non-virtual engine has no siblings to choose between; and
> > -		 * a submit fence will always be directed to the one engine.
> > -		 */
> > -		if (intel_engine_is_virtual(virtual)) {
> > -			err = intel_virtual_engine_attach_bond(virtual,
> > -							       master,
> > -							       bond);
> > -			if (err)
> > -				return err;
> > -		}
> >   	}
> >   	return 0;
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index d640bba6ad9ab..efb2fa3522a42 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >   		if (args->flags & I915_EXEC_FENCE_SUBMIT)
> >   			err = i915_request_await_execution(eb.request,
> >   							   in_fence,
> > -							   eb.engine->bond_execute);
> > +							   NULL);
> >   		else
> >   			err = i915_request_await_dma_fence(eb.request,
> >   							   in_fence);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index 883bafc449024..68cfe5080325c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -446,13 +446,6 @@ struct intel_engine_cs {
> >   	 */
> >   	void		(*submit_request)(struct i915_request *rq);
> > -	/*
> > -	 * Called on signaling of a SUBMIT_FENCE, passing along the signaling
> > -	 * request down to the bonded pairs.
> > -	 */
> > -	void            (*bond_execute)(struct i915_request *rq,
> > -					struct dma_fence *signal);
> > -
> >   	/*
> >   	 * Call when the priority on a request has changed and it and its
> >   	 * dependencies may need rescheduling. Note the request itself may
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index de124870af44d..b6e2b59f133b7 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -181,18 +181,6 @@ struct virtual_engine {
> >   		int prio;
> >   	} nodes[I915_NUM_ENGINES];
> > -	/*
> > -	 * Keep track of bonded pairs -- restrictions upon on our selection
> > -	 * of physical engines any particular request may be submitted to.
> > -	 * If we receive a submit-fence from a master engine, we will only
> > -	 * use one of sibling_mask physical engines.
> > -	 */
> > -	struct ve_bond {
> > -		const struct intel_engine_cs *master;
> > -		intel_engine_mask_t sibling_mask;
> > -	} *bonds;
> > -	unsigned int num_bonds;
> > -
> >   	/* And finally, which physical engines this virtual engine maps onto. */
> >   	unsigned int num_siblings;
> >   	struct intel_engine_cs *siblings[];
> > @@ -3307,7 +3295,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
> >   	intel_breadcrumbs_free(ve->base.breadcrumbs);
> >   	intel_engine_free_request_pool(&ve->base);
> > -	kfree(ve->bonds);
> >   	kfree(ve);
> >   }
> > @@ -3560,42 +3547,6 @@ static void virtual_submit_request(struct i915_request *rq)
> >   	spin_unlock_irqrestore(&ve->base.active.lock, flags);
> >   }
> > -static struct ve_bond *
> > -virtual_find_bond(struct virtual_engine *ve,
> > -		  const struct intel_engine_cs *master)
> > -{
> > -	int i;
> > -
> > -	for (i = 0; i < ve->num_bonds; i++) {
> > -		if (ve->bonds[i].master == master)
> > -			return &ve->bonds[i];
> > -	}
> > -
> > -	return NULL;
> > -}
> > -
> > -static void
> > -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
> > -{
> > -	struct virtual_engine *ve = to_virtual_engine(rq->engine);
> > -	intel_engine_mask_t allowed, exec;
> > -	struct ve_bond *bond;
> > -
> > -	allowed = ~to_request(signal)->engine->mask;
> > -
> > -	bond = virtual_find_bond(ve, to_request(signal)->engine);
> > -	if (bond)
> > -		allowed &= bond->sibling_mask;
> > -
> > -	/* Restrict the bonded request to run on only the available engines */
> > -	exec = READ_ONCE(rq->execution_mask);
> > -	while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
> > -		;
> > -
> > -	/* Prevent the master from being re-run on the bonded engines */
> > -	to_request(signal)->execution_mask &= ~allowed;
> > -}
> > -
> >   struct intel_context *
> >   intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >   			       unsigned int count)
> > @@ -3649,7 +3600,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >   	ve->base.schedule = i915_schedule;
> >   	ve->base.submit_request = virtual_submit_request;
> > -	ve->base.bond_execute = virtual_bond_execute;
> >   	INIT_LIST_HEAD(virtual_queue(ve));
> >   	ve->base.execlists.queue_priority_hint = INT_MIN;
> > @@ -3747,59 +3697,9 @@ intel_execlists_clone_virtual(struct intel_engine_cs *src)
> >   	if (IS_ERR(dst))
> >   		return dst;
> > -	if (se->num_bonds) {
> > -		struct virtual_engine *de = to_virtual_engine(dst->engine);
> > -
> > -		de->bonds = kmemdup(se->bonds,
> > -				    sizeof(*se->bonds) * se->num_bonds,
> > -				    GFP_KERNEL);
> > -		if (!de->bonds) {
> > -			intel_context_put(dst);
> > -			return ERR_PTR(-ENOMEM);
> > -		}
> > -
> > -		de->num_bonds = se->num_bonds;
> > -	}
> > -
> >   	return dst;
> >   }
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -				     const struct intel_engine_cs *master,
> > -				     const struct intel_engine_cs *sibling)
> > -{
> > -	struct virtual_engine *ve = to_virtual_engine(engine);
> > -	struct ve_bond *bond;
> > -	int n;
> > -
> > -	/* Sanity check the sibling is part of the virtual engine */
> > -	for (n = 0; n < ve->num_siblings; n++)
> > -		if (sibling == ve->siblings[n])
> > -			break;
> > -	if (n == ve->num_siblings)
> > -		return -EINVAL;
> > -
> > -	bond = virtual_find_bond(ve, master);
> > -	if (bond) {
> > -		bond->sibling_mask |= sibling->mask;
> > -		return 0;
> > -	}
> > -
> > -	bond = krealloc(ve->bonds,
> > -			sizeof(*bond) * (ve->num_bonds + 1),
> > -			GFP_KERNEL);
> > -	if (!bond)
> > -		return -ENOMEM;
> > -
> > -	bond[ve->num_bonds].master = master;
> > -	bond[ve->num_bonds].sibling_mask = sibling->mask;
> > -
> > -	ve->bonds = bond;
> > -	ve->num_bonds++;
> > -
> > -	return 0;
> > -}
> > -
> >   void intel_execlists_show_requests(struct intel_engine_cs *engine,
> >   				   struct drm_printer *m,
> >   				   void (*show_request)(struct drm_printer *m,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > index fd61dae820e9e..80cec37a56ba9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
> > @@ -39,10 +39,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
> >   struct intel_context *
> >   intel_execlists_clone_virtual(struct intel_engine_cs *src);
> > -int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
> > -				     const struct intel_engine_cs *master,
> > -				     const struct intel_engine_cs *sibling);
> > -
> >   bool
> >   intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > index 1081cd36a2bd3..f03446d587160 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> > @@ -4311,234 +4311,6 @@ static int live_virtual_preserved(void *arg)
> >   	return 0;
> >   }
> > -static int bond_virtual_engine(struct intel_gt *gt,
> > -			       unsigned int class,
> > -			       struct intel_engine_cs **siblings,
> > -			       unsigned int nsibling,
> > -			       unsigned int flags)
> > -#define BOND_SCHEDULE BIT(0)
> > -{
> > -	struct intel_engine_cs *master;
> > -	struct i915_request *rq[16];
> > -	enum intel_engine_id id;
> > -	struct igt_spinner spin;
> > -	unsigned long n;
> > -	int err;
> > -
> > -	/*
> > -	 * A set of bonded requests is intended to be run concurrently
> > -	 * across a number of engines. We use one request per-engine
> > -	 * and a magic fence to schedule each of the bonded requests
> > -	 * at the same time. A consequence of our current scheduler is that
> > -	 * we only move requests to the HW ready queue when the request
> > -	 * becomes ready, that is when all of its prerequisite fences have
> > -	 * been signaled. As one of those fences is the master submit fence,
> > -	 * there is a delay on all secondary fences as the HW may be
> > -	 * currently busy. Equally, as all the requests are independent,
> > -	 * they may have other fences that delay individual request
> > -	 * submission to HW. Ergo, we do not guarantee that all requests are
> > -	 * immediately submitted to HW at the same time, just that if the
> > -	 * rules are abided by, they are ready at the same time as the
> > -	 * first is submitted. Userspace can embed semaphores in its batch
> > -	 * to ensure parallel execution of its phases as it requires.
> > -	 * Though naturally it gets requested that perhaps the scheduler should
> > -	 * take care of parallel execution, even across preemption events on
> > -	 * different HW. (The proper answer is of course "lalalala".)
> > -	 *
> > -	 * With the submit-fence, we have identified three possible phases
> > -	 * of synchronisation depending on the master fence: queued (not
> > -	 * ready), executing, and signaled. The first two are quite simple
> > -	 * and checked below. However, the signaled master fence handling is
> > -	 * contentious. Currently we do not distinguish between a signaled
> > -	 * fence and an expired fence, as once signaled it does not convey
> > -	 * any information about the previous execution. It may even be freed
> > -	 * and hence checking later it may not exist at all. Ergo we currently
> > -	 * do not apply the bonding constraint for an already signaled fence,
> > -	 * as our expectation is that it should not constrain the secondaries
> > -	 * and is outside of the scope of the bonded request API (i.e. all
> > -	 * userspace requests are meant to be running in parallel). As
> > -	 * it imposes no constraint, and is effectively a no-op, we do not
> > -	 * check below as normal execution flows are checked extensively above.
> > -	 *
> > -	 * XXX Is the degenerate handling of signaled submit fences the
> > -	 * expected behaviour for userpace?
> > -	 */
> > -
> > -	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
> > -
> > -	if (igt_spinner_init(&spin, gt))
> > -		return -ENOMEM;
> > -
> > -	err = 0;
> > -	rq[0] = ERR_PTR(-ENOMEM);
> > -	for_each_engine(master, gt, id) {
> > -		struct i915_sw_fence fence = {};
> > -		struct intel_context *ce;
> > -
> > -		if (master->class == class)
> > -			continue;
> > -
> > -		ce = intel_context_create(master);
> > -		if (IS_ERR(ce)) {
> > -			err = PTR_ERR(ce);
> > -			goto out;
> > -		}
> > -
> > -		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
> > -
> > -		rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > -		intel_context_put(ce);
> > -		if (IS_ERR(rq[0])) {
> > -			err = PTR_ERR(rq[0]);
> > -			goto out;
> > -		}
> > -		i915_request_get(rq[0]);
> > -
> > -		if (flags & BOND_SCHEDULE) {
> > -			onstack_fence_init(&fence);
> > -			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
> > -							       &fence,
> > -							       GFP_KERNEL);
> > -		}
> > -
> > -		i915_request_add(rq[0]);
> > -		if (err < 0)
> > -			goto out;
> > -
> > -		if (!(flags & BOND_SCHEDULE) &&
> > -		    !igt_wait_for_spinner(&spin, rq[0])) {
> > -			err = -EIO;
> > -			goto out;
> > -		}
> > -
> > -		for (n = 0; n < nsibling; n++) {
> > -			struct intel_context *ve;
> > -
> > -			ve = intel_execlists_create_virtual(siblings, nsibling);
> > -			if (IS_ERR(ve)) {
> > -				err = PTR_ERR(ve);
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -
> > -			err = intel_virtual_engine_attach_bond(ve->engine,
> > -							       master,
> > -							       siblings[n]);
> > -			if (err) {
> > -				intel_context_put(ve);
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -
> > -			err = intel_context_pin(ve);
> > -			intel_context_put(ve);
> > -			if (err) {
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -
> > -			rq[n + 1] = i915_request_create(ve);
> > -			intel_context_unpin(ve);
> > -			if (IS_ERR(rq[n + 1])) {
> > -				err = PTR_ERR(rq[n + 1]);
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -			i915_request_get(rq[n + 1]);
> > -
> > -			err = i915_request_await_execution(rq[n + 1],
> > -							   &rq[0]->fence,
> > -							   ve->engine->bond_execute);
> > -			i915_request_add(rq[n + 1]);
> > -			if (err < 0) {
> > -				onstack_fence_fini(&fence);
> > -				goto out;
> > -			}
> > -		}
> > -		onstack_fence_fini(&fence);
> > -		intel_engine_flush_submission(master);
> > -		igt_spinner_end(&spin);
> > -
> > -		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> > -			pr_err("Master request did not execute (on %s)!\n",
> > -			       rq[0]->engine->name);
> > -			err = -EIO;
> > -			goto out;
> > -		}
> > -
> > -		for (n = 0; n < nsibling; n++) {
> > -			if (i915_request_wait(rq[n + 1], 0,
> > -					      MAX_SCHEDULE_TIMEOUT) < 0) {
> > -				err = -EIO;
> > -				goto out;
> > -			}
> > -
> > -			if (rq[n + 1]->engine != siblings[n]) {
> > -				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
> > -				       siblings[n]->name,
> > -				       rq[n + 1]->engine->name,
> > -				       rq[0]->engine->name);
> > -				err = -EINVAL;
> > -				goto out;
> > -			}
> > -		}
> > -
> > -		for (n = 0; !IS_ERR(rq[n]); n++)
> > -			i915_request_put(rq[n]);
> > -		rq[0] = ERR_PTR(-ENOMEM);
> > -	}
> > -
> > -out:
> > -	for (n = 0; !IS_ERR(rq[n]); n++)
> > -		i915_request_put(rq[n]);
> > -	if (igt_flush_test(gt->i915))
> > -		err = -EIO;
> > -
> > -	igt_spinner_fini(&spin);
> > -	return err;
> > -}
> > -
> > -static int live_virtual_bond(void *arg)
> > -{
> > -	static const struct phase {
> > -		const char *name;
> > -		unsigned int flags;
> > -	} phases[] = {
> > -		{ "", 0 },
> > -		{ "schedule", BOND_SCHEDULE },
> > -		{ },
> > -	};
> > -	struct intel_gt *gt = arg;
> > -	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > -	unsigned int class;
> > -	int err;
> > -
> > -	if (intel_uc_uses_guc_submission(&gt->uc))
> > -		return 0;
> > -
> > -	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > -		const struct phase *p;
> > -		int nsibling;
> > -
> > -		nsibling = select_siblings(gt, class, siblings);
> > -		if (nsibling < 2)
> > -			continue;
> > -
> > -		for (p = phases; p->name; p++) {
> > -			err = bond_virtual_engine(gt,
> > -						  class, siblings, nsibling,
> > -						  p->flags);
> > -			if (err) {
> > -				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
> > -				       __func__, p->name, class, nsibling, err);
> > -				return err;
> > -			}
> > -		}
> > -	}
> > -
> > -	return 0;
> > -}
> > -
> >   static int reset_virtual_engine(struct intel_gt *gt,
> >   				struct intel_engine_cs **siblings,
> >   				unsigned int nsibling)
> > @@ -4712,7 +4484,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
> >   		SUBTEST(live_virtual_mask),
> >   		SUBTEST(live_virtual_preserved),
> >   		SUBTEST(live_virtual_slice),
> > -		SUBTEST(live_virtual_bond),
> >   		SUBTEST(live_virtual_reset),
> >   	};
> > 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 15/21] drm/i915/gt: Drop i915_address_space::file
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 12:37     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 12:37 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:25PM -0500, Jason Ekstrand wrote:
> There's a big comment saying how useful it is but no one is using this
> for anything.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

I was trying to find any users before all your deletions, but alas, nothing.
I did spend a bit of time on this, and discovered that the debugfs use was
nuked in

db80a1294c23 ("drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects")

After going through quite a few iterations, e.g.

5b5efdf79abf ("drm/i915: Make debugfs/per_file_stats scale better")
f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs")

The above removed the need for vm->file because the stats debugfs code
filtered using stats->vm instead of stats->file.

The history goes back to the original introduction of this (again for
debugfs) in

2bfa996e031b ("drm/i915: Store owning file on the i915_address_space")

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c |  9 ---------
>  drivers/gpu/drm/i915/gt/intel_gtt.h         | 10 ----------
>  drivers/gpu/drm/i915/selftests/mock_gtt.c   |  1 -
>  3 files changed, 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 7929d5a8be449..db9153e0f85a7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -921,17 +921,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
>  				u32 *id)
>  {
>  	struct drm_i915_private *i915 = ctx->i915;
> -	struct i915_address_space *vm;
>  	int ret;
>  
>  	ctx->file_priv = fpriv;
>  
> -	mutex_lock(&ctx->mutex);
> -	vm = i915_gem_context_vm(ctx);
> -	if (vm)
> -		WRITE_ONCE(vm->file, fpriv); /* XXX */
> -	mutex_unlock(&ctx->mutex);
> -
>  	ctx->pid = get_task_pid(current, PIDTYPE_PID);
>  	snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
>  		 current->comm, pid_nr(ctx->pid)); 
> @@ -1030,8 +1023,6 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
>  	if (IS_ERR(ppgtt))
>  		return PTR_ERR(ppgtt);
>  
> -	ppgtt->vm.file = file_priv;
> -
>  	if (args->extensions) {
>  		err = i915_user_extensions(u64_to_user_ptr(args->extensions),
>  					   NULL, 0,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index e67e34e179131..4c46068e63c9d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -217,16 +217,6 @@ struct i915_address_space {

Please also delete the drm_i915_file_private predeclaration in this file.

With this added and the history adequately covered in the commit message:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


>  	struct intel_gt *gt;
>  	struct drm_i915_private *i915;
>  	struct device *dma;
> -	/*
> -	 * Every address space belongs to a struct file - except for the global
> -	 * GTT that is owned by the driver (and so @file is set to NULL). In
> -	 * principle, no information should leak from one context to another
> -	 * (or between files/processes etc) unless explicitly shared by the
> -	 * owner. Tracking the owner is important in order to free up per-file
> -	 * objects along with the file, to aide resource tracking, and to
> -	 * assign blame.
> -	 */
> -	struct drm_i915_file_private *file;
>  	u64 total;		/* size addr space maps (ex. 2GB for ggtt) */
>  	u64 reserved;		/* size addr space reserved */
>  
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> index 5c7ae40bba634..cc047ec594f93 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> @@ -73,7 +73,6 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
>  	ppgtt->vm.gt = &i915->gt;
>  	ppgtt->vm.i915 = i915;
>  	ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE);
> -	ppgtt->vm.file = ERR_PTR(-ENODEV);
>  	ppgtt->vm.dma = i915->drm.dev;
>  
>  	i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-29 12:24       ` Daniel Vetter
@ 2021-04-29 12:54         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-29 12:54 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, dri-devel, Jason Ekstrand


On 29/04/2021 13:24, Daniel Vetter wrote:
> On Wed, Apr 28, 2021 at 04:51:19PM +0100, Tvrtko Ursulin wrote:
>>
>> On 23/04/2021 23:31, Jason Ekstrand wrote:
>>> This adds a bunch of complexity which the media driver has never
>>> actually used.  The media driver does technically bond a balanced engine
>>> to another engine but the balanced engine only has one engine in the
>>> sibling set.  This doesn't actually result in a virtual engine.
>>
>> For historical reference, this is not because uapi was over-engineered but
>> because certain SKUs never materialized.
> 
> Jason said that for SKUs with lots of media engines the media driver sets up
> a set of contexts in userspace with all the pairings (and I guess then load
> balances in userspace or something like that). Tony Ye also seems to have
> confirmed that. So I'm not clear on which SKU this is?

Not sure if I should disclose it here. But anyway, the platform which is 
currently upstream and was supposed to be the first to use this uapi was 
initially supposed to have at least 4 vcs engines, or even 8 vcs + 4 vecs 
at some point. That was the requirement the uapi was designed for. For that 
kind of platform there were supposed to be two virtual engines created, 
with bonding, for instance parent = [vcs0, vcs2], child = [vcs1, vcs3]; 
bonds = [vcs0 - vcs1, vcs2 - vcs3]. The more engines the merrier.

Userspace load balancing, from memory, came into the picture only as a 
consequence of balancing between two types of media pipelines, either 
working around rcs contention or the lack of sfc, or both. Along the 
lines of: one stage of a media pipeline can be done either as GPGPU work 
or on the media engine, so userspace was deciding to spawn "a bit of 
these and a bit of those" to utilise all the GPU blocks. Not really 
about frame-split virtual engines and bonding, but a completely 
different kind of load balancing, between gpgpu and the fixed pipeline.

> Or maybe the real deal is only future platforms, and there we have GuC
> scheduler backend.

Yes, because SKUs never materialised.

> Not against adding a bit more context to the commit message, but we need
> to make sure what we put there is actually correct. Maybe best to ask
> Tony/Carl as part of getting an ack from them.

I think there is no need - the fact the uapi was designed for way more 
engines than we got to have is straightforward enough.

The only unasked-for flexibility in the uapi was that bonding can 
express any dependency, and not only the N consecutive engines the media 
fixed function needed at the time. I say "at the time" because the 
"consecutive" engines requirement also got more complicated / broken in 
a following gen (via fusing and logical instance remapping), proving the 
point that having the uapi disassociated from the hw limitations of the 
_day_ was a good call.

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread


* Re: [Intel-gfx] [PATCH 13/21] drm/i915/gem: Add an intermediate proto_context struct
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 13:02     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 13:02 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

The commit introducing a new data structure really should have a solid
intro about it in the commit message. Please cover

- that ctx really should be immutable, save for exceptions like priority

- that unfortunately we butchered the uapi with setparam and sharing
  setparams between create_ext and setparam

- and how exactly proto ctx fixes this (with stuff like locking design
  used)

Maybe also dupe the kerneldoc into here for completeness.

On Fri, Apr 23, 2021 at 05:31:23PM -0500, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 143 ++++++++++++++----
>  .../gpu/drm/i915/gem/i915_gem_context_types.h |  21 +++
>  .../gpu/drm/i915/gem/selftests/mock_context.c |  16 +-

I'm wondering whether in the end we should split out the proto_ctx into
its own file, with the struct private only to itself. But I guess that's
impossible during the transition, and maybe also afterwards?

>  3 files changed, 150 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index e5efd22c89ba2..3e883daab93bf 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -191,6 +191,95 @@ static int validate_priority(struct drm_i915_private *i915,
>  	return 0;
>  }
>  
> +static void proto_context_close(struct i915_gem_proto_context *pc)
> +{
> +	if (pc->vm)
> +		i915_vm_put(pc->vm);
> +	kfree(pc);
> +}
> +
> +static int proto_context_set_persistence(struct drm_i915_private *i915,
> +					 struct i915_gem_proto_context *pc,
> +					 bool persist)
> +{
> +	if (test_bit(UCONTEXT_PERSISTENCE, &pc->user_flags) == persist)
> +		return 0;

We have compilers to optimize this kind of stuff, pls remove :-)
Especially with the non-atomic bitops there's no point.

> +
> +	if (persist) {
> +		/*
> +		 * Only contexts that are short-lived [that will expire or be
> +		 * reset] are allowed to survive past termination. We require
> +		 * hangcheck to ensure that the persistent requests are healthy.
> +		 */
> +		if (!i915->params.enable_hangcheck)
> +			return -EINVAL;
> +
> +		set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);

It's a bit entertaining, but the bitops in the kernel are atomic. Which is
hella confusing here.

I think open coding is the standard for truly normal bitops.

> +	} else {
> +		/* To cancel a context we use "preempt-to-idle" */
> +		if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
> +			return -ENODEV;
> +
> +		/*
> +		 * If the cancel fails, we then need to reset, cleanly!
> +		 *
> +		 * If the per-engine reset fails, all hope is lost! We resort
> +		 * to a full GPU reset in that unlikely case, but realistically
> +		 * if the engine could not reset, the full reset does not fare
> +		 * much better. The damage has been done.
> +		 *
> +		 * However, if we cannot reset an engine by itself, we cannot
> +		 * cleanup a hanging persistent context without causing
> +		 * colateral damage, and we should not pretend we can by
> +		 * exposing the interface.
> +		 */
> +		if (!intel_has_reset_engine(&i915->gt))
> +			return -ENODEV;
> +
> +		clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);

Same here.

> +	}
> +
> +	return 0;
> +}
> +
> +static struct i915_gem_proto_context *
> +proto_context_create(struct drm_i915_private *i915, unsigned int flags)
> +{
> +	struct i915_gem_proto_context *pc;
> +
> +	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> +	    !HAS_EXECLISTS(i915))
> +		return ERR_PTR(-EINVAL);
> +
> +	pc = kzalloc(sizeof(*pc), GFP_KERNEL);
> +	if (!pc)
> +		return ERR_PTR(-ENOMEM);
> +
> +	if (HAS_FULL_PPGTT(i915)) {
> +		struct i915_ppgtt *ppgtt;
> +
> +		ppgtt = i915_ppgtt_create(&i915->gt);
> +		if (IS_ERR(ppgtt)) {
> +			drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
> +				PTR_ERR(ppgtt));
> +			proto_context_close(pc);
> +			return ERR_CAST(ppgtt);
> +		}
> +		pc->vm = &ppgtt->vm;
> +	}
> +
> +	pc->user_flags = 0;
> +	set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> +	set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);

Same about atomic bitops here.

> +	proto_context_set_persistence(i915, pc, true);
> +	pc->sched.priority = I915_PRIORITY_NORMAL;
> +
> +	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
> +		pc->single_timeline = true;

A bit of a bikeshed, but I'd put the error checking in here too and deal
with the unwind pain with the usual goto proto_close. That should also make
the ppgtt unwind path a bit clearer because it sticks out in the standard way.

> +
> +	return pc;
> +}
> +
>  static struct i915_address_space *
>  context_get_vm_rcu(struct i915_gem_context *ctx)
>  {
> @@ -660,7 +749,8 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
>  }
>  
>  static struct i915_gem_context *
> -__create_context(struct drm_i915_private *i915)
> +__create_context(struct drm_i915_private *i915,
> +		 const struct i915_gem_proto_context *pc)
>  {
>  	struct i915_gem_context *ctx;
>  	struct i915_gem_engines *e;
> @@ -673,7 +763,7 @@ __create_context(struct drm_i915_private *i915)
>  
>  	kref_init(&ctx->ref);
>  	ctx->i915 = i915;
> -	ctx->sched.priority = I915_PRIORITY_NORMAL;
> +	ctx->sched = pc->sched;
>  	mutex_init(&ctx->mutex);
>  	INIT_LIST_HEAD(&ctx->link);
>  
> @@ -696,9 +786,7 @@ __create_context(struct drm_i915_private *i915)
>  	 * is no remap info, it will be a NOP. */
>  	ctx->remap_slice = ALL_L3_SLICES(i915);
>  
> -	i915_gem_context_set_bannable(ctx);
> -	i915_gem_context_set_recoverable(ctx);
> -	__context_set_persistence(ctx, true /* cgroup hook? */);
> +	ctx->user_flags = pc->user_flags;
>  
>  	for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
>  		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
> @@ -786,38 +874,23 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
>  }
>  
>  static struct i915_gem_context *
> -i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> +i915_gem_create_context(struct drm_i915_private *i915,
> +			const struct i915_gem_proto_context *pc)
>  {
>  	struct i915_gem_context *ctx;
>  	int ret;
>  
> -	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> -	    !HAS_EXECLISTS(i915))
> -		return ERR_PTR(-EINVAL);
> -
> -	ctx = __create_context(i915);
> +	ctx = __create_context(i915, pc);
>  	if (IS_ERR(ctx))
>  		return ctx;
>  
> -	if (HAS_FULL_PPGTT(i915)) {
> -		struct i915_ppgtt *ppgtt;
> -
> -		ppgtt = i915_ppgtt_create(&i915->gt);
> -		if (IS_ERR(ppgtt)) {
> -			drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
> -				PTR_ERR(ppgtt));
> -			context_close(ctx);
> -			return ERR_CAST(ppgtt);
> -		}
> -
> +	if (pc->vm) {
>  		mutex_lock(&ctx->mutex);

I guess this dies later, but this mutex_lock here is superfluous since
right now no one else can get at our ctx struct. And nothing in
__assign_ppgtt checks for us holding the lock.

But fine if it only gets removed in the vm immutable patch.

> -		__assign_ppgtt(ctx, &ppgtt->vm);
> +		__assign_ppgtt(ctx, pc->vm);
>  		mutex_unlock(&ctx->mutex);
> -
> -		i915_vm_put(&ppgtt->vm);
>  	}
>  
> -	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> +	if (pc->single_timeline) {
>  		ret = drm_syncobj_create(&ctx->syncobj,
>  					 DRM_SYNCOBJ_CREATE_SIGNALED,
>  					 NULL);
> @@ -883,6 +956,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>  			  struct drm_file *file)
>  {
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
> +	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  	int err;
>  	u32 id;
> @@ -892,7 +966,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>  	/* 0 reserved for invalid/unassigned ppgtt */
>  	xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
>  
> -	ctx = i915_gem_create_context(i915, 0);
> +	pc = proto_context_create(i915, 0);
> +	if (IS_ERR(pc)) {
> +		err = PTR_ERR(pc);
> +		goto err;
> +	}
> +
> +	ctx = i915_gem_create_context(i915, pc);
> +	proto_context_close(pc);
>  	if (IS_ERR(ctx)) {
>  		err = PTR_ERR(ctx);
>  		goto err;
> @@ -1884,6 +1965,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_private *i915 = to_i915(dev);
>  	struct drm_i915_gem_context_create_ext *args = data;
> +	struct i915_gem_proto_context *pc;
>  	struct create_ext ext_data;
>  	int ret;
>  	u32 id;
> @@ -1906,7 +1988,12 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  		return -EIO;
>  	}
>  
> -	ext_data.ctx = i915_gem_create_context(i915, args->flags);
> +	pc = proto_context_create(i915, args->flags);
> +	if (IS_ERR(pc))
> +		return PTR_ERR(pc);
> +
> +	ext_data.ctx = i915_gem_create_context(i915, pc);
> +	proto_context_close(pc);
>  	if (IS_ERR(ext_data.ctx))
>  		return PTR_ERR(ext_data.ctx);
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index df76767f0c41b..a42c429f94577 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -46,6 +46,27 @@ struct i915_gem_engines_iter {
>  	const struct i915_gem_engines *engines;
>  };
>  
> +/**
> + * struct i915_gem_proto_context - prototype context
> + *
> + * The struct i915_gem_proto_context represents the creation parameters for
> + * an i915_gem_context.  This is used to gather parameters provided either
> + * through creation flags or via SET_CONTEXT_PARAM so that, when we create
> + * the final i915_gem_context, those parameters can be immutable.

The patch that puts them on an xa should explain how the locking here
works, even if it's rather trivial.

> + */
> +struct i915_gem_proto_context {
> +	/** @vm: See i915_gem_context::vm */
> +	struct i915_address_space *vm;
> +
> +	/** @user_flags: See i915_gem_context::user_flags */
> +	unsigned long user_flags;
> +
> +	/** @sched: See i915_gem_context::sched */
> +	struct i915_sched_attr sched;
> +

To avoid the kerneldoc warning, point at your emulated syncobj here.

Also this file isn't included in the i915 context docs (why would it be,
the docs have been left dead for years after all :-/). Please fix that in
a prep patch.

> +	bool single_timeline;
> +};
> +
>  /**
>   * struct i915_gem_context - client state
>   *
> diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> index 51b5a3421b400..e0f512ef7f3c6 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> @@ -80,11 +80,17 @@ void mock_init_contexts(struct drm_i915_private *i915)
>  struct i915_gem_context *
>  live_context(struct drm_i915_private *i915, struct file *file)
>  {
> +	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  	int err;
>  	u32 id;
>  
> -	ctx = i915_gem_create_context(i915, 0);
> +	pc = proto_context_create(i915, 0);
> +	if (IS_ERR(pc))
> +		return ERR_CAST(pc);
> +
> +	ctx = i915_gem_create_context(i915, pc);
> +	proto_context_close(pc);
>  	if (IS_ERR(ctx))
>  		return ctx;
>  
> @@ -142,8 +148,14 @@ struct i915_gem_context *
>  kernel_context(struct drm_i915_private *i915)
>  {
>  	struct i915_gem_context *ctx;
> +	struct i915_gem_proto_context *pc;
> +
> +	pc = proto_context_create(i915, 0);
> +	if (IS_ERR(pc))
> +		return ERR_CAST(pc);
>  
> -	ctx = i915_gem_create_context(i915, 0);
> +	ctx = i915_gem_create_context(i915, pc);
> +	proto_context_close(pc);
>  	if (IS_ERR(ctx))
>  		return ctx;

With all comments addressed: Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  
> -- 
> 2.31.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread


* Re: [Intel-gfx] [PATCH 14/21] drm/i915/gem: Return an error ptr from context_lookup
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 13:27     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 13:27 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:24PM -0500, Jason Ekstrand wrote:
> We're about to start doing lazy context creation which means contexts
> get created in i915_gem_context_lookup and we may start having more
> errors than -ENOENT.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c    | 12 ++++++------
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |  4 ++--
>  drivers/gpu/drm/i915/i915_drv.h                |  2 +-
>  drivers/gpu/drm/i915/i915_perf.c               |  4 ++--
>  4 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 3e883daab93bf..7929d5a8be449 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -2105,8 +2105,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  	int ret = 0;
>  
>  	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> -	if (!ctx)
> -		return -ENOENT;
> +	if (IS_ERR(ctx))
> +		return PTR_ERR(ctx);
>  
>  	switch (args->param) {
>  	case I915_CONTEXT_PARAM_GTT_SIZE:
> @@ -2174,8 +2174,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  	int ret;
>  
>  	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> -	if (!ctx)
> -		return -ENOENT;
> +	if (IS_ERR(ctx))
> +		return PTR_ERR(ctx);
>  
>  	ret = ctx_setparam(file_priv, ctx, args);
>  
> @@ -2194,8 +2194,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
>  		return -EINVAL;
>  
>  	ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
> -	if (!ctx)
> -		return -ENOENT;
> +	if (IS_ERR(ctx))
> +		return PTR_ERR(ctx);
>  
>  	/*
>  	 * We opt for unserialised reads here. This may result in tearing
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 7024adcd5cf15..de14b26f3b2d5 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -739,8 +739,8 @@ static int eb_select_context(struct i915_execbuffer *eb)
>  	struct i915_gem_context *ctx;
>  
>  	ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
> -	if (unlikely(!ctx))
> -		return -ENOENT;
> +	if (unlikely(IS_ERR(ctx)))
> +		return PTR_ERR(ctx);
>  
>  	eb->gem_context = ctx;
>  	if (rcu_access_pointer(ctx->vm))
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8571c5c1509a7..004ed0e59c999 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h

I just realized that I think __i915_gem_context_lookup_rcu doesn't have
users anymore. Please make sure it's deleted.

> @@ -1851,7 +1851,7 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
>  		ctx = NULL;
>  	rcu_read_unlock();
>  
> -	return ctx;
> +	return ctx ? ctx : ERR_PTR(-ENOENT);
>  }
>  
>  /* i915_gem_evict.c */
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 85ad62dbabfab..b86ed03f6a705 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
>  		struct drm_i915_file_private *file_priv = file->driver_priv;
>  
>  		specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
> -		if (!specific_ctx) {
> +		if (IS_ERR(specific_ctx)) {
>  			DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
>  				  ctx_handle);
> -			ret = -ENOENT;
> +			ret = PTR_ERR(specific_ctx);

Yeah this looks like a nice place to integrate this.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

One thing we need to make sure in the next patch or thereabouts is that
lookup can only return ENOENT or ENOMEM, but never EINVAL. I'll drop some
bikesheds on that :-)
-Daniel

>  			goto err;
>  		}
>  	}
> -- 
> 2.31.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 06/21] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v3)
  2021-04-29 12:08           ` Daniel Vetter
@ 2021-04-29 14:47             ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 14:47 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Tvrtko Ursulin, Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 7:08 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Apr 29, 2021 at 09:06:47AM +0100, Tvrtko Ursulin wrote:
> >
> > On 28/04/2021 18:26, Jason Ekstrand wrote:
> > > On Wed, Apr 28, 2021 at 10:49 AM Tvrtko Ursulin
> > > <tvrtko.ursulin@linux.intel.com> wrote:
> > > >
> > > >
> > > > On 23/04/2021 23:31, Jason Ekstrand wrote:
> > > > > This API is entirely unnecessary and I'd love to get rid of it.  If
> > > > > userspace wants a single timeline across multiple contexts, they can
> > > > > either use implicit synchronization or a syncobj, both of which existed
> > > > > at the time this feature landed.  The justification given at the time
> > > > > was that it would help GL drivers which are inherently single-timeline.
> > > > > However, neither of our GL drivers actually wanted the feature.  i965
> > > > > was already in maintenance mode at the time and iris uses syncobj for
> > > > > everything.
> > > > >
> > > > > Unfortunately, as much as I'd love to get rid of it, it is used by the
> > > > > media driver so we can't do that.  We can, however, do the next-best
> > > > > thing which is to embed a syncobj in the context and do exactly what
> > > > > we'd expect from userspace internally.  This isn't an entirely identical
> > > > > implementation because it's no longer atomic if userspace races with
> > > > > itself by calling execbuffer2 twice simultaneously from different
> > > > > threads.  It won't crash in that case; it just doesn't guarantee any
> > > > > ordering between those two submits.
> > > >
> > > > 1)
> > > >
> > > > Please also mention the difference in context/timeline name when
> > > > observed via the sync file API.
> > > >
> > > > 2)
> > > >
> > > > I don't remember what we have concluded in terms of observable effects
> > > > in sync_file_merge?
> > >
> > > I don't see how either of these are observable since this syncobj is
> > > never exposed to userspace in any way.  Please help me understand what
> > > I'm missing here.
> >
> > Single timeline context - two execbufs - return two out fences.
> >
> > Before the patch those two had the same fence context, with the patch they
> > have different ones.
> >
> > Fence context is visible to userspace via sync file info (timeline name at
> > least) and rules in sync_file_merge.

Thanks!  How about adding this to the commit message:

It also means that sync files exported from different engines on a
SINGLE_TIMELINE context will have different fence contexts.  This is
visible to userspace if it looks at the obj_name field of
sync_fence_info.

I don't think sync_file_merge is significantly affected.  If there are
ever multiple fences in a sync_file, it gets a driver name of
"dma_fence_array" and an object name of "unbound".

--Jason

> Good point worth mentioning in the commit message.
>
> media-driver doesn't use any of this in combination with single_timeline,
> so we just don't care.
> -Daniel
>
> >
> > Regards,
> >
> > Tvrtko
> >
> > >
> > > --Jason
> > >
> > >
> > > > Regards,
> > > >
> > > > Tvrtko
> > > >
> > > > > Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
> > > > > advantages beyond mere annoyance.  One is that intel_timeline is no
> > > > > longer an api-visible object and can remain entirely an implementation
> > > > > detail.  This may be advantageous as we make scheduler changes going
> > > > > forward.  Second is that, together with deleting the CLONE_CONTEXT API,
> > > > > we should now have a 1:1 mapping between intel_context and
> > > > > intel_timeline which may help us reduce locking.
> > > > >
> > > > > v2 (Jason Ekstrand):
> > > > >    - Update the comment on i915_gem_context::syncobj to mention that it's
> > > > >      an emulation and the possible race if userspace calls execbuffer2
> > > > >      twice on the same context concurrently.
> > > > >    - Wrap the checks for eb.gem_context->syncobj in unlikely()
> > > > >    - Drop the dma_fence reference
> > > > >    - Improved commit message
> > > > >
> > > > > v3 (Jason Ekstrand):
> > > > >    - Move the dma_fence_put() to before the error exit
> > > > >
> > > > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > > > ---
> > > > >    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 49 +++++--------------
> > > > >    .../gpu/drm/i915/gem/i915_gem_context_types.h | 14 +++++-
> > > > >    .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 16 ++++++
> > > > >    3 files changed, 40 insertions(+), 39 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > index 2c2fefa912805..a72c9b256723b 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > @@ -67,6 +67,8 @@
> > > > >    #include <linux/log2.h>
> > > > >    #include <linux/nospec.h>
> > > > >
> > > > > +#include <drm/drm_syncobj.h>
> > > > > +
> > > > >    #include "gt/gen6_ppgtt.h"
> > > > >    #include "gt/intel_context.h"
> > > > >    #include "gt/intel_context_param.h"
> > > > > @@ -225,10 +227,6 @@ static void intel_context_set_gem(struct intel_context *ce,
> > > > >                ce->vm = vm;
> > > > >        }
> > > > >
> > > > > -     GEM_BUG_ON(ce->timeline);
> > > > > -     if (ctx->timeline)
> > > > > -             ce->timeline = intel_timeline_get(ctx->timeline);
> > > > > -
> > > > >        if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> > > > >            intel_engine_has_timeslices(ce->engine))
> > > > >                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> > > > > @@ -351,9 +349,6 @@ void i915_gem_context_release(struct kref *ref)
> > > > >        mutex_destroy(&ctx->engines_mutex);
> > > > >        mutex_destroy(&ctx->lut_mutex);
> > > > >
> > > > > -     if (ctx->timeline)
> > > > > -             intel_timeline_put(ctx->timeline);
> > > > > -
> > > > >        put_pid(ctx->pid);
> > > > >        mutex_destroy(&ctx->mutex);
> > > > >
> > > > > @@ -570,6 +565,9 @@ static void context_close(struct i915_gem_context *ctx)
> > > > >        if (vm)
> > > > >                i915_vm_close(vm);
> > > > >
> > > > > +     if (ctx->syncobj)
> > > > > +             drm_syncobj_put(ctx->syncobj);
> > > > > +
> > > > >        ctx->file_priv = ERR_PTR(-EBADF);
> > > > >
> > > > >        /*
> > > > > @@ -765,33 +763,11 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
> > > > >                i915_vm_close(vm);
> > > > >    }
> > > > >
> > > > > -static void __set_timeline(struct intel_timeline **dst,
> > > > > -                        struct intel_timeline *src)
> > > > > -{
> > > > > -     struct intel_timeline *old = *dst;
> > > > > -
> > > > > -     *dst = src ? intel_timeline_get(src) : NULL;
> > > > > -
> > > > > -     if (old)
> > > > > -             intel_timeline_put(old);
> > > > > -}
> > > > > -
> > > > > -static void __apply_timeline(struct intel_context *ce, void *timeline)
> > > > > -{
> > > > > -     __set_timeline(&ce->timeline, timeline);
> > > > > -}
> > > > > -
> > > > > -static void __assign_timeline(struct i915_gem_context *ctx,
> > > > > -                           struct intel_timeline *timeline)
> > > > > -{
> > > > > -     __set_timeline(&ctx->timeline, timeline);
> > > > > -     context_apply_all(ctx, __apply_timeline, timeline);
> > > > > -}
> > > > > -
> > > > >    static struct i915_gem_context *
> > > > >    i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > > > >    {
> > > > >        struct i915_gem_context *ctx;
> > > > > +     int ret;
> > > > >
> > > > >        if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> > > > >            !HAS_EXECLISTS(i915))
> > > > > @@ -820,16 +796,13 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > > > >        }
> > > > >
> > > > >        if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> > > > > -             struct intel_timeline *timeline;
> > > > > -
> > > > > -             timeline = intel_timeline_create(&i915->gt);
> > > > > -             if (IS_ERR(timeline)) {
> > > > > +             ret = drm_syncobj_create(&ctx->syncobj,
> > > > > +                                      DRM_SYNCOBJ_CREATE_SIGNALED,
> > > > > +                                      NULL);
> > > > > +             if (ret) {
> > > > >                        context_close(ctx);
> > > > > -                     return ERR_CAST(timeline);
> > > > > +                     return ERR_PTR(ret);
> > > > >                }
> > > > > -
> > > > > -             __assign_timeline(ctx, timeline);
> > > > > -             intel_timeline_put(timeline);
> > > > >        }
> > > > >
> > > > >        trace_i915_context_create(ctx);
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > > > > index 676592e27e7d2..df76767f0c41b 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > > > > @@ -83,7 +83,19 @@ struct i915_gem_context {
> > > > >        struct i915_gem_engines __rcu *engines;
> > > > >        struct mutex engines_mutex; /* guards writes to engines */
> > > > >
> > > > > -     struct intel_timeline *timeline;
> > > > > +     /**
> > > > > +      * @syncobj: Shared timeline syncobj
> > > > > +      *
> > > > > +      * When the SHARED_TIMELINE flag is set on context creation, we
> > > > > +      * emulate a single timeline across all engines using this syncobj.
> > > > > +      * For every execbuffer2 call, this syncobj is used as both an in-
> > > > > +      * and out-fence.  Unlike the real intel_timeline, this doesn't
> > > > > +      * provide perfect atomic in-order guarantees if the client races
> > > > > +      * with itself by calling execbuffer2 twice concurrently.  However,
> > > > > +      * if userspace races with itself, that's not likely to yield well-
> > > > > +      * defined results anyway so we choose to not care.
> > > > > +      */
> > > > > +     struct drm_syncobj *syncobj;
> > > > >
> > > > >        /**
> > > > >         * @vm: unique address space (GTT)
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > index b812f313422a9..d640bba6ad9ab 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > > > @@ -3460,6 +3460,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > >                goto err_vma;
> > > > >        }
> > > > >
> > > > > +     if (unlikely(eb.gem_context->syncobj)) {
> > > > > +             struct dma_fence *fence;
> > > > > +
> > > > > +             fence = drm_syncobj_fence_get(eb.gem_context->syncobj);
> > > > > +             err = i915_request_await_dma_fence(eb.request, fence);
> > > > > +             dma_fence_put(fence);
> > > > > +             if (err)
> > > > > +                     goto err_ext;
> > > > > +     }
> > > > > +
> > > > >        if (in_fence) {
> > > > >                if (args->flags & I915_EXEC_FENCE_SUBMIT)
> > > > >                        err = i915_request_await_execution(eb.request,
> > > > > @@ -3517,6 +3527,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > > > >                        fput(out_fence->file);
> > > > >                }
> > > > >        }
> > > > > +
> > > > > +     if (unlikely(eb.gem_context->syncobj)) {
> > > > > +             drm_syncobj_replace_fence(eb.gem_context->syncobj,
> > > > > +                                       &eb.request->fence);
> > > > > +     }
> > > > > +
> > > > >        i915_request_put(eb.request);
> > > > >
> > > > >    err_vma:
> > > > >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-29  8:04         ` Tvrtko Ursulin
@ 2021-04-29 14:54           ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 14:54 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 3:04 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 28/04/2021 18:24, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >> On 23/04/2021 23:31, Jason Ekstrand wrote:
> >>> Instead of handling it like a context param, unconditionally set it when
> >>> intel_contexts are created.  This doesn't fix anything but does simplify
> >>> the code a bit.
> >>>
> >>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> >>> ---
> >>>    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
> >>>    .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
> >>>    drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
> >>>    3 files changed, 6 insertions(+), 44 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>> index 35bcdeddfbf3f..1091cc04a242a 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
> >>>            intel_engine_has_timeslices(ce->engine))
> >>>                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> >>>
> >>> -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> >>> +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> >>> +         ctx->i915->params.request_timeout_ms) {
> >>> +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> >>> +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> >>
> >> Blank line between declarations and code please, or just lose the local.
> >>
> >> Otherwise looks okay. The slight change that the same GEM context can
> >> now have a mix of different request expirations isn't interesting, I
> >> think. At least the change goes away by the end of the series.
> >
> > In order for that to happen, I think you'd have to have a race between
> > CREATE_CONTEXT and someone smashing the request_timeout_ms param via
> > sysfs.  Or am I missing something?  Given that timeouts are really
> > per-engine anyway, I don't think we need to care too much about that.
>
> We don't care, no.
>
> For completeness only - by the end of the series it is as you say. At
> _this_ point in the series, though, it can happen if the modparam
> changes at any point between context creation and replacing engines.
> That is a change compared to before this patch, since the modparam was
> cached in the GEM context, so one GEM context had a single
> request_timeout_ms.

I've added the following to the commit message:

It also means that sync files exported from different engines on a
SINGLE_TIMELINE context will have different fence contexts.  This is
visible to userspace if it looks at the obj_name field of
sync_fence_info.

How's that sound?

--Jason

> Regards,
>
> Tvrtko
>
> > --Jason
> >
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>> +     }
> >>>    }
> >>>
> >>>    static void __free_engines(struct i915_gem_engines *e, unsigned int count)
> >>> @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
> >>>        context_apply_all(ctx, __apply_timeline, timeline);
> >>>    }
> >>>
> >>> -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
> >>> -{
> >>> -     return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
> >>> -}
> >>> -
> >>> -static int
> >>> -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
> >>> -{
> >>> -     int ret;
> >>> -
> >>> -     ret = context_apply_all(ctx, __apply_watchdog,
> >>> -                             (void *)(uintptr_t)timeout_us);
> >>> -     if (!ret)
> >>> -             ctx->watchdog.timeout_us = timeout_us;
> >>> -
> >>> -     return ret;
> >>> -}
> >>> -
> >>> -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
> >>> -{
> >>> -     struct drm_i915_private *i915 = ctx->i915;
> >>> -     int ret;
> >>> -
> >>> -     if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
> >>> -         !i915->params.request_timeout_ms)
> >>> -             return;
> >>> -
> >>> -     /* Default expiry for user fences. */
> >>> -     ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
> >>> -     if (ret)
> >>> -             drm_notice(&i915->drm,
> >>> -                        "Failed to configure default fence expiry! (%d)",
> >>> -                        ret);
> >>> -}
> >>> -
> >>>    static struct i915_gem_context *
> >>>    i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> >>>    {
> >>> @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> >>>                intel_timeline_put(timeline);
> >>>        }
> >>>
> >>> -     __set_default_fence_expiry(ctx);
> >>> -
> >>>        trace_i915_context_create(ctx);
> >>>
> >>>        return ctx;
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> >>> index 5ae71ec936f7c..676592e27e7d2 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> >>> @@ -153,10 +153,6 @@ struct i915_gem_context {
> >>>         */
> >>>        atomic_t active_count;
> >>>
> >>> -     struct {
> >>> -             u64 timeout_us;
> >>> -     } watchdog;
> >>> -
> >>>        /**
> >>>         * @hang_timestamp: The last time(s) this context caused a GPU hang
> >>>         */
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> >>> index dffedd983693d..0c69cb42d075c 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> >>> @@ -10,11 +10,10 @@
> >>>
> >>>    #include "intel_context.h"
> >>>
> >>> -static inline int
> >>> +static inline void
> >>>    intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
> >>>    {
> >>>        ce->watchdog.timeout_us = timeout_us;
> >>> -     return 0;
> >>>    }
> >>>
> >>>    #endif /* INTEL_CONTEXT_PARAM_H */
> >>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 15/21] drm/i915/gt: Drop i915_address_space::file
  2021-04-29 12:37     ` [Intel-gfx] " Daniel Vetter
@ 2021-04-29 15:26       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 15:26 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Mailing list - DRI developers

On Thu, Apr 29, 2021 at 7:37 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Apr 23, 2021 at 05:31:25PM -0500, Jason Ekstrand wrote:
> > There's a big comment saying how useful it is but no one is using this
> > for anything.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>
> I was trying to find anything before all your deletions, but alas nothing.
> I did spend a bit of time on this and discovered that the debugfs use was
> nuked in
>
> db80a1294c23 ("drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects")
>
> After going through quite a few iterations, e.g.
>
> 5b5efdf79abf ("drm/i915: Make debugfs/per_file_stats scale better")
> f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs")
>
> The above removed the need for vm->file because the stats debugfs file
> filtered using stats->vm instead of stats->file.
>
> History goes on until the original introduction of this (again for
> debugfs) in
>
> 2bfa996e031b ("drm/i915: Store owning file on the i915_address_space")

I've added the following to the commit message:

    It was added in 2bfa996e031b ("drm/i915: Store owning file on the
    i915_address_space") and used for debugfs at the time as well as telling
    the difference between the global GTT and a PPGTT.  In f6e8aa387171
    ("drm/i915: Report the number of closed vma held by each context in
    debugfs") we removed one use of it by switching to a context walk and
    comparing with the VM in the context.  Finally, VM stats for debugfs
    were entirely nuked in db80a1294c23 ("drm/i915/gem: Remove per-client
    stats from debugfs/i915_gem_objects")


> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c |  9 ---------
> >  drivers/gpu/drm/i915/gt/intel_gtt.h         | 10 ----------
> >  drivers/gpu/drm/i915/selftests/mock_gtt.c   |  1 -
> >  3 files changed, 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 7929d5a8be449..db9153e0f85a7 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -921,17 +921,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
> >                               u32 *id)
> >  {
> >       struct drm_i915_private *i915 = ctx->i915;
> > -     struct i915_address_space *vm;
> >       int ret;
> >
> >       ctx->file_priv = fpriv;
> >
> > -     mutex_lock(&ctx->mutex);
> > -     vm = i915_gem_context_vm(ctx);
> > -     if (vm)
> > -             WRITE_ONCE(vm->file, fpriv); /* XXX */
> > -     mutex_unlock(&ctx->mutex);
> > -
> >       ctx->pid = get_task_pid(current, PIDTYPE_PID);
> >       snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
> >                current->comm, pid_nr(ctx->pid));
> > @@ -1030,8 +1023,6 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
> >       if (IS_ERR(ppgtt))
> >               return PTR_ERR(ppgtt);
> >
> > -     ppgtt->vm.file = file_priv;
> > -
> >       if (args->extensions) {
> >               err = i915_user_extensions(u64_to_user_ptr(args->extensions),
> >                                          NULL, 0,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > index e67e34e179131..4c46068e63c9d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > @@ -217,16 +217,6 @@ struct i915_address_space {
>
> Please also delete the drm_i915_file_private pre-declaration in this file.

Done!

> With this added and the history adequately covered in the commit message:
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Thanks

--Jason


>
> >       struct intel_gt *gt;
> >       struct drm_i915_private *i915;
> >       struct device *dma;
> > -     /*
> > -      * Every address space belongs to a struct file - except for the global
> > -      * GTT that is owned by the driver (and so @file is set to NULL). In
> > -      * principle, no information should leak from one context to another
> > -      * (or between files/processes etc) unless explicitly shared by the
> > -      * owner. Tracking the owner is important in order to free up per-file
> > -      * objects along with the file, to aide resource tracking, and to
> > -      * assign blame.
> > -      */
> > -     struct drm_i915_file_private *file;
> >       u64 total;              /* size addr space maps (ex. 2GB for ggtt) */
> >       u64 reserved;           /* size addr space reserved */
> >
> > diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> > index 5c7ae40bba634..cc047ec594f93 100644
> > --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
> > +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> > @@ -73,7 +73,6 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
> >       ppgtt->vm.gt = &i915->gt;
> >       ppgtt->vm.i915 = i915;
> >       ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE);
> > -     ppgtt->vm.file = ERR_PTR(-ENODEV);
> >       ppgtt->vm.dma = i915->drm.dev;
> >
> >       i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
> > --
> > 2.31.1
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 14/21] drm/i915/gem: Return an error ptr from context_lookup
  2021-04-29 13:27     ` Daniel Vetter
@ 2021-04-29 15:29       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 15:29 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Mailing list - DRI developers

On Thu, Apr 29, 2021 at 8:27 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Apr 23, 2021 at 05:31:24PM -0500, Jason Ekstrand wrote:
> > We're about to start doing lazy context creation which means contexts
> > get created in i915_gem_context_lookup and we may start having more
> > errors than -ENOENT.
> >
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c    | 12 ++++++------
> >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |  4 ++--
> >  drivers/gpu/drm/i915/i915_drv.h                |  2 +-
> >  drivers/gpu/drm/i915/i915_perf.c               |  4 ++--
> >  4 files changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 3e883daab93bf..7929d5a8be449 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -2105,8 +2105,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
> >       int ret = 0;
> >
> >       ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> > -     if (!ctx)
> > -             return -ENOENT;
> > +     if (IS_ERR(ctx))
> > +             return PTR_ERR(ctx);
> >
> >       switch (args->param) {
> >       case I915_CONTEXT_PARAM_GTT_SIZE:
> > @@ -2174,8 +2174,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> >       int ret;
> >
> >       ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> > -     if (!ctx)
> > -             return -ENOENT;
> > +     if (IS_ERR(ctx))
> > +             return PTR_ERR(ctx);
> >
> >       ret = ctx_setparam(file_priv, ctx, args);
> >
> > @@ -2194,8 +2194,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
> >               return -EINVAL;
> >
> >       ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
> > -     if (!ctx)
> > -             return -ENOENT;
> > +     if (IS_ERR(ctx))
> > +             return PTR_ERR(ctx);
> >
> >       /*
> >        * We opt for unserialised reads here. This may result in tearing
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index 7024adcd5cf15..de14b26f3b2d5 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -739,8 +739,8 @@ static int eb_select_context(struct i915_execbuffer *eb)
> >       struct i915_gem_context *ctx;
> >
> >       ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
> > -     if (unlikely(!ctx))
> > -             return -ENOENT;
> > +     if (unlikely(IS_ERR(ctx)))
> > +             return PTR_ERR(ctx);
> >
> >       eb->gem_context = ctx;
> >       if (rcu_access_pointer(ctx->vm))
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 8571c5c1509a7..004ed0e59c999 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
>
> I just realized that I think __i915_gem_context_lookup_rcu doesn't have
> users anymore. Please make sure it's deleted.

I deleted it in "drm/i915: Stop manually RCU banging in reset_stats_ioctl"


> > @@ -1851,7 +1851,7 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> >               ctx = NULL;
> >       rcu_read_unlock();
> >
> > -     return ctx;
> > +     return ctx ? ctx : ERR_PTR(-ENOENT);
> >  }
> >
> >  /* i915_gem_evict.c */
> > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> > index 85ad62dbabfab..b86ed03f6a705 100644
> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
> >               struct drm_i915_file_private *file_priv = file->driver_priv;
> >
> >               specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
> > -             if (!specific_ctx) {
> > +             if (IS_ERR(specific_ctx)) {
> >                       DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
> >                                 ctx_handle);
> > -                     ret = -ENOENT;
> > +                     ret = PTR_ERR(specific_ctx);
>
> Yeah this looks like a nice place to integrate this.
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> One thing we need to make sure in the next patch or thereabouts is that
> lookup can only return ENOENT or ENOMEM, but never EINVAL. I'll drop some
> bikesheds on that :-)

I believe that is the case.  All -EINVAL should be handled in the
proto-context code.

--Jason

> -Daniel
>
> >                       goto err;
> >               }
> >       }
> > --
> > 2.31.1
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
> >               struct drm_i915_file_private *file_priv = file->driver_priv;
> >
> >               specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
> > -             if (!specific_ctx) {
> > +             if (IS_ERR(specific_ctx)) {
> >                       DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
> >                                 ctx_handle);
> > -                     ret = -ENOENT;
> > +                     ret = PTR_ERR(specific_ctx);
>
> Yeah this looks like a nice place to integrate this.
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> One thing we need to make sure in the next patch or thereabouts is that
> lookup can only return ENOENT or ENOMEM, but never EINVAL. I'll drop some
> bikesheds on that :-)

I believe that is the case.  All -EINVAL should be handled in the
proto-context code.

--Jason

> -Daniel
>
> >                       goto err;
> >               }
> >       }
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-29 12:54         ` Tvrtko Ursulin
@ 2021-04-29 15:41           ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 15:41 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 7:54 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 29/04/2021 13:24, Daniel Vetter wrote:
> > On Wed, Apr 28, 2021 at 04:51:19PM +0100, Tvrtko Ursulin wrote:
> >>
> >> On 23/04/2021 23:31, Jason Ekstrand wrote:
> >>> This adds a bunch of complexity which the media driver has never
> >>> actually used.  The media driver does technically bond a balanced engine
> >>> to another engine but the balanced engine only has one engine in the
> >>> sibling set.  This doesn't actually result in a virtual engine.
> >>
> >> For historical reference, this is not because uapi was over-engineered but
> >> because certain SKUs never materialized.
> >
> > Jason said that for SKUs with lots of media engines the media driver sets
> > up a set of contexts in userspace with all the pairings (and I guess then
> > load-balances in userspace or something like that). Tony Ye also seems to
> > have confirmed that. So I'm not clear on which SKU this is?
>
> Not sure if I should disclose it here. But anyway, the platform which is
> currently in upstream and was supposed to be the first to use this uapi
> was supposed to have at least 4 vcs engines initially, or even 8 vcs + 4
> vecs at some point. That was the requirement the uapi was designed for. For
> that kind of platform there were supposed to be two virtual engines
> created, with bonding; for instance parent = [vcs0, vcs2], child =
> [vcs1, vcs3]; bonds = [vcs0 - vcs1, vcs2 - vcs3]. With more engines the
> merrier.

I've added the following to the commit message:

    This functionality was originally added to handle cases where we may
    have more than two video engines and media might want to load-balance
    their bonded submits by, for instance, submitting to a balanced vcs0-1
    as the primary and then vcs2-3 as the secondary.  However, no such
    hardware has shipped thus far and, if we ever want to enable such
    use-cases in the future, we'll use the up-and-coming parallel submit API
    which targets GuC submission.

--Jason

> Userspace load balancing, from memory, came into the picture only as a
> consequence of balancing between two types of media pipelines, which was
> either working around rcs contention or the lack of sfc, or both. Along
> the lines of: one stage of a media pipeline can be done either as GPGPU
> work or on the media engine, and so userspace was deciding to spawn "a
> bit of these and a bit of those" to utilise all the GPU blocks. Not
> really about frame-split virtual engines and bonding, but a completely
> different kind of load balancing, between gpgpu and fixed pipeline.

> > Or maybe the real deal is only future platforms, and there we have GuC
> > scheduler backend.
>
> Yes, because SKUs never materialised.
>
> > Not against adding a bit more context to the commit message, but we need
> > to make sure what we put there is actually correct. Maybe best to ask
> > Tony/Carl as part of getting an ack from them.
>
> I think there is no need - the fact that the uapi was designed for way
> more engines than we got to have is straightforward enough.
>
> The only unasked-for flexibility in the uapi was that bonding can
> express any dependency and not only the N consecutive engines the media
> fixed function needed at the time. I say "at the time" because in fact the
> "consecutive" engines requirement also got more complicated / broken in
> a following gen (via fusing and logical instance remapping), proving the
> point that having the uapi disassociated from the hw limitations of the
> _day_ was a good call.
>
> Regards,
>
> Tvrtko

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 15:51     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 15:51 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

Yeah this needs some text to explain what/why you're doing this, and maybe
some rough sketch of the locking design.

On Fri, Apr 23, 2021 at 05:31:26PM -0500, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 657 ++++++++++++++++--
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
>  .../gpu/drm/i915/gem/i915_gem_context_types.h |  26 +
>  .../gpu/drm/i915/gem/selftests/mock_context.c |   5 +-
>  drivers/gpu/drm/i915/i915_drv.h               |  17 +-
>  5 files changed, 648 insertions(+), 60 deletions(-)

So I think the patch split here is a bit unfortunate, because you're
adding the new vm/engine validation code for proto context here, but the
old stuff is only removed in the next patches that make vm/engines
immutable after first use.

I think a better split would be if this patch here only has all the
scaffolding. You already have the EOPNOTSUPP fallback (which I hope gets
removed), so moving the conversion entirely to later patches should be all
fine.

Or do I miss something?

I think the only concern I'm seeing is that bisectability might be a bit
lost, because we finalize the context in some cases in setparam. And if we
do the conversion in a different order than the one media uses for its
setparam, then later setparam might fail because the context is finalized
already. But also
- it's just bisectability of media functionality I think
- just check which order media calls CTX_SETPARAM and use that to do the
  conversion

And we should be fine ... I think?

Some more thoughts below, but the proto ctx stuff itself looks fine.

> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index db9153e0f85a7..aa8e61211924f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915,
>  
>  static void proto_context_close(struct i915_gem_proto_context *pc)
>  {
> +	int i;
> +
>  	if (pc->vm)
>  		i915_vm_put(pc->vm);
> +	if (pc->user_engines) {
> +		for (i = 0; i < pc->num_user_engines; i++)
> +			kfree(pc->user_engines[i].siblings);
> +		kfree(pc->user_engines);
> +	}
>  	kfree(pc);
>  }
>  
> @@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
>  	proto_context_set_persistence(i915, pc, true);
>  	pc->sched.priority = I915_PRIORITY_NORMAL;
>  
> +	pc->num_user_engines = -1;
> +	pc->user_engines = NULL;
> +
>  	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
>  		pc->single_timeline = true;
>  
>  	return pc;
>  }
>  
> +static int proto_context_register_locked(struct drm_i915_file_private *fpriv,
> +					 struct i915_gem_proto_context *pc,
> +					 u32 *id)
> +{
> +	int ret;
> +	void *old;

assert_lock_held just for consistency.

> +
> +	ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL);
> +	if (ret)
> +		return ret;
> +
> +	old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL);
> +	if (xa_is_err(old)) {
> +		xa_erase(&fpriv->context_xa, *id);
> +		return xa_err(old);
> +	}
> +	GEM_BUG_ON(old);
> +
> +	return 0;
> +}
> +
> +static int proto_context_register(struct drm_i915_file_private *fpriv,
> +				  struct i915_gem_proto_context *pc,
> +				  u32 *id)
> +{
> +	int ret;
> +
> +	mutex_lock(&fpriv->proto_context_lock);
> +	ret = proto_context_register_locked(fpriv, pc, id);
> +	mutex_unlock(&fpriv->proto_context_lock);
> +
> +	return ret;
> +}
> +
> +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
> +			    struct i915_gem_proto_context *pc,
> +			    const struct drm_i915_gem_context_param *args)
> +{
> +	struct i915_address_space *vm;
> +
> +	if (args->size)
> +		return -EINVAL;
> +
> +	if (!pc->vm)
> +		return -ENODEV;
> +
> +	if (upper_32_bits(args->value))
> +		return -ENOENT;
> +
> +	rcu_read_lock();
> +	vm = xa_load(&fpriv->vm_xa, args->value);
> +	if (vm && !kref_get_unless_zero(&vm->ref))
> +		vm = NULL;
> +	rcu_read_unlock();
> +	if (!vm)
> +		return -ENOENT;
> +
> +	i915_vm_put(pc->vm);
> +	pc->vm = vm;
> +
> +	return 0;
> +}
> +
> +struct set_proto_ctx_engines {
> +	struct drm_i915_private *i915;
> +	unsigned num_engines;
> +	struct i915_gem_proto_engine *engines;
> +};
> +
> +static int
> +set_proto_ctx_engines_balance(struct i915_user_extension __user *base,
> +			      void *data)
> +{
> +	struct i915_context_engines_load_balance __user *ext =
> +		container_of_user(base, typeof(*ext), base);
> +	const struct set_proto_ctx_engines *set = data;
> +	struct drm_i915_private *i915 = set->i915;
> +	struct intel_engine_cs **siblings;
> +	u16 num_siblings, idx;
> +	unsigned int n;
> +	int err;
> +
> +	if (!HAS_EXECLISTS(i915))
> +		return -ENODEV;
> +
> +	if (intel_uc_uses_guc_submission(&i915->gt.uc))
> +		return -ENODEV; /* not implemented yet */
> +
> +	if (get_user(idx, &ext->engine_index))
> +		return -EFAULT;
> +
> +	if (idx >= set->num_engines) {
> +		drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
> +			idx, set->num_engines);
> +		return -EINVAL;
> +	}
> +
> +	idx = array_index_nospec(idx, set->num_engines);
> +	if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) {
> +		drm_dbg(&i915->drm,
> +			"Invalid placement[%d], already occupied\n", idx);
> +		return -EEXIST;
> +	}
> +
> +	if (get_user(num_siblings, &ext->num_siblings))
> +		return -EFAULT;
> +
> +	err = check_user_mbz(&ext->flags);
> +	if (err)
> +		return err;
> +
> +	err = check_user_mbz(&ext->mbz64);
> +	if (err)
> +		return err;
> +
> +	if (num_siblings == 0)
> +		return 0;
> +
> +	siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL);
> +	if (!siblings)
> +		return -ENOMEM;
> +
> +	for (n = 0; n < num_siblings; n++) {
> +		struct i915_engine_class_instance ci;
> +
> +		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
> +			err = -EFAULT;
> +			goto err_siblings;
> +		}
> +
> +		siblings[n] = intel_engine_lookup_user(i915,
> +						       ci.engine_class,
> +						       ci.engine_instance);
> +		if (!siblings[n]) {
> +			drm_dbg(&i915->drm,
> +				"Invalid sibling[%d]: { class:%d, inst:%d }\n",
> +				n, ci.engine_class, ci.engine_instance);
> +			err = -EINVAL;
> +			goto err_siblings;
> +		}
> +	}
> +
> +	if (num_siblings == 1) {
> +		set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> +		set->engines[idx].engine = siblings[0];
> +		kfree(siblings);
> +	} else {
> +		set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED;
> +		set->engines[idx].num_siblings = num_siblings;
> +		set->engines[idx].siblings = siblings;
> +	}
> +
> +	return 0;
> +
> +err_siblings:
> +	kfree(siblings);
> +
> +	return err;
> +}
> +
> +static int
> +set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data)
> +{
> +	struct i915_context_engines_bond __user *ext =
> +		container_of_user(base, typeof(*ext), base);
> +	const struct set_proto_ctx_engines *set = data;
> +	struct drm_i915_private *i915 = set->i915;
> +	struct i915_engine_class_instance ci;
> +	struct intel_engine_cs *master;
> +	u16 idx, num_bonds;
> +	int err, n;
> +
> +	if (get_user(idx, &ext->virtual_index))
> +		return -EFAULT;
> +
> +	if (idx >= set->num_engines) {
> +		drm_dbg(&i915->drm,
> +			"Invalid index for virtual engine: %d >= %d\n",
> +			idx, set->num_engines);
> +		return -EINVAL;
> +	}
> +
> +	idx = array_index_nospec(idx, set->num_engines);
> +	if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) {
> +		drm_dbg(&i915->drm, "Invalid engine at %d\n", idx);
> +		return -EINVAL;
> +	}
> +
> +	if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) {
> +		drm_dbg(&i915->drm,
> +			"Bonding with virtual engines not allowed\n");
> +		return -EINVAL;
> +	}
> +
> +	err = check_user_mbz(&ext->flags);
> +	if (err)
> +		return err;
> +
> +	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
> +		err = check_user_mbz(&ext->mbz64[n]);
> +		if (err)
> +			return err;
> +	}
> +
> +	if (copy_from_user(&ci, &ext->master, sizeof(ci)))
> +		return -EFAULT;
> +
> +	master = intel_engine_lookup_user(i915,
> +					  ci.engine_class,
> +					  ci.engine_instance);
> +	if (!master) {
> +		drm_dbg(&i915->drm,
> +			"Unrecognised master engine: { class:%u, instance:%u }\n",
> +			ci.engine_class, ci.engine_instance);
> +		return -EINVAL;
> +	}
> +
> +	if (get_user(num_bonds, &ext->num_bonds))
> +		return -EFAULT;
> +
> +	for (n = 0; n < num_bonds; n++) {
> +		struct intel_engine_cs *bond;
> +
> +		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci)))
> +			return -EFAULT;
> +
> +		bond = intel_engine_lookup_user(i915,
> +						ci.engine_class,
> +						ci.engine_instance);
> +		if (!bond) {
> +			drm_dbg(&i915->drm,
> +				"Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n",
> +				n, ci.engine_class, ci.engine_instance);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = {
> +	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance,
> +	[I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond,
> +};
> +
> +static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv,
> +			         struct i915_gem_proto_context *pc,
> +			         const struct drm_i915_gem_context_param *args)
> +{
> +	struct drm_i915_private *i915 = fpriv->dev_priv;
> +	struct set_proto_ctx_engines set = { .i915 = i915 };
> +	struct i915_context_param_engines __user *user =
> +		u64_to_user_ptr(args->value);
> +	unsigned int n;
> +	u64 extensions;
> +	int err;
> +
> +	if (!args->size) {
> +		kfree(pc->user_engines);
> +		pc->num_user_engines = -1;
> +		pc->user_engines = NULL;
> +		return 0;
> +	}
> +
> +	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
> +	if (args->size < sizeof(*user) ||
> +	    !IS_ALIGNED(args->size, sizeof(*user->engines))) {
> +		drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
> +			args->size);
> +		return -EINVAL;
> +	}
> +
> +	set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> +	if (set.num_engines > I915_EXEC_RING_MASK + 1)
> +		return -EINVAL;
> +
> +	set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL);
> +	if (!set.engines)
> +		return -ENOMEM;
> +
> +	for (n = 0; n < set.num_engines; n++) {
> +		struct i915_engine_class_instance ci;
> +		struct intel_engine_cs *engine;
> +
> +		if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
> +			kfree(set.engines);
> +			return -EFAULT;
> +		}
> +
> +		memset(&set.engines[n], 0, sizeof(set.engines[n]));
> +
> +		if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
> +		    ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE)
> +			continue;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  ci.engine_class,
> +						  ci.engine_instance);
> +		if (!engine) {
> +			drm_dbg(&i915->drm,
> +				"Invalid engine[%d]: { class:%d, instance:%d }\n",
> +				n, ci.engine_class, ci.engine_instance);
> +			kfree(set.engines);
> +			return -ENOENT;
> +		}
> +
> +		set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> +		set.engines[n].engine = engine;
> +	}
> +
> +	err = -EFAULT;
> +	if (!get_user(extensions, &user->extensions))
> +		err = i915_user_extensions(u64_to_user_ptr(extensions),
> +					   set_proto_ctx_engines_extensions,
> +					   ARRAY_SIZE(set_proto_ctx_engines_extensions),
> +					   &set);
> +	if (err) {
> +		kfree(set.engines);
> +		return err;
> +	}
> +
> +	kfree(pc->user_engines);
> +	pc->num_user_engines = set.num_engines;
> +	pc->user_engines = set.engines;
> +
> +	return 0;
> +}
> +
> +static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
> +			       struct i915_gem_proto_context *pc,
> +			       struct drm_i915_gem_context_param *args)
> +{
> +	int ret = 0;
> +
> +	switch (args->param) {
> +	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (args->value)
> +			set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);

Atomic bitops like in previous patches: Pls no :-)

> +		else
> +			clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_BANNABLE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (!capable(CAP_SYS_ADMIN) && !args->value)
> +			ret = -EPERM;
> +		else if (args->value)
> +			set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> +		else
> +			clear_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_RECOVERABLE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (args->value)
> +			set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> +		else
> +			clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_PRIORITY:
> +		ret = validate_priority(fpriv->dev_priv, args);
> +		if (!ret)
> +			pc->sched.priority = args->value;
> +		break;
> +
> +	case I915_CONTEXT_PARAM_SSEU:
> +		ret = -ENOTSUPP;
> +		break;
> +
> +	case I915_CONTEXT_PARAM_VM:
> +		ret = set_proto_ctx_vm(fpriv, pc, args);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_ENGINES:
> +		ret = set_proto_ctx_engines(fpriv, pc, args);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_PERSISTENCE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (args->value)
> +			set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> +		else
> +			clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_NO_ZEROMAP:
> +	case I915_CONTEXT_PARAM_BAN_PERIOD:
> +	case I915_CONTEXT_PARAM_RINGSIZE:
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static struct i915_address_space *
>  context_get_vm_rcu(struct i915_gem_context *ctx)
>  {
> @@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
>  	return e;
>  }
>  
> +static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
> +					     unsigned int num_engines,
> +					     struct i915_gem_proto_engine *pe)
> +{
> +	struct i915_gem_engines *e;
> +	unsigned int n;
> +
> +	e = alloc_engines(num_engines);
> +	for (n = 0; n < num_engines; n++) {
> +		struct intel_context *ce;
> +
> +		switch (pe[n].type) {
> +		case I915_GEM_ENGINE_TYPE_PHYSICAL:
> +			ce = intel_context_create(pe[n].engine);
> +			break;
> +
> +		case I915_GEM_ENGINE_TYPE_BALANCED:
> +			ce = intel_execlists_create_virtual(pe[n].siblings,
> +							    pe[n].num_siblings);
> +			break;
> +
> +		case I915_GEM_ENGINE_TYPE_INVALID:
> +		default:
> +			GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID);
> +			continue;
> +		}
> +
> +		if (IS_ERR(ce)) {
> +			__free_engines(e, n);
> +			return ERR_CAST(ce);
> +		}
> +
> +		intel_context_set_gem(ce, ctx);
> +
> +		e->engines[n] = ce;
> +	}
> +	e->num_engines = num_engines;
> +
> +	return e;
> +}
> +
>  void i915_gem_context_release(struct kref *ref)
>  {
>  	struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
> @@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915,
>  		mutex_unlock(&ctx->mutex);
>  	}
>  
> +	if (pc->num_user_engines >= 0) {
> +		struct i915_gem_engines *engines;
> +
> +		engines = user_engines(ctx, pc->num_user_engines,
> +				       pc->user_engines);
> +		if (IS_ERR(engines)) {
> +			context_close(ctx);
> +			return ERR_CAST(engines);
> +		}
> +
> +		mutex_lock(&ctx->engines_mutex);
> +		i915_gem_context_set_user_engines(ctx);
> +		engines = rcu_replace_pointer(ctx->engines, engines, 1);
> +		mutex_unlock(&ctx->engines_mutex);
> +
> +		free_engines(engines);
> +	}
> +
>  	if (pc->single_timeline) {
>  		ret = drm_syncobj_create(&ctx->syncobj,
>  					 DRM_SYNCOBJ_CREATE_SIGNALED,
> @@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915)
>  	init_contexts(&i915->gem.contexts);
>  }
>  
> -static int gem_context_register(struct i915_gem_context *ctx,
> -				struct drm_i915_file_private *fpriv,
> -				u32 *id)
> +static void gem_context_register(struct i915_gem_context *ctx,
> +				 struct drm_i915_file_private *fpriv,
> +				 u32 id)
>  {
>  	struct drm_i915_private *i915 = ctx->i915;
> -	int ret;
> +	void *old;
>  
>  	ctx->file_priv = fpriv;
>  
> @@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx,
>  		 current->comm, pid_nr(ctx->pid));
>  
>  	/* And finally expose ourselves to userspace via the idr */
> -	ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
> -	if (ret)
> -		goto err_pid;
> +	old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
> +	GEM_BUG_ON(old);
>  
>  	spin_lock(&i915->gem.contexts.lock);
>  	list_add_tail(&ctx->link, &i915->gem.contexts.list);
>  	spin_unlock(&i915->gem.contexts.lock);
> -
> -	return 0;
> -
> -err_pid:
> -	put_pid(fetch_and_zero(&ctx->pid));
> -	return ret;
>  }
>  
>  int i915_gem_context_open(struct drm_i915_private *i915,
> @@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>  	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  	int err;
> -	u32 id;
>  
> -	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC);
> +	mutex_init(&file_priv->proto_context_lock);
> +	xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC);
> +
> +	/* 0 reserved for the default context */
> +	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1);
>  
>  	/* 0 reserved for invalid/unassigned ppgtt */
>  	xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
> @@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>  		goto err;
>  	}
>  
> -	err = gem_context_register(ctx, file_priv, &id);
> -	if (err < 0)
> -		goto err_ctx;
> +	gem_context_register(ctx, file_priv, 0);
>  
> -	GEM_BUG_ON(id);
>  	return 0;
>  
> -err_ctx:
> -	context_close(ctx);
>  err:
>  	xa_destroy(&file_priv->vm_xa);
>  	xa_destroy(&file_priv->context_xa);
> +	xa_destroy(&file_priv->proto_context_xa);
> +	mutex_destroy(&file_priv->proto_context_lock);
>  	return err;
>  }
>  
>  void i915_gem_context_close(struct drm_file *file)
>  {
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
> +	struct i915_gem_proto_context *pc;
>  	struct i915_address_space *vm;
>  	struct i915_gem_context *ctx;
>  	unsigned long idx;
>  
> +	xa_for_each(&file_priv->proto_context_xa, idx, pc)
> +		proto_context_close(pc);
> +	xa_destroy(&file_priv->proto_context_xa);
> +	mutex_destroy(&file_priv->proto_context_lock);
> +
>  	xa_for_each(&file_priv->context_xa, idx, ctx)
>  		context_close(ctx);
>  	xa_destroy(&file_priv->context_xa);
> @@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
>  }
>  
>  struct create_ext {
> -	struct i915_gem_context *ctx;
> +	struct i915_gem_proto_context *pc;
>  	struct drm_i915_file_private *fpriv;
>  };
>  
> @@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
>  	if (local.param.ctx_id)
>  		return -EINVAL;
>  
> -	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
> +	return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param);
>  }
>  
>  static int invalid_ext(struct i915_user_extension __user *ext, void *data)
> @@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
>  	return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
>  }
>  
> +static inline struct i915_gem_context *
> +__context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> +{
> +	struct i915_gem_context *ctx;
> +
> +	rcu_read_lock();
> +	ctx = xa_load(&file_priv->context_xa, id);
> +	if (ctx && !kref_get_unless_zero(&ctx->ref))
> +		ctx = NULL;
> +	rcu_read_unlock();
> +
> +	return ctx;
> +}
> +
> +struct i915_gem_context *
> +lazy_create_context_locked(struct drm_i915_file_private *file_priv,
> +			   struct i915_gem_proto_context *pc, u32 id)
> +{
> +	struct i915_gem_context *ctx;
> +	void *old;

assert_lock_held is always nice in all _locked functions. It compiles
out entirely without CONFIG_PROVE_LOCKING enabled.

> +
> +	ctx = i915_gem_create_context(file_priv->dev_priv, pc);

I think we need a prep patch which changes the calling convention of this
and anything it calls to only ever return a NULL pointer on failure. Then
i915_gem_context_lookup below can return ERR_PTR(-ENOMEM) for that case,
and we know that we're never returning a wrong error pointer.

> +	if (IS_ERR(ctx))
> +		return ctx;
> +
> +	gem_context_register(ctx, file_priv, id);
> +
> +	old = xa_erase(&file_priv->proto_context_xa, id);
> +	GEM_BUG_ON(old != pc);
> +	proto_context_close(pc);
> +
> +	/* One for the xarray and one for the caller */
> +	return i915_gem_context_get(ctx);
> +}
> +
> +struct i915_gem_context *
> +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> +{
> +	struct i915_gem_proto_context *pc;
> +	struct i915_gem_context *ctx;
> +
> +	ctx = __context_lookup(file_priv, id);
> +	if (ctx)
> +		return ctx;
> +
> +	mutex_lock(&file_priv->proto_context_lock);
> +	/* Try one more time under the lock */
> +	ctx = __context_lookup(file_priv, id);
> +	if (!ctx) {
> +		pc = xa_load(&file_priv->proto_context_xa, id);
> +		if (!pc)
> +			ctx = ERR_PTR(-ENOENT);
> +		else
> +			ctx = lazy_create_context_locked(file_priv, pc, id);
> +	}
> +	mutex_unlock(&file_priv->proto_context_lock);
> +
> +	return ctx;
> +}
> +
>  int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  				  struct drm_file *file)
>  {
>  	struct drm_i915_private *i915 = to_i915(dev);
>  	struct drm_i915_gem_context_create_ext *args = data;
> -	struct i915_gem_proto_context *pc;
>  	struct create_ext ext_data;
>  	int ret;
>  	u32 id;
> @@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  		return -EIO;
>  	}
>  
> -	pc = proto_context_create(i915, args->flags);
> -	if (IS_ERR(pc))
> -		return PTR_ERR(pc);
> -
> -	ext_data.ctx = i915_gem_create_context(i915, pc);
> -	proto_context_close(pc);
> -	if (IS_ERR(ext_data.ctx))
> -		return PTR_ERR(ext_data.ctx);
> +	ext_data.pc = proto_context_create(i915, args->flags);
> +	if (IS_ERR(ext_data.pc))
> +		return PTR_ERR(ext_data.pc);
>  
>  	if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) {
>  		ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
> @@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  					   ARRAY_SIZE(create_extensions),
>  					   &ext_data);
>  		if (ret)
> -			goto err_ctx;
> +			goto err_pc;
>  	}
>  
> -	ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id);
> +	ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id);
>  	if (ret < 0)
> -		goto err_ctx;
> +		goto err_pc;
>  
>  	args->ctx_id = id;
>  	drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id);
>  
>  	return 0;
>  
> -err_ctx:
> -	context_close(ext_data.ctx);
> +err_pc:
> +	proto_context_close(ext_data.pc);
>  	return ret;
>  }
>  
> @@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_gem_context_destroy *args = data;
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
> +	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  
>  	if (args->pad != 0)
> @@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>  	if (!args->ctx_id)
>  		return -ENOENT;
>  
> +	mutex_lock(&file_priv->proto_context_lock);
>  	ctx = xa_erase(&file_priv->context_xa, args->ctx_id);
> -	if (!ctx)
> +	pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id);
> +	mutex_unlock(&file_priv->proto_context_lock);
> +
> +	if (!ctx && !pc)
>  		return -ENOENT;
> +	GEM_WARN_ON(ctx && pc);
> +
> +	if (pc)
> +		proto_context_close(pc);
> +
> +	if (ctx)
> +		context_close(ctx);
>  
> -	context_close(ctx);
>  	return 0;
>  }
>  
> @@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
>  	struct drm_i915_gem_context_param *args = data;
> +	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
> -	int ret;
> +	int ret = 0;
>  
> -	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> -	if (IS_ERR(ctx))
> -		return PTR_ERR(ctx);
> +	ctx = __context_lookup(file_priv, args->ctx_id);
> +	if (ctx)
> +		goto set_ctx_param;
>  
> -	ret = ctx_setparam(file_priv, ctx, args);
> +	mutex_lock(&file_priv->proto_context_lock);
> +	ctx = __context_lookup(file_priv, args->ctx_id);
> +	if (ctx)
> +		goto unlock;
> +
> +	pc = xa_load(&file_priv->proto_context_xa, args->ctx_id);
> +	if (!pc) {
> +		ret = -ENOENT;
> +		goto unlock;
> +	}
> +
> +	ret = set_proto_ctx_param(file_priv, pc, args);

I think we should have a FIXME here about not allowing this on some future
platforms, since userspace can just use CTX_CREATE_EXT instead.

> +	if (ret == -ENOTSUPP) {
> +		/* Some params, specifically SSEU, can only be set on fully

I think this needs a FIXME noting that this only holds during the
conversion? Otherwise we kinda have a bit of a problem, methinks ...


> +		 * created contexts.
> +		 */
> +		ret = 0;
> +		ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id);
> +		if (IS_ERR(ctx)) {
> +			ret = PTR_ERR(ctx);
> +			ctx = NULL;
> +		}
> +	}
> +
> +unlock:
> +	mutex_unlock(&file_priv->proto_context_lock);
> +
> +set_ctx_param:
> +	if (!ret && ctx)
> +		ret = ctx_setparam(file_priv, ctx, args);
> +
> +	if (ctx)
> +		i915_gem_context_put(ctx);
>  
> -	i915_gem_context_put(ctx);
>  	return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> index b5c908f3f4f22..20411db84914a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> @@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>  				       struct drm_file *file);
>  
> +struct i915_gem_context *
> +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
> +
>  static inline struct i915_gem_context *
>  i915_gem_context_get(struct i915_gem_context *ctx)
>  {
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index a42c429f94577..067ea3030ac91 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -46,6 +46,26 @@ struct i915_gem_engines_iter {
>  	const struct i915_gem_engines *engines;
>  };
>  
> +enum i915_gem_engine_type {
> +	I915_GEM_ENGINE_TYPE_INVALID = 0,
> +	I915_GEM_ENGINE_TYPE_PHYSICAL,
> +	I915_GEM_ENGINE_TYPE_BALANCED,
> +};
> +

Some kerneldoc missing?

> +struct i915_gem_proto_engine {
> +	/** @type: Type of this engine */
> +	enum i915_gem_engine_type type;
> +
> +	/** @num_siblings: Engine, for physical */
> +	struct intel_engine_cs *engine;
> +
> +	/** @num_siblings: Number of balanced siblings */
> +	unsigned int num_siblings;
> +
> +	/** @num_siblings: Balanced siblings */
> +	struct intel_engine_cs **siblings;

I guess you're stuffing both the physical engine and the balanced siblings into one struct?
> +};
> +
>  /**
>   * struct i915_gem_proto_context - prototype context
>   *
> @@ -64,6 +84,12 @@ struct i915_gem_proto_context {
>  	/** @sched: See i915_gem_context::sched */
>  	struct i915_sched_attr sched;
>  
> +	/** @num_user_engines: Number of user-specified engines or -1 */
> +	int num_user_engines;
> +
> +	/** @num_user_engines: User-specified engines */
> +	struct i915_gem_proto_engine *user_engines;
> +
>  	bool single_timeline;
>  };
>  
> diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> index e0f512ef7f3c6..32cf2103828f9 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> @@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915)
>  struct i915_gem_context *
>  live_context(struct drm_i915_private *i915, struct file *file)
>  {
> +	struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv;
>  	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  	int err;
> @@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file)
>  
>  	i915_gem_context_set_no_error_capture(ctx);
>  
> -	err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id);
> +	err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL);
>  	if (err < 0)
>  		goto err_ctx;
>  
> +	gem_context_register(ctx, fpriv, id);
> +
>  	return ctx;
>  
>  err_ctx:
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 004ed0e59c999..365c042529d72 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -200,6 +200,9 @@ struct drm_i915_file_private {
>  		struct rcu_head rcu;
>  	};
>  
> +	struct mutex proto_context_lock;
> +	struct xarray proto_context_xa;

Kerneldoc here please. Ideally also for the context_xa below (but maybe
that's for later).

Also please add a hint to the proto context struct that it's all fully
protected by proto_context_lock above and is never visible outside of
that.

> +
>  	struct xarray context_xa;
>  	struct xarray vm_xa;
>  
> @@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
>  
>  struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
>  
> -static inline struct i915_gem_context *
> -i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> -{
> -	struct i915_gem_context *ctx;
> -
> -	rcu_read_lock();
> -	ctx = xa_load(&file_priv->context_xa, id);
> -	if (ctx && !kref_get_unless_zero(&ctx->ref))
> -		ctx = NULL;
> -	rcu_read_unlock();
> -
> -	return ctx ? ctx : ERR_PTR(-ENOENT);
> -}
> -
>  /* i915_gem_evict.c */
>  int __must_check i915_gem_evict_something(struct i915_address_space *vm,
>  					  u64 min_size, u64 alignment,

I think I'll check details when I'm not getting distracted by the
vm/engines validation code that I think shouldn't be here :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
@ 2021-04-29 15:51     ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 15:51 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

Yeah this needs some text to explain what/why you're doing this, and maybe
some rough sketch of the locking design.

On Fri, Apr 23, 2021 at 05:31:26PM -0500, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 657 ++++++++++++++++--
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
>  .../gpu/drm/i915/gem/i915_gem_context_types.h |  26 +
>  .../gpu/drm/i915/gem/selftests/mock_context.c |   5 +-
>  drivers/gpu/drm/i915/i915_drv.h               |  17 +-
>  5 files changed, 648 insertions(+), 60 deletions(-)

So I think the patch split here is a bit unfortunate, because you're
adding the new vm/engine validation code for proto context here, but the
old stuff is only removed in the next patches that make vm/engines
immutable after first use.

I think a better split would be if this patch here only has all the
scaffolding. You already have the EOPNOTSUPP fallback (which I hope gets
removed), so moving the conversion entirely to later patches should be all
fine.

Or do I miss something?

I think the only concern I'm seeing is that bisectability might be a bit
lost, because we finalize the context in some cases in setparam. And if we
do the conversion in a different order than the one media uses for its
setparam, then later setparam might fail because the context is finalized
already. But also
- it's just bisectability of media functionality I think
- just check which order media calls CTX_SETPARAM and use that to do the
  conversion

And we should be fine ... I think?

Some more thoughts below, but the proto ctx stuff itself looks fine.

> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index db9153e0f85a7..aa8e61211924f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915,
>  
>  static void proto_context_close(struct i915_gem_proto_context *pc)
>  {
> +	int i;
> +
>  	if (pc->vm)
>  		i915_vm_put(pc->vm);
> +	if (pc->user_engines) {
> +		for (i = 0; i < pc->num_user_engines; i++)
> +			kfree(pc->user_engines[i].siblings);
> +		kfree(pc->user_engines);
> +	}
>  	kfree(pc);
>  }
>  
> @@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
>  	proto_context_set_persistence(i915, pc, true);
>  	pc->sched.priority = I915_PRIORITY_NORMAL;
>  
> +	pc->num_user_engines = -1;
> +	pc->user_engines = NULL;
> +
>  	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
>  		pc->single_timeline = true;
>  
>  	return pc;
>  }
>  
> +static int proto_context_register_locked(struct drm_i915_file_private *fpriv,
> +					 struct i915_gem_proto_context *pc,
> +					 u32 *id)
> +{
> +	int ret;
> +	void *old;

assert_lock_held just for consistency.

> +
> +	ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL);
> +	if (ret)
> +		return ret;
> +
> +	old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL);
> +	if (xa_is_err(old)) {
> +		xa_erase(&fpriv->context_xa, *id);
> +		return xa_err(old);
> +	}
> +	GEM_BUG_ON(old);
> +
> +	return 0;
> +}
> +
> +static int proto_context_register(struct drm_i915_file_private *fpriv,
> +				  struct i915_gem_proto_context *pc,
> +				  u32 *id)
> +{
> +	int ret;
> +
> +	mutex_lock(&fpriv->proto_context_lock);
> +	ret = proto_context_register_locked(fpriv, pc, id);
> +	mutex_unlock(&fpriv->proto_context_lock);
> +
> +	return ret;
> +}
> +
> +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
> +			    struct i915_gem_proto_context *pc,
> +			    const struct drm_i915_gem_context_param *args)
> +{
> +	struct i915_address_space *vm;
> +
> +	if (args->size)
> +		return -EINVAL;
> +
> +	if (!pc->vm)
> +		return -ENODEV;
> +
> +	if (upper_32_bits(args->value))
> +		return -ENOENT;
> +
> +	rcu_read_lock();
> +	vm = xa_load(&fpriv->vm_xa, args->value);
> +	if (vm && !kref_get_unless_zero(&vm->ref))
> +		vm = NULL;
> +	rcu_read_unlock();
> +	if (!vm)
> +		return -ENOENT;
> +
> +	i915_vm_put(pc->vm);
> +	pc->vm = vm;
> +
> +	return 0;
> +}
> +
> +struct set_proto_ctx_engines {
> +	struct drm_i915_private *i915;
> +	unsigned num_engines;
> +	struct i915_gem_proto_engine *engines;
> +};
> +
> +static int
> +set_proto_ctx_engines_balance(struct i915_user_extension __user *base,
> +			      void *data)
> +{
> +	struct i915_context_engines_load_balance __user *ext =
> +		container_of_user(base, typeof(*ext), base);
> +	const struct set_proto_ctx_engines *set = data;
> +	struct drm_i915_private *i915 = set->i915;
> +	struct intel_engine_cs **siblings;
> +	u16 num_siblings, idx;
> +	unsigned int n;
> +	int err;
> +
> +	if (!HAS_EXECLISTS(i915))
> +		return -ENODEV;
> +
> +	if (intel_uc_uses_guc_submission(&i915->gt.uc))
> +		return -ENODEV; /* not implement yet */
> +
> +	if (get_user(idx, &ext->engine_index))
> +		return -EFAULT;
> +
> +	if (idx >= set->num_engines) {
> +		drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
> +			idx, set->num_engines);
> +		return -EINVAL;
> +	}
> +
> +	idx = array_index_nospec(idx, set->num_engines);
> +	if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) {
> +		drm_dbg(&i915->drm,
> +			"Invalid placement[%d], already occupied\n", idx);
> +		return -EEXIST;
> +	}
> +
> +	if (get_user(num_siblings, &ext->num_siblings))
> +		return -EFAULT;
> +
> +	err = check_user_mbz(&ext->flags);
> +	if (err)
> +		return err;
> +
> +	err = check_user_mbz(&ext->mbz64);
> +	if (err)
> +		return err;
> +
> +	if (num_siblings == 0)
> +		return 0;
> +
> +	siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL);
> +	if (!siblings)
> +		return -ENOMEM;
> +
> +	for (n = 0; n < num_siblings; n++) {
> +		struct i915_engine_class_instance ci;
> +
> +		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
> +			err = -EFAULT;
> +			goto err_siblings;
> +		}
> +
> +		siblings[n] = intel_engine_lookup_user(i915,
> +						       ci.engine_class,
> +						       ci.engine_instance);
> +		if (!siblings[n]) {
> +			drm_dbg(&i915->drm,
> +				"Invalid sibling[%d]: { class:%d, inst:%d }\n",
> +				n, ci.engine_class, ci.engine_instance);
> +			err = -EINVAL;
> +			goto err_siblings;
> +		}
> +	}
> +
> +	if (num_siblings == 1) {
> +		set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> +		set->engines[idx].engine = siblings[0];
> +		kfree(siblings);
> +	} else {
> +		set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED;
> +		set->engines[idx].num_siblings = num_siblings;
> +		set->engines[idx].siblings = siblings;
> +	}
> +
> +	return 0;
> +
> +err_siblings:
> +	kfree(siblings);
> +
> +	return err;
> +}
> +
> +static int
> +set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data)
> +{
> +	struct i915_context_engines_bond __user *ext =
> +		container_of_user(base, typeof(*ext), base);
> +	const struct set_proto_ctx_engines *set = data;
> +	struct drm_i915_private *i915 = set->i915;
> +	struct i915_engine_class_instance ci;
> +	struct intel_engine_cs *master;
> +	u16 idx, num_bonds;
> +	int err, n;
> +
> +	if (get_user(idx, &ext->virtual_index))
> +		return -EFAULT;
> +
> +	if (idx >= set->num_engines) {
> +		drm_dbg(&i915->drm,
> +			"Invalid index for virtual engine: %d >= %d\n",
> +			idx, set->num_engines);
> +		return -EINVAL;
> +	}
> +
> +	idx = array_index_nospec(idx, set->num_engines);
> +	if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) {
> +		drm_dbg(&i915->drm, "Invalid engine at %d\n", idx);
> +		return -EINVAL;
> +	}
> +
> +	if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) {
> +		drm_dbg(&i915->drm,
> +			"Bonding with virtual engines not allowed\n");
> +		return -EINVAL;
> +	}
> +
> +	err = check_user_mbz(&ext->flags);
> +	if (err)
> +		return err;
> +
> +	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
> +		err = check_user_mbz(&ext->mbz64[n]);
> +		if (err)
> +			return err;
> +	}
> +
> +	if (copy_from_user(&ci, &ext->master, sizeof(ci)))
> +		return -EFAULT;
> +
> +	master = intel_engine_lookup_user(i915,
> +					  ci.engine_class,
> +					  ci.engine_instance);
> +	if (!master) {
> +		drm_dbg(&i915->drm,
> +			"Unrecognised master engine: { class:%u, instance:%u }\n",
> +			ci.engine_class, ci.engine_instance);
> +		return -EINVAL;
> +	}
> +
> +	if (get_user(num_bonds, &ext->num_bonds))
> +		return -EFAULT;
> +
> +	for (n = 0; n < num_bonds; n++) {
> +		struct intel_engine_cs *bond;
> +
> +		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci)))
> +			return -EFAULT;
> +
> +		bond = intel_engine_lookup_user(i915,
> +						ci.engine_class,
> +						ci.engine_instance);
> +		if (!bond) {
> +			drm_dbg(&i915->drm,
> +				"Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n",
> +				n, ci.engine_class, ci.engine_instance);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = {
> +	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance,
> +	[I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond,
> +};
> +
> +static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv,
> +			         struct i915_gem_proto_context *pc,
> +			         const struct drm_i915_gem_context_param *args)
> +{
> +	struct drm_i915_private *i915 = fpriv->dev_priv;
> +	struct set_proto_ctx_engines set = { .i915 = i915 };
> +	struct i915_context_param_engines __user *user =
> +		u64_to_user_ptr(args->value);
> +	unsigned int n;
> +	u64 extensions;
> +	int err;
> +
> +	if (!args->size) {
> +		kfree(pc->user_engines);
> +		pc->num_user_engines = -1;
> +		pc->user_engines = NULL;
> +		return 0;
> +	}
> +
> +	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
> +	if (args->size < sizeof(*user) ||
> +	    !IS_ALIGNED(args->size, sizeof(*user->engines))) {
> +		drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
> +			args->size);
> +		return -EINVAL;
> +	}
> +
> +	set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> +	if (set.num_engines > I915_EXEC_RING_MASK + 1)
> +		return -EINVAL;
> +
> +	set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL);
> +	if (!set.engines)
> +		return -ENOMEM;
> +
> +	for (n = 0; n < set.num_engines; n++) {
> +		struct i915_engine_class_instance ci;
> +		struct intel_engine_cs *engine;
> +
> +		if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
> +			kfree(set.engines);
> +			return -EFAULT;
> +		}
> +
> +		memset(&set.engines[n], 0, sizeof(set.engines[n]));
> +
> +		if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
> +		    ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE)
> +			continue;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  ci.engine_class,
> +						  ci.engine_instance);
> +		if (!engine) {
> +			drm_dbg(&i915->drm,
> +				"Invalid engine[%d]: { class:%d, instance:%d }\n",
> +				n, ci.engine_class, ci.engine_instance);
> +			kfree(set.engines);
> +			return -ENOENT;
> +		}
> +
> +		set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> +		set.engines[n].engine = engine;
> +	}
> +
> +	err = -EFAULT;
> +	if (!get_user(extensions, &user->extensions))
> +		err = i915_user_extensions(u64_to_user_ptr(extensions),
> +					   set_proto_ctx_engines_extensions,
> +					   ARRAY_SIZE(set_proto_ctx_engines_extensions),
> +					   &set);
> +	if (err) {
> +		kfree(set.engines);
> +		return err;
> +	}
> +
> +	kfree(pc->user_engines);
> +	pc->num_user_engines = set.num_engines;
> +	pc->user_engines = set.engines;
> +
> +	return 0;
> +}
> +
> +static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
> +			       struct i915_gem_proto_context *pc,
> +			       struct drm_i915_gem_context_param *args)
> +{
> +	int ret = 0;
> +
> +	switch (args->param) {
> +	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (args->value)
> +			set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);

Atomic bitops like in previous patches: Pls no :-)

> +		else
> +			clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_BANNABLE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (!capable(CAP_SYS_ADMIN) && !args->value)
> +			ret = -EPERM;
> +		else if (args->value)
> +			set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> +		else
> +			clear_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_RECOVERABLE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (args->value)
> +			set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> +		else
> +			clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_PRIORITY:
> +		ret = validate_priority(fpriv->dev_priv, args);
> +		if (!ret)
> +			pc->sched.priority = args->value;
> +		break;
> +
> +	case I915_CONTEXT_PARAM_SSEU:
> +		ret = -ENOTSUPP;
> +		break;
> +
> +	case I915_CONTEXT_PARAM_VM:
> +		ret = set_proto_ctx_vm(fpriv, pc, args);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_ENGINES:
> +		ret = set_proto_ctx_engines(fpriv, pc, args);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_PERSISTENCE:
> +		if (args->size)
> +			ret = -EINVAL;
> +		else if (args->value)
> +			set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> +		else
> +			clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> +		break;
> +
> +	case I915_CONTEXT_PARAM_NO_ZEROMAP:
> +	case I915_CONTEXT_PARAM_BAN_PERIOD:
> +	case I915_CONTEXT_PARAM_RINGSIZE:
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  static struct i915_address_space *
>  context_get_vm_rcu(struct i915_gem_context *ctx)
>  {
> @@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
>  	return e;
>  }
>  
> +static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
> +					     unsigned int num_engines,
> +					     struct i915_gem_proto_engine *pe)
> +{
> +	struct i915_gem_engines *e;
> +	unsigned int n;
> +
> +	e = alloc_engines(num_engines);
> +	for (n = 0; n < num_engines; n++) {
> +		struct intel_context *ce;
> +
> +		switch (pe[n].type) {
> +		case I915_GEM_ENGINE_TYPE_PHYSICAL:
> +			ce = intel_context_create(pe[n].engine);
> +			break;
> +
> +		case I915_GEM_ENGINE_TYPE_BALANCED:
> +			ce = intel_execlists_create_virtual(pe[n].siblings,
> +							    pe[n].num_siblings);
> +			break;
> +
> +		case I915_GEM_ENGINE_TYPE_INVALID:
> +		default:
> +			GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID);
> +			continue;
> +		}
> +
> +		if (IS_ERR(ce)) {
> +			__free_engines(e, n);
> +			return ERR_CAST(ce);
> +		}
> +
> +		intel_context_set_gem(ce, ctx);
> +
> +		e->engines[n] = ce;
> +	}
> +	e->num_engines = num_engines;
> +
> +	return e;
> +}
> +
>  void i915_gem_context_release(struct kref *ref)
>  {
>  	struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
> @@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915,
>  		mutex_unlock(&ctx->mutex);
>  	}
>  
> +	if (pc->num_user_engines >= 0) {
> +		struct i915_gem_engines *engines;
> +
> +		engines = user_engines(ctx, pc->num_user_engines,
> +				       pc->user_engines);
> +		if (IS_ERR(engines)) {
> +			context_close(ctx);
> +			return ERR_CAST(engines);
> +		}
> +
> +		mutex_lock(&ctx->engines_mutex);
> +		i915_gem_context_set_user_engines(ctx);
> +		engines = rcu_replace_pointer(ctx->engines, engines, 1);
> +		mutex_unlock(&ctx->engines_mutex);
> +
> +		free_engines(engines);
> +	}
> +
>  	if (pc->single_timeline) {
>  		ret = drm_syncobj_create(&ctx->syncobj,
>  					 DRM_SYNCOBJ_CREATE_SIGNALED,
> @@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915)
>  	init_contexts(&i915->gem.contexts);
>  }
>  
> -static int gem_context_register(struct i915_gem_context *ctx,
> -				struct drm_i915_file_private *fpriv,
> -				u32 *id)
> +static void gem_context_register(struct i915_gem_context *ctx,
> +				 struct drm_i915_file_private *fpriv,
> +				 u32 id)
>  {
>  	struct drm_i915_private *i915 = ctx->i915;
> -	int ret;
> +	void *old;
>  
>  	ctx->file_priv = fpriv;
>  
> @@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx,
>  		 current->comm, pid_nr(ctx->pid));
>  
>  	/* And finally expose ourselves to userspace via the idr */
> -	ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
> -	if (ret)
> -		goto err_pid;
> +	old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
> +	GEM_BUG_ON(old);
>  
>  	spin_lock(&i915->gem.contexts.lock);
>  	list_add_tail(&ctx->link, &i915->gem.contexts.list);
>  	spin_unlock(&i915->gem.contexts.lock);
> -
> -	return 0;
> -
> -err_pid:
> -	put_pid(fetch_and_zero(&ctx->pid));
> -	return ret;
>  }
>  
>  int i915_gem_context_open(struct drm_i915_private *i915,
> @@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>  	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  	int err;
> -	u32 id;
>  
> -	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC);
> +	mutex_init(&file_priv->proto_context_lock);
> +	xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC);
> +
> +	/* 0 reserved for the default context */
> +	xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1);
>  
>  	/* 0 reserved for invalid/unassigned ppgtt */
>  	xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
> @@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>  		goto err;
>  	}
>  
> -	err = gem_context_register(ctx, file_priv, &id);
> -	if (err < 0)
> -		goto err_ctx;
> +	gem_context_register(ctx, file_priv, 0);
>  
> -	GEM_BUG_ON(id);
>  	return 0;
>  
> -err_ctx:
> -	context_close(ctx);
>  err:
>  	xa_destroy(&file_priv->vm_xa);
>  	xa_destroy(&file_priv->context_xa);
> +	xa_destroy(&file_priv->proto_context_xa);
> +	mutex_destroy(&file_priv->proto_context_lock);
>  	return err;
>  }
>  
>  void i915_gem_context_close(struct drm_file *file)
>  {
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
> +	struct i915_gem_proto_context *pc;
>  	struct i915_address_space *vm;
>  	struct i915_gem_context *ctx;
>  	unsigned long idx;
>  
> +	xa_for_each(&file_priv->proto_context_xa, idx, pc)
> +		proto_context_close(pc);
> +	xa_destroy(&file_priv->proto_context_xa);
> +	mutex_destroy(&file_priv->proto_context_lock);
> +
>  	xa_for_each(&file_priv->context_xa, idx, ctx)
>  		context_close(ctx);
>  	xa_destroy(&file_priv->context_xa);
> @@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
>  }
>  
>  struct create_ext {
> -	struct i915_gem_context *ctx;
> +	struct i915_gem_proto_context *pc;
>  	struct drm_i915_file_private *fpriv;
>  };
>  
> @@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
>  	if (local.param.ctx_id)
>  		return -EINVAL;
>  
> -	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
> +	return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param);
>  }
>  
>  static int invalid_ext(struct i915_user_extension __user *ext, void *data)
> @@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
>  	return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
>  }
>  
> +static inline struct i915_gem_context *
> +__context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> +{
> +	struct i915_gem_context *ctx;
> +
> +	rcu_read_lock();
> +	ctx = xa_load(&file_priv->context_xa, id);
> +	if (ctx && !kref_get_unless_zero(&ctx->ref))
> +		ctx = NULL;
> +	rcu_read_unlock();
> +
> +	return ctx;
> +}
> +
> +struct i915_gem_context *
> +lazy_create_context_locked(struct drm_i915_file_private *file_priv,
> +			   struct i915_gem_proto_context *pc, u32 id)
> +{
> +	struct i915_gem_context *ctx;
> +	void *old;

assert_lock_held is always nice in all _locked functions. It compiles
out entirely without CONFIG_PROVE_LOCKING enabled.

> +
> +	ctx = i915_gem_create_context(file_priv->dev_priv, pc);

I think we need a prep patch which changes the calling convention of this
and anything it calls to only ever return a NULL pointer on failure. Then
i915_gem_context_lookup below can return ERR_PTR(-ENOMEM) for that case,
and we know that we're never returning a wrong error pointer.

> +	if (IS_ERR(ctx))
> +		return ctx;
> +
> +	gem_context_register(ctx, file_priv, id);
> +
> +	old = xa_erase(&file_priv->proto_context_xa, id);
> +	GEM_BUG_ON(old != pc);
> +	proto_context_close(pc);
> +
> +	/* One for the xarray and one for the caller */
> +	return i915_gem_context_get(ctx);
> +}
> +
> +struct i915_gem_context *
> +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> +{
> +	struct i915_gem_proto_context *pc;
> +	struct i915_gem_context *ctx;
> +
> +	ctx = __context_lookup(file_priv, id);
> +	if (ctx)
> +		return ctx;
> +
> +	mutex_lock(&file_priv->proto_context_lock);
> +	/* Try one more time under the lock */
> +	ctx = __context_lookup(file_priv, id);
> +	if (!ctx) {
> +		pc = xa_load(&file_priv->proto_context_xa, id);
> +		if (!pc)
> +			ctx = ERR_PTR(-ENOENT);
> +		else
> +			ctx = lazy_create_context_locked(file_priv, pc, id);
> +	}
> +	mutex_unlock(&file_priv->proto_context_lock);
> +
> +	return ctx;
> +}
> +
>  int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  				  struct drm_file *file)
>  {
>  	struct drm_i915_private *i915 = to_i915(dev);
>  	struct drm_i915_gem_context_create_ext *args = data;
> -	struct i915_gem_proto_context *pc;
>  	struct create_ext ext_data;
>  	int ret;
>  	u32 id;
> @@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  		return -EIO;
>  	}
>  
> -	pc = proto_context_create(i915, args->flags);
> -	if (IS_ERR(pc))
> -		return PTR_ERR(pc);
> -
> -	ext_data.ctx = i915_gem_create_context(i915, pc);
> -	proto_context_close(pc);
> -	if (IS_ERR(ext_data.ctx))
> -		return PTR_ERR(ext_data.ctx);
> +	ext_data.pc = proto_context_create(i915, args->flags);
> +	if (IS_ERR(ext_data.pc))
> +		return PTR_ERR(ext_data.pc);
>  
>  	if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) {
>  		ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
> @@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>  					   ARRAY_SIZE(create_extensions),
>  					   &ext_data);
>  		if (ret)
> -			goto err_ctx;
> +			goto err_pc;
>  	}
>  
> -	ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id);
> +	ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id);
>  	if (ret < 0)
> -		goto err_ctx;
> +		goto err_pc;
>  
>  	args->ctx_id = id;
>  	drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id);
>  
>  	return 0;
>  
> -err_ctx:
> -	context_close(ext_data.ctx);
> +err_pc:
> +	proto_context_close(ext_data.pc);
>  	return ret;
>  }
>  
> @@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_gem_context_destroy *args = data;
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
> +	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  
>  	if (args->pad != 0)
> @@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>  	if (!args->ctx_id)
>  		return -ENOENT;
>  
> +	mutex_lock(&file_priv->proto_context_lock);
>  	ctx = xa_erase(&file_priv->context_xa, args->ctx_id);
> -	if (!ctx)
> +	pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id);
> +	mutex_unlock(&file_priv->proto_context_lock);
> +
> +	if (!ctx && !pc)
>  		return -ENOENT;
> +	GEM_WARN_ON(ctx && pc);
> +
> +	if (pc)
> +		proto_context_close(pc);
> +
> +	if (ctx)
> +		context_close(ctx);
>  
> -	context_close(ctx);
>  	return 0;
>  }
>  
> @@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
>  	struct drm_i915_gem_context_param *args = data;
> +	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
> -	int ret;
> +	int ret = 0;
>  
> -	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> -	if (IS_ERR(ctx))
> -		return PTR_ERR(ctx);
> +	ctx = __context_lookup(file_priv, args->ctx_id);
> +	if (ctx)
> +		goto set_ctx_param;
>  
> -	ret = ctx_setparam(file_priv, ctx, args);
> +	mutex_lock(&file_priv->proto_context_lock);
> +	ctx = __context_lookup(file_priv, args->ctx_id);
> +	if (ctx)
> +		goto unlock;
> +
> +	pc = xa_load(&file_priv->proto_context_xa, args->ctx_id);
> +	if (!pc) {
> +		ret = -ENOENT;
> +		goto unlock;
> +	}
> +
> +	ret = set_proto_ctx_param(file_priv, pc, args);

I think we should have a FIXME here about not allowing this on some
future platforms, since userspace should just use CTX_CREATE_EXT
instead.

> +	if (ret == -ENOTSUPP) {
> +		/* Some params, specifically SSEU, can only be set on fully

I think this needs a FIXME that this only holds during the conversion?
Otherwise we kinda have a bit of a problem, methinks ...


> +		 * created contexts.
> +		 */
> +		ret = 0;
> +		ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id);
> +		if (IS_ERR(ctx)) {
> +			ret = PTR_ERR(ctx);
> +			ctx = NULL;
> +		}
> +	}
> +
> +unlock:
> +	mutex_unlock(&file_priv->proto_context_lock);
> +
> +set_ctx_param:
> +	if (!ret && ctx)
> +		ret = ctx_setparam(file_priv, ctx, args);
> +
> +	if (ctx)
> +		i915_gem_context_put(ctx);
>  
> -	i915_gem_context_put(ctx);
>  	return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> index b5c908f3f4f22..20411db84914a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> @@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>  				       struct drm_file *file);
>  
> +struct i915_gem_context *
> +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
> +
>  static inline struct i915_gem_context *
>  i915_gem_context_get(struct i915_gem_context *ctx)
>  {
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index a42c429f94577..067ea3030ac91 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -46,6 +46,26 @@ struct i915_gem_engines_iter {
>  	const struct i915_gem_engines *engines;
>  };
>  
> +enum i915_gem_engine_type {
> +	I915_GEM_ENGINE_TYPE_INVALID = 0,
> +	I915_GEM_ENGINE_TYPE_PHYSICAL,
> +	I915_GEM_ENGINE_TYPE_BALANCED,
> +};
> +

Some kerneldoc missing?

> +struct i915_gem_proto_engine {
> +	/** @type: Type of this engine */
> +	enum i915_gem_engine_type type;
> +
> +	/** @engine: Engine, for physical */
> +	struct intel_engine_cs *engine;
> +
> +	/** @num_siblings: Number of balanced siblings */
> +	unsigned int num_siblings;
> +
> +	/** @siblings: Balanced siblings */
> +	struct intel_engine_cs **siblings;

I guess you're stuffing both balanced and siblings into one?
> +};
> +
>  /**
>   * struct i915_gem_proto_context - prototype context
>   *
> @@ -64,6 +84,12 @@ struct i915_gem_proto_context {
>  	/** @sched: See i915_gem_context::sched */
>  	struct i915_sched_attr sched;
>  
> +	/** @num_user_engines: Number of user-specified engines or -1 */
> +	int num_user_engines;
> +
> +	/** @user_engines: User-specified engines */
> +	struct i915_gem_proto_engine *user_engines;
> +
>  	bool single_timeline;
>  };
>  
> diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> index e0f512ef7f3c6..32cf2103828f9 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> @@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915)
>  struct i915_gem_context *
>  live_context(struct drm_i915_private *i915, struct file *file)
>  {
> +	struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv;
>  	struct i915_gem_proto_context *pc;
>  	struct i915_gem_context *ctx;
>  	int err;
> @@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file)
>  
>  	i915_gem_context_set_no_error_capture(ctx);
>  
> -	err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id);
> +	err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL);
>  	if (err < 0)
>  		goto err_ctx;
>  
> +	gem_context_register(ctx, fpriv, id);
> +
>  	return ctx;
>  
>  err_ctx:
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 004ed0e59c999..365c042529d72 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -200,6 +200,9 @@ struct drm_i915_file_private {
>  		struct rcu_head rcu;
>  	};
>  
> +	struct mutex proto_context_lock;
> +	struct xarray proto_context_xa;

Kerneldoc here please. Ideally also for the context_xa below (but maybe
that's for later).

Also please add a hint to the proto context struct that it's all fully
protected by proto_context_lock above and is never visible outside of
that.

> +
>  	struct xarray context_xa;
>  	struct xarray vm_xa;
>  
> @@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
>  
>  struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
>  
> -static inline struct i915_gem_context *
> -i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> -{
> -	struct i915_gem_context *ctx;
> -
> -	rcu_read_lock();
> -	ctx = xa_load(&file_priv->context_xa, id);
> -	if (ctx && !kref_get_unless_zero(&ctx->ref))
> -		ctx = NULL;
> -	rcu_read_unlock();
> -
> -	return ctx ? ctx : ERR_PTR(-ENOENT);
> -}
> -
>  /* i915_gem_evict.c */
>  int __must_check i915_gem_evict_something(struct i915_address_space *vm,
>  					  u64 min_size, u64 alignment,

I think I'll check details when I'm not getting distracted by the
vm/engines validation code that I think shouldn't be here :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-29 12:16             ` [Intel-gfx] " Daniel Vetter
@ 2021-04-29 16:02               ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 16:02 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Matthew Brost, Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 7:16 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Apr 28, 2021 at 01:58:17PM -0500, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 12:18 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > > much code.  This function in particular, has to stay, unfortunately.
> > > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > > the work onto a different engine than than the one it's supposed to
> > > > > run in parallel with.  This means we can't dead-code this function or
> > > > > the bond_execution function pointer and related stuff.
> > > >
> > > > Uh that's disappointing, since if I understand your point correctly, the
> > > > sibling engines should all be singletons, not load balancing virtual ones.
> > > > So there really should not be any need to pick the right one at execution
> > > > time.
> > >
> > > The media driver itself seems to work fine if I delete all the code.
> > > It's just an IGT testcase that blows up.  I'll do more digging to see
> > > if I can better isolate why.
> >
> > I did more digging and I figured out why this test hangs.  The test
> > looks at an engine class where there's more than one of that class
> > (currently only vcs) and creates a context where engine[0] is all of
> > the engines of that class bonded together and engine[1-N] is each of
> > those engines individually.  It then tests that you can submit a batch
> > to one of the individual engines and then submit with
> > EXEC_FENCE_SUBMIT to the balanced engine and the kernel will sort it
> > out.  This doesn't seem like a use-case we care about.
> >
> > If we cared about anything, I would expect it to be submitting to two
> > balanced contexts and expecting "pick any two" behavior.  But that's
> > not what the test is testing for.
>
> Yeah ditch it.
>
> Instead make sure that the bonded setparam/ctx validation makes sure that
> 1) no virtual engines are used
> 2) no engine used twice
> 3) anything else stupid you can come up with that we should make sure is
> blocked.

I've re-introduced the deletion and I'll add nuking that test to my
IGT series.  I did it as a separate patch as the FENCE_SUBMIT logic
and the bonding are somewhat separate concerns.

As far as validation goes, I don't think we need any more for this
case.  You used FENCE_SUBMIT and didn't properly isolate things such
that the two run on different engines.  Not our problem.

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [Intel-gfx] [PATCH 13/21] drm/i915/gem: Add an intermediate proto_context struct
  2021-04-29 13:02     ` Daniel Vetter
@ 2021-04-29 16:44       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 16:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 8:02 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> The commit introducing a new data structure really should have a solid
> intro in the commit message about why it exists. Please cover:
>
> - that ctx really should be immutable, safe for exceptions like priority
>
> - that unfortunately we butchered the uapi with setparam and sharing
>   setparams between create_ext and setparam
>
> - and how exactly proto ctx fixes this (with stuff like locking design
>   used)
>
> Maybe also dupe the kerneldoc into here for completeness.
> On Fri, Apr 23, 2021 at 05:31:23PM -0500, Jason Ekstrand wrote:
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 143 ++++++++++++++----
> >  .../gpu/drm/i915/gem/i915_gem_context_types.h |  21 +++
> >  .../gpu/drm/i915/gem/selftests/mock_context.c |  16 +-
>
> I'm wondering whether in the end we should split out the proto_ctx into
> its own file, with the struct private only to itself. But I guess
> impossible during the transition, and also maybe afterwards?
>
> >  3 files changed, 150 insertions(+), 30 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index e5efd22c89ba2..3e883daab93bf 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -191,6 +191,95 @@ static int validate_priority(struct drm_i915_private *i915,
> >       return 0;
> >  }
> >
> > +static void proto_context_close(struct i915_gem_proto_context *pc)
> > +{
> > +     if (pc->vm)
> > +             i915_vm_put(pc->vm);
> > +     kfree(pc);
> > +}
> > +
> > +static int proto_context_set_persistence(struct drm_i915_private *i915,
> > +                                      struct i915_gem_proto_context *pc,
> > +                                      bool persist)
> > +{
> > +     if (test_bit(UCONTEXT_PERSISTENCE, &pc->user_flags) == persist)
> > +             return 0;
>
> We have compilers to optimize this kind of stuff, pls remove :-)
> Especially with the non-atomic bitops there's no point.

I thought at one point that this did have a purpose.  However, now
that I look at it harder, I'm pretty sure it doesn't.  Will drop.

> > +
> > +     if (persist) {
> > +             /*
> > +              * Only contexts that are short-lived [that will expire or be
> > +              * reset] are allowed to survive past termination. We require
> > +              * hangcheck to ensure that the persistent requests are healthy.
> > +              */
> > +             if (!i915->params.enable_hangcheck)
> > +                     return -EINVAL;
> > +
> > +             set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
>
> It's a bit entertaining, but the bitops in the kernel are atomic. Which is
> hella confusing here.
>
> I think open coding is the standard for truly normal bitops.

There's __set_bit if you'd rather I use that.

> > +     } else {
> > +             /* To cancel a context we use "preempt-to-idle" */
> > +             if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
> > +                     return -ENODEV;
> > +
> > +             /*
> > +              * If the cancel fails, we then need to reset, cleanly!
> > +              *
> > +              * If the per-engine reset fails, all hope is lost! We resort
> > +              * to a full GPU reset in that unlikely case, but realistically
> > +              * if the engine could not reset, the full reset does not fare
> > +              * much better. The damage has been done.
> > +              *
> > +              * However, if we cannot reset an engine by itself, we cannot
> > +              * cleanup a hanging persistent context without causing
> > +              * colateral damage, and we should not pretend we can by
> > +              * exposing the interface.
> > +              */
> > +             if (!intel_has_reset_engine(&i915->gt))
> > +                     return -ENODEV;
> > +
> > +             clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
>
> Same here.
>
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static struct i915_gem_proto_context *
> > +proto_context_create(struct drm_i915_private *i915, unsigned int flags)
> > +{
> > +     struct i915_gem_proto_context *pc;
> > +
> > +     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> > +         !HAS_EXECLISTS(i915))
> > +             return ERR_PTR(-EINVAL);
> > +
> > +     pc = kzalloc(sizeof(*pc), GFP_KERNEL);
> > +     if (!pc)
> > +             return ERR_PTR(-ENOMEM);
> > +
> > +     if (HAS_FULL_PPGTT(i915)) {
> > +             struct i915_ppgtt *ppgtt;
> > +
> > +             ppgtt = i915_ppgtt_create(&i915->gt);
> > +             if (IS_ERR(ppgtt)) {
> > +                     drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
> > +                             PTR_ERR(ppgtt));
> > +                     proto_context_close(pc);
> > +                     return ERR_CAST(ppgtt);
> > +             }
> > +             pc->vm = &ppgtt->vm;
> > +     }
> > +
> > +     pc->user_flags = 0;
> > +     set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> > +     set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
>
> Same about atomic bitops here.

Changed to just initialize to BANNABLE | RECOVERABLE.

> > +     proto_context_set_persistence(i915, pc, true);
> > +     pc->sched.priority = I915_PRIORITY_NORMAL;
> > +
> > +     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
> > +             pc->single_timeline = true;
>
> Bit of a bikeshed, but I'd put the error checking in here too and deal with
> the unwind pain with the usual goto proto_close. That should also make the
> ppgtt unwind path a bit clearer because it sticks out in the standard way.

Sure.  Can do.

> > +
> > +     return pc;
> > +}
> > +
> >  static struct i915_address_space *
> >  context_get_vm_rcu(struct i915_gem_context *ctx)
> >  {
> > @@ -660,7 +749,8 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
> >  }
> >
> >  static struct i915_gem_context *
> > -__create_context(struct drm_i915_private *i915)
> > +__create_context(struct drm_i915_private *i915,
> > +              const struct i915_gem_proto_context *pc)
> >  {
> >       struct i915_gem_context *ctx;
> >       struct i915_gem_engines *e;
> > @@ -673,7 +763,7 @@ __create_context(struct drm_i915_private *i915)
> >
> >       kref_init(&ctx->ref);
> >       ctx->i915 = i915;
> > -     ctx->sched.priority = I915_PRIORITY_NORMAL;
> > +     ctx->sched = pc->sched;
> >       mutex_init(&ctx->mutex);
> >       INIT_LIST_HEAD(&ctx->link);
> >
> > @@ -696,9 +786,7 @@ __create_context(struct drm_i915_private *i915)
> >        * is no remap info, it will be a NOP. */
> >       ctx->remap_slice = ALL_L3_SLICES(i915);
> >
> > -     i915_gem_context_set_bannable(ctx);
> > -     i915_gem_context_set_recoverable(ctx);
> > -     __context_set_persistence(ctx, true /* cgroup hook? */);
> > +     ctx->user_flags = pc->user_flags;
> >
> >       for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
> >               ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
> > @@ -786,38 +874,23 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
> >  }
> >
> >  static struct i915_gem_context *
> > -i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > +i915_gem_create_context(struct drm_i915_private *i915,
> > +                     const struct i915_gem_proto_context *pc)
> >  {
> >       struct i915_gem_context *ctx;
> >       int ret;
> >
> > -     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> > -         !HAS_EXECLISTS(i915))
> > -             return ERR_PTR(-EINVAL);
> > -
> > -     ctx = __create_context(i915);
> > +     ctx = __create_context(i915, pc);
> >       if (IS_ERR(ctx))
> >               return ctx;
> >
> > -     if (HAS_FULL_PPGTT(i915)) {
> > -             struct i915_ppgtt *ppgtt;
> > -
> > -             ppgtt = i915_ppgtt_create(&i915->gt);
> > -             if (IS_ERR(ppgtt)) {
> > -                     drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
> > -                             PTR_ERR(ppgtt));
> > -                     context_close(ctx);
> > -                     return ERR_CAST(ppgtt);
> > -             }
> > -
> > +     if (pc->vm) {
> >               mutex_lock(&ctx->mutex);
>
> I guess this dies later, but this mutex_lock here is superflous since
> right now no one else can get at our ctx struct. And nothing in
> __assign_ppgtt checks for us holding the lock.
>
> But fine if it only gets remove in the vm immutable patch.

Yeah, I think it gets dropped in the immutable patch.  I just didn't
want to perturb things more than necessary in this one.

> > -             __assign_ppgtt(ctx, &ppgtt->vm);
> > +             __assign_ppgtt(ctx, pc->vm);
> >               mutex_unlock(&ctx->mutex);
> > -
> > -             i915_vm_put(&ppgtt->vm);
> >       }
> >
> > -     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> > +     if (pc->single_timeline) {
> >               ret = drm_syncobj_create(&ctx->syncobj,
> >                                        DRM_SYNCOBJ_CREATE_SIGNALED,
> >                                        NULL);
> > @@ -883,6 +956,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >                         struct drm_file *file)
> >  {
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> >       u32 id;
> > @@ -892,7 +966,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >       /* 0 reserved for invalid/unassigned ppgtt */
> >       xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
> >
> > -     ctx = i915_gem_create_context(i915, 0);
> > +     pc = proto_context_create(i915, 0);
> > +     if (IS_ERR(pc)) {
> > +             err = PTR_ERR(pc);
> > +             goto err;
> > +     }
> > +
> > +     ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ctx)) {
> >               err = PTR_ERR(ctx);
> >               goto err;
> > @@ -1884,6 +1965,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >  {
> >       struct drm_i915_private *i915 = to_i915(dev);
> >       struct drm_i915_gem_context_create_ext *args = data;
> > +     struct i915_gem_proto_context *pc;
> >       struct create_ext ext_data;
> >       int ret;
> >       u32 id;
> > @@ -1906,7 +1988,12 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >               return -EIO;
> >       }
> >
> > -     ext_data.ctx = i915_gem_create_context(i915, args->flags);
> > +     pc = proto_context_create(i915, args->flags);
> > +     if (IS_ERR(pc))
> > +             return PTR_ERR(pc);
> > +
> > +     ext_data.ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ext_data.ctx))
> >               return PTR_ERR(ext_data.ctx);
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index df76767f0c41b..a42c429f94577 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -46,6 +46,27 @@ struct i915_gem_engines_iter {
> >       const struct i915_gem_engines *engines;
> >  };
> >
> > +/**
> > + * struct i915_gem_proto_context - prototype context
> > + *
> > + * The struct i915_gem_proto_context represents the creation parameters for
> > + * an i915_gem_context.  This is used to gather parameters provided either
> > + * through creation flags or via SET_CONTEXT_PARAM so that, when we create
> > + * the final i915_gem_context, those parameters can be immutable.
>
> The patch that puts them on an xa should explain how the locking here
> works, even if it's rather trivial.
>
> > + */
> > +struct i915_gem_proto_context {
> > +     /** @vm: See i915_gem_context::vm */
> > +     struct i915_address_space *vm;
> > +
> > +     /** @user_flags: See i915_gem_context::user_flags */
> > +     unsigned long user_flags;
> > +
> > +     /** @sched: See i915_gem_context::sched */
> > +     struct i915_sched_attr sched;
> > +
>
> To avoid the kerneldoc warning, point at your emulated syncobj here.

Done.

> Also this file isn't included in the i915 context docs (why would it, the
> docs have been left dead for years after all :-/). Please fix that in a
> prep patch.

Will do.

--Jason

> > +     bool single_timeline;
> > +};
> > +
> >  /**
> >   * struct i915_gem_context - client state
> >   *
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > index 51b5a3421b400..e0f512ef7f3c6 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > @@ -80,11 +80,17 @@ void mock_init_contexts(struct drm_i915_private *i915)
> >  struct i915_gem_context *
> >  live_context(struct drm_i915_private *i915, struct file *file)
> >  {
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> >       u32 id;
> >
> > -     ctx = i915_gem_create_context(i915, 0);
> > +     pc = proto_context_create(i915, 0);
> > +     if (IS_ERR(pc))
> > +             return ERR_CAST(pc);
> > +
> > +     ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ctx))
> >               return ctx;
> >
> > @@ -142,8 +148,14 @@ struct i915_gem_context *
> >  kernel_context(struct drm_i915_private *i915)
> >  {
> >       struct i915_gem_context *ctx;
> > +     struct i915_gem_proto_context *pc;
> > +
> > +     pc = proto_context_create(i915, 0);
> > +     if (IS_ERR(pc))
> > +             return ERR_CAST(pc);
> >
> > -     ctx = i915_gem_create_context(i915, 0);
> > +     ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ctx))
> >               return ctx;
>
> With all comments addressed: Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> >
> > --
> > 2.31.1
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [Intel-gfx] [PATCH 13/21] drm/i915/gem: Add an intermediate proto_context struct
@ 2021-04-29 16:44       ` Jason Ekstrand
  0 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 16:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 8:02 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> The commit introducing a new data structure really should have a solid
> intro in the commit message about why it exists. Please cover:
>
> - that ctx really should be immutable, safe for exceptions like priority
>
> - that unfortunately we butchered the uapi with setparam and sharing
>   setparams between create_ext and setparam
>
> - and how exactly proto ctx fixes this (with stuff like locking design
>   used)
>
> Maybe also dupe the kerneldoc into here for completeness.
> On Fri, Apr 23, 2021 at 05:31:23PM -0500, Jason Ekstrand wrote:
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 143 ++++++++++++++----
> >  .../gpu/drm/i915/gem/i915_gem_context_types.h |  21 +++
> >  .../gpu/drm/i915/gem/selftests/mock_context.c |  16 +-
>
> I'm wondering whether in the end we should split out the proto_ctx into
> its own file, with the struct private only to itself. But I guess
> impossible during the transition, and also maybe afterwards?
>
> >  3 files changed, 150 insertions(+), 30 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index e5efd22c89ba2..3e883daab93bf 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -191,6 +191,95 @@ static int validate_priority(struct drm_i915_private *i915,
> >       return 0;
> >  }
> >
> > +static void proto_context_close(struct i915_gem_proto_context *pc)
> > +{
> > +     if (pc->vm)
> > +             i915_vm_put(pc->vm);
> > +     kfree(pc);
> > +}
> > +
> > +static int proto_context_set_persistence(struct drm_i915_private *i915,
> > +                                      struct i915_gem_proto_context *pc,
> > +                                      bool persist)
> > +{
> > +     if (test_bit(UCONTEXT_PERSISTENCE, &pc->user_flags) == persist)
> > +             return 0;
>
> We have compilers to optimize this kind of stuff, pls remove :-)
> Especially with the non-atomic bitops there's no point.

I thought at one point that this did have a purpose.  However, now
that I look at it harder, I'm pretty sure it doesn't.  Will drop.

> > +
> > +     if (persist) {
> > +             /*
> > +              * Only contexts that are short-lived [that will expire or be
> > +              * reset] are allowed to survive past termination. We require
> > +              * hangcheck to ensure that the persistent requests are healthy.
> > +              */
> > +             if (!i915->params.enable_hangcheck)
> > +                     return -EINVAL;
> > +
> > +             set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
>
> It's a bit entertaining, but the bitops in the kernel are atomic. Which is
> hella confusing here.
>
> I think open coding is the standard for truly normal bitops.

There's __set_bit if you'd rather I use that.
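For illustration, the non-atomic variants differ from set_bit()/clear_bit() only in that they compile down to a plain read-modify-write with no atomicity guarantee, which is fine here because the proto-context is still private to its creator. A minimal userspace sketch of their semantics (the names mirror the kernel's bitops API, but this is a simplification, not the kernel's implementation):

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))
#define BIT(nr) (1UL << (nr))

/*
 * Non-atomic analogues of the kernel's __set_bit()/__clear_bit()/test_bit().
 * Safe only when the word is not concurrently modified.
 */
static inline void __set_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / BITS_PER_LONG] |= BIT(nr % BITS_PER_LONG);
}

static inline void __clear_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / BITS_PER_LONG] &= ~BIT(nr % BITS_PER_LONG);
}

static inline int test_bit(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1;
}
```

Setting a bit that is already set is a no-op either way, which is also why the early-return test_bit check above buys nothing.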

> > +     } else {
> > +             /* To cancel a context we use "preempt-to-idle" */
> > +             if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
> > +                     return -ENODEV;
> > +
> > +             /*
> > +              * If the cancel fails, we then need to reset, cleanly!
> > +              *
> > +              * If the per-engine reset fails, all hope is lost! We resort
> > +              * to a full GPU reset in that unlikely case, but realistically
> > +              * if the engine could not reset, the full reset does not fare
> > +              * much better. The damage has been done.
> > +              *
> > +              * However, if we cannot reset an engine by itself, we cannot
> > +              * cleanup a hanging persistent context without causing
> > +              * collateral damage, and we should not pretend we can by
> > +              * exposing the interface.
> > +              */
> > +             if (!intel_has_reset_engine(&i915->gt))
> > +                     return -ENODEV;
> > +
> > +             clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
>
> Same here.
>
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static struct i915_gem_proto_context *
> > +proto_context_create(struct drm_i915_private *i915, unsigned int flags)
> > +{
> > +     struct i915_gem_proto_context *pc;
> > +
> > +     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> > +         !HAS_EXECLISTS(i915))
> > +             return ERR_PTR(-EINVAL);
> > +
> > +     pc = kzalloc(sizeof(*pc), GFP_KERNEL);
> > +     if (!pc)
> > +             return ERR_PTR(-ENOMEM);
> > +
> > +     if (HAS_FULL_PPGTT(i915)) {
> > +             struct i915_ppgtt *ppgtt;
> > +
> > +             ppgtt = i915_ppgtt_create(&i915->gt);
> > +             if (IS_ERR(ppgtt)) {
> > +                     drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
> > +                             PTR_ERR(ppgtt));
> > +                     proto_context_close(pc);
> > +                     return ERR_CAST(ppgtt);
> > +             }
> > +             pc->vm = &ppgtt->vm;
> > +     }
> > +
> > +     pc->user_flags = 0;
> > +     set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> > +     set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
>
> Same about atomic bitops here.

Changed to just initialize to BANNABLE | RECOVERABLE.
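Concretely, that replaces the pair of set_bit() calls with a single assignment, something along these lines (the flag bit numbers here are illustrative stand-ins, not i915's actual values):

```c
#include <assert.h>

#define BIT(nr) (1UL << (nr))

/* Illustrative flag bits; the real values live in i915_gem_context_types.h. */
#define UCONTEXT_BANNABLE    2
#define UCONTEXT_RECOVERABLE 3

/*
 * Instead of:
 *	pc->user_flags = 0;
 *	set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
 *	set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
 * initialize the word in one go:
 */
static unsigned long default_user_flags(void)
{
	return BIT(UCONTEXT_BANNABLE) | BIT(UCONTEXT_RECOVERABLE);
}
```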

> > +     proto_context_set_persistence(i915, pc, true);
> > +     pc->sched.priority = I915_PRIORITY_NORMAL;
> > +
> > +     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
> > +             pc->single_timeline = true;
>
> a bit of a bikeshed, but I'd put the error checking in here too and deal with
> the unwind pain with the usual goto proto_close. That should also make the
> ppgtt unwind path a bit clearer because it sticks out in the standard way.

Sure.  Can do.
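The resulting shape would be the usual goto-unwind pattern, roughly like this (a standalone sketch with stand-in types and flags, not the actual i915 code; malloc stands in for i915_ppgtt_create):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-ins for the real i915 types and create flags. */
#define CREATE_FLAG_SINGLE_TIMELINE (1u << 0)
#define CREATE_FLAG_INVALID         (1u << 1)

struct proto_context {
	void *vm;
	bool single_timeline;
};

static void proto_context_close(struct proto_context *pc)
{
	free(pc->vm);
	free(pc);
}

/*
 * Returns NULL and sets *err on failure; every error path after allocation
 * funnels through the single proto_close label, so the unwind sticks out in
 * the standard way.
 */
static struct proto_context *proto_context_create(unsigned int flags, int *err)
{
	struct proto_context *pc = calloc(1, sizeof(*pc));

	*err = -ENOMEM;
	if (!pc)
		return NULL;

	pc->vm = malloc(64);	/* stands in for i915_ppgtt_create() */
	if (!pc->vm)
		goto proto_close;

	*err = -EINVAL;
	if (flags & CREATE_FLAG_INVALID)
		goto proto_close;

	if (flags & CREATE_FLAG_SINGLE_TIMELINE)
		pc->single_timeline = true;

	*err = 0;
	return pc;

proto_close:
	proto_context_close(pc);
	return NULL;
}
```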

> > +
> > +     return pc;
> > +}
> > +
> >  static struct i915_address_space *
> >  context_get_vm_rcu(struct i915_gem_context *ctx)
> >  {
> > @@ -660,7 +749,8 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
> >  }
> >
> >  static struct i915_gem_context *
> > -__create_context(struct drm_i915_private *i915)
> > +__create_context(struct drm_i915_private *i915,
> > +              const struct i915_gem_proto_context *pc)
> >  {
> >       struct i915_gem_context *ctx;
> >       struct i915_gem_engines *e;
> > @@ -673,7 +763,7 @@ __create_context(struct drm_i915_private *i915)
> >
> >       kref_init(&ctx->ref);
> >       ctx->i915 = i915;
> > -     ctx->sched.priority = I915_PRIORITY_NORMAL;
> > +     ctx->sched = pc->sched;
> >       mutex_init(&ctx->mutex);
> >       INIT_LIST_HEAD(&ctx->link);
> >
> > @@ -696,9 +786,7 @@ __create_context(struct drm_i915_private *i915)
> >        * is no remap info, it will be a NOP. */
> >       ctx->remap_slice = ALL_L3_SLICES(i915);
> >
> > -     i915_gem_context_set_bannable(ctx);
> > -     i915_gem_context_set_recoverable(ctx);
> > -     __context_set_persistence(ctx, true /* cgroup hook? */);
> > +     ctx->user_flags = pc->user_flags;
> >
> >       for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
> >               ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
> > @@ -786,38 +874,23 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
> >  }
> >
> >  static struct i915_gem_context *
> > -i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > +i915_gem_create_context(struct drm_i915_private *i915,
> > +                     const struct i915_gem_proto_context *pc)
> >  {
> >       struct i915_gem_context *ctx;
> >       int ret;
> >
> > -     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
> > -         !HAS_EXECLISTS(i915))
> > -             return ERR_PTR(-EINVAL);
> > -
> > -     ctx = __create_context(i915);
> > +     ctx = __create_context(i915, pc);
> >       if (IS_ERR(ctx))
> >               return ctx;
> >
> > -     if (HAS_FULL_PPGTT(i915)) {
> > -             struct i915_ppgtt *ppgtt;
> > -
> > -             ppgtt = i915_ppgtt_create(&i915->gt);
> > -             if (IS_ERR(ppgtt)) {
> > -                     drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
> > -                             PTR_ERR(ppgtt));
> > -                     context_close(ctx);
> > -                     return ERR_CAST(ppgtt);
> > -             }
> > -
> > +     if (pc->vm) {
> >               mutex_lock(&ctx->mutex);
>
> I guess this dies later, but this mutex_lock here is superfluous since
> right now no one else can get at our ctx struct. And nothing in
> __assign_ppgtt checks for us holding the lock.
>
> But fine if it only gets remove in the vm immutable patch.

Yeah, I think it gets dropped in the immutable patch.  I just didn't
want to perturb things more than necessary in this one.

> > -             __assign_ppgtt(ctx, &ppgtt->vm);
> > +             __assign_ppgtt(ctx, pc->vm);
> >               mutex_unlock(&ctx->mutex);
> > -
> > -             i915_vm_put(&ppgtt->vm);
> >       }
> >
> > -     if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
> > +     if (pc->single_timeline) {
> >               ret = drm_syncobj_create(&ctx->syncobj,
> >                                        DRM_SYNCOBJ_CREATE_SIGNALED,
> >                                        NULL);
> > @@ -883,6 +956,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >                         struct drm_file *file)
> >  {
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> >       u32 id;
> > @@ -892,7 +966,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >       /* 0 reserved for invalid/unassigned ppgtt */
> >       xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
> >
> > -     ctx = i915_gem_create_context(i915, 0);
> > +     pc = proto_context_create(i915, 0);
> > +     if (IS_ERR(pc)) {
> > +             err = PTR_ERR(pc);
> > +             goto err;
> > +     }
> > +
> > +     ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ctx)) {
> >               err = PTR_ERR(ctx);
> >               goto err;
> > @@ -1884,6 +1965,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >  {
> >       struct drm_i915_private *i915 = to_i915(dev);
> >       struct drm_i915_gem_context_create_ext *args = data;
> > +     struct i915_gem_proto_context *pc;
> >       struct create_ext ext_data;
> >       int ret;
> >       u32 id;
> > @@ -1906,7 +1988,12 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >               return -EIO;
> >       }
> >
> > -     ext_data.ctx = i915_gem_create_context(i915, args->flags);
> > +     pc = proto_context_create(i915, args->flags);
> > +     if (IS_ERR(pc))
> > +             return PTR_ERR(pc);
> > +
> > +     ext_data.ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ext_data.ctx))
> >               return PTR_ERR(ext_data.ctx);
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index df76767f0c41b..a42c429f94577 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -46,6 +46,27 @@ struct i915_gem_engines_iter {
> >       const struct i915_gem_engines *engines;
> >  };
> >
> > +/**
> > + * struct i915_gem_proto_context - prototype context
> > + *
> > + * The struct i915_gem_proto_context represents the creation parameters for
> > + * an i915_gem_context.  This is used to gather parameters provided either
> > + * through creation flags or via SET_CONTEXT_PARAM so that, when we create
> > + * the final i915_gem_context, those parameters can be immutable.
>
> The patch that puts them on an xa should explain how the locking here
> works, even if it's rather trivial.
>
> > + */
> > +struct i915_gem_proto_context {
> > +     /** @vm: See i915_gem_context::vm */
> > +     struct i915_address_space *vm;
> > +
> > +     /** @user_flags: See i915_gem_context::user_flags */
> > +     unsigned long user_flags;
> > +
> > +     /** @sched: See i915_gem_context::sched */
> > +     struct i915_sched_attr sched;
> > +
>
> To avoid the kerneldoc warning point at your emulated syncobj here.

Done.

> Also this file isn't included in the i915 context docs (why would it, the
> docs have been left dead for years after all :-/). Please fix that in a
> prep patch.

Will do.

--Jason

> > +     bool single_timeline;
> > +};
> > +
> >  /**
> >   * struct i915_gem_context - client state
> >   *
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > index 51b5a3421b400..e0f512ef7f3c6 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > @@ -80,11 +80,17 @@ void mock_init_contexts(struct drm_i915_private *i915)
> >  struct i915_gem_context *
> >  live_context(struct drm_i915_private *i915, struct file *file)
> >  {
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> >       u32 id;
> >
> > -     ctx = i915_gem_create_context(i915, 0);
> > +     pc = proto_context_create(i915, 0);
> > +     if (IS_ERR(pc))
> > +             return ERR_CAST(pc);
> > +
> > +     ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ctx))
> >               return ctx;
> >
> > @@ -142,8 +148,14 @@ struct i915_gem_context *
> >  kernel_context(struct drm_i915_private *i915)
> >  {
> >       struct i915_gem_context *ctx;
> > +     struct i915_gem_proto_context *pc;
> > +
> > +     pc = proto_context_create(i915, 0);
> > +     if (IS_ERR(pc))
> > +             return ERR_CAST(pc);
> >
> > -     ctx = i915_gem_create_context(i915, 0);
> > +     ctx = i915_gem_create_context(i915, pc);
> > +     proto_context_close(pc);
> >       if (IS_ERR(ctx))
> >               return ctx;
>
> With all comments addressed: Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> >
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-29 14:54           ` Jason Ekstrand
@ 2021-04-29 17:12             ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:12 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Tvrtko Ursulin, Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 09:54:15AM -0500, Jason Ekstrand wrote:
> On Thu, Apr 29, 2021 at 3:04 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
> >
> >
> > On 28/04/2021 18:24, Jason Ekstrand wrote:
> > > On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
> > > <tvrtko.ursulin@linux.intel.com> wrote:
> > >> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > >>> Instead of handling it like a context param, unconditionally set it when
> > >>> intel_contexts are created.  This doesn't fix anything but does simplify
> > >>> the code a bit.
> > >>>
> > >>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > >>> ---
> > >>>    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
> > >>>    .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
> > >>>    drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
> > >>>    3 files changed, 6 insertions(+), 44 deletions(-)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > >>> index 35bcdeddfbf3f..1091cc04a242a 100644
> > >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > >>> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
> > >>>            intel_engine_has_timeslices(ce->engine))
> > >>>                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> > >>>
> > >>> -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> > >>> +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> > >>> +         ctx->i915->params.request_timeout_ms) {
> > >>> +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> > >>> +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> > >>
> > >> Blank line between declarations and code please, or just lose the local.
> > >>
> > >> Otherwise looks okay. Slight change that same GEM context can now have a
> > >> mix of different request expirations isn't interesting I think. At least
> > >> the change goes away by the end of the series.
> > >
> > > In order for that to happen, I think you'd have to have a race between
> > > CREATE_CONTEXT and someone smashing the request_timeout_ms param via
> > > sysfs.  Or am I missing something?  Given that timeouts are really
> > > per-engine anyway, I don't think we need to care too much about that.
> >
> > We don't care, no.
> >
> > For completeness only - by the end of the series it is what you say. But
> > at _this_ point in the series though it is if modparam changes at any
> > point between context create and replacing engines. Which is a change
> > compared to before this patch, since modparam was cached in the GEM
> > context so far. So one GEM context was a single request_timeout_ms.
> 
> I've added the following to the commit message:
> 
> It also means that sync files exported from different engines on a
> SINGLE_TIMELINE context will have different fence contexts.  This is
> visible to userspace if it looks at the obj_name field of
> sync_fence_info.
> 
> How's that sound?

If you add "Which the media driver, as the sole user of this, doesn't do" then I
think it's perfect.
-Daniel

> 
> --Jason
> 
> > Regards,
> >
> > Tvrtko
> >
> > > --Jason
> > >
> > >> Regards,
> > >>
> > >> Tvrtko
> > >>
> > >>> +     }
> > >>>    }
> > >>>
> > >>>    static void __free_engines(struct i915_gem_engines *e, unsigned int count)
> > >>> @@ -792,41 +796,6 @@ static void __assign_timeline(struct i915_gem_context *ctx,
> > >>>        context_apply_all(ctx, __apply_timeline, timeline);
> > >>>    }
> > >>>
> > >>> -static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
> > >>> -{
> > >>> -     return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
> > >>> -}
> > >>> -
> > >>> -static int
> > >>> -__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
> > >>> -{
> > >>> -     int ret;
> > >>> -
> > >>> -     ret = context_apply_all(ctx, __apply_watchdog,
> > >>> -                             (void *)(uintptr_t)timeout_us);
> > >>> -     if (!ret)
> > >>> -             ctx->watchdog.timeout_us = timeout_us;
> > >>> -
> > >>> -     return ret;
> > >>> -}
> > >>> -
> > >>> -static void __set_default_fence_expiry(struct i915_gem_context *ctx)
> > >>> -{
> > >>> -     struct drm_i915_private *i915 = ctx->i915;
> > >>> -     int ret;
> > >>> -
> > >>> -     if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
> > >>> -         !i915->params.request_timeout_ms)
> > >>> -             return;
> > >>> -
> > >>> -     /* Default expiry for user fences. */
> > >>> -     ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
> > >>> -     if (ret)
> > >>> -             drm_notice(&i915->drm,
> > >>> -                        "Failed to configure default fence expiry! (%d)",
> > >>> -                        ret);
> > >>> -}
> > >>> -
> > >>>    static struct i915_gem_context *
> > >>>    i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > >>>    {
> > >>> @@ -871,8 +840,6 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
> > >>>                intel_timeline_put(timeline);
> > >>>        }
> > >>>
> > >>> -     __set_default_fence_expiry(ctx);
> > >>> -
> > >>>        trace_i915_context_create(ctx);
> > >>>
> > >>>        return ctx;
> > >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > >>> index 5ae71ec936f7c..676592e27e7d2 100644
> > >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > >>> @@ -153,10 +153,6 @@ struct i915_gem_context {
> > >>>         */
> > >>>        atomic_t active_count;
> > >>>
> > >>> -     struct {
> > >>> -             u64 timeout_us;
> > >>> -     } watchdog;
> > >>> -
> > >>>        /**
> > >>>         * @hang_timestamp: The last time(s) this context caused a GPU hang
> > >>>         */
> > >>> diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
> > >>> index dffedd983693d..0c69cb42d075c 100644
> > >>> --- a/drivers/gpu/drm/i915/gt/intel_context_param.h
> > >>> +++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
> > >>> @@ -10,11 +10,10 @@
> > >>>
> > >>>    #include "intel_context.h"
> > >>>
> > >>> -static inline int
> > >>> +static inline void
> > >>>    intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
> > >>>    {
> > >>>        ce->watchdog.timeout_us = timeout_us;
> > >>> -     return 0;
> > >>>    }
> > >>>
> > >>>    #endif /* INTEL_CONTEXT_PARAM_H */
> > >>>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-29 17:12             ` Daniel Vetter
@ 2021-04-29 17:13               ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:13 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Tvrtko Ursulin, Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 07:12:05PM +0200, Daniel Vetter wrote:
> On Thu, Apr 29, 2021 at 09:54:15AM -0500, Jason Ekstrand wrote:
> > On Thu, Apr 29, 2021 at 3:04 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> > >
> > >
> > > On 28/04/2021 18:24, Jason Ekstrand wrote:
> > > > On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
> > > > <tvrtko.ursulin@linux.intel.com> wrote:
> > > >> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > > >>> Instead of handling it like a context param, unconditionally set it when
> > > >>> intel_contexts are created.  This doesn't fix anything but does simplify
> > > >>> the code a bit.
> > > >>>
> > > >>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > >>> ---
> > > >>>    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
> > > >>>    .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
@ 2021-04-29 17:13               ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:13 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Mailing list - DRI developers

On Thu, Apr 29, 2021 at 07:12:05PM +0200, Daniel Vetter wrote:
> On Thu, Apr 29, 2021 at 09:54:15AM -0500, Jason Ekstrand wrote:
> > On Thu, Apr 29, 2021 at 3:04 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> > >
> > >
> > > On 28/04/2021 18:24, Jason Ekstrand wrote:
> > > > On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
> > > > <tvrtko.ursulin@linux.intel.com> wrote:
> > > >> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > > >>> Instead of handling it like a context param, unconditionally set it when
> > > >>> intel_contexts are created.  This doesn't fix anything but does simplify
> > > >>> the code a bit.
> > > >>>
> > > >>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > >>> ---
> > > >>>    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
> > > >>>    .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
> > > >>>    drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
> > > >>>    3 files changed, 6 insertions(+), 44 deletions(-)
> > > >>>
> > > >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > >>> index 35bcdeddfbf3f..1091cc04a242a 100644
> > > >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > >>> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
> > > >>>            intel_engine_has_timeslices(ce->engine))
> > > >>>                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> > > >>>
> > > >>> -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> > > >>> +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> > > >>> +         ctx->i915->params.request_timeout_ms) {
> > > >>> +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> > > >>> +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> > > >>
> > > >> Blank line between declarations and code please, or just lose the local.
> > > >>
> > > >> Otherwise looks okay. The slight change that the same GEM context can now
> > > >> have a mix of different request expirations isn't interesting, I think. At
> > > >> least the change goes away by the end of the series.
> > > >
> > > > In order for that to happen, I think you'd have to have a race between
> > > > CREATE_CONTEXT and someone smashing the request_timeout_ms param via
> > > > sysfs.  Or am I missing something?  Given that timeouts are really
> > > > per-engine anyway, I don't think we need to care too much about that.
> > >
> > > We don't care, no.
> > >
> > > For completeness only - by the end of the series it is what you say. But
> > > at _this_ point in the series though it is if modparam changes at any
> > > point between context create and replacing engines. Which is a change
> > > compared to before this patch, since modparam was cached in the GEM
> > > context so far. So one GEM context was a single request_timeout_ms.
> > 
> > I've added the following to the commit message:
> > 
> > It also means that sync files exported from different engines on a
> > SINGLE_TIMELINE context will have different fence contexts.  This is
> > visible to userspace if it looks at the obj_name field of
> > sync_fence_info.
> > 
> > How's that sound?
> 
> If you add "Which media-driver as the sole user of this doesn't do" then I
> think it's perfect.

Uh I think you replied to the wrong thread :-)

This here is about watchdog, not timeline.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread
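
The conversion in the patch above takes the module parameter, which is in
milliseconds, and feeds it to intel_context_set_watchdog_us(), which wants
microseconds. A minimal sketch of why the widening cast matters (the helper
name here is a stand-in, not the real i915 function):

```c
#include <stdint.h>

/*
 * Sketch of the ms-to-us conversion from the hunk above.  Casting to
 * u64 *before* the multiply keeps a large timeout_ms from overflowing
 * 32-bit arithmetic; without the cast, timeout_ms * 1000 would wrap
 * for values above ~4.29 million ms.
 */
static uint64_t request_timeout_us(unsigned int timeout_ms)
{
	return (uint64_t)timeout_ms * 1000;
}
```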

* Re: [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-29 16:02               ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 17:14                 ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:14 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Matthew Brost, Intel GFX, Mailing list - DRI developers

On Thu, Apr 29, 2021 at 11:02:27AM -0500, Jason Ekstrand wrote:
> On Thu, Apr 29, 2021 at 7:16 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Wed, Apr 28, 2021 at 01:58:17PM -0500, Jason Ekstrand wrote:
> > > On Wed, Apr 28, 2021 at 12:18 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > >
> > > > On Wed, Apr 28, 2021 at 5:13 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Tue, Apr 27, 2021 at 08:51:08AM -0500, Jason Ekstrand wrote:
> > > > > > I sent a v2 of this patch because it turns out I deleted a bit too
> > > > > > much code.  This function in particular has to stay, unfortunately.
> > > > > > When a batch is submitted with a SUBMIT_FENCE, this is used to push
> > > > > > the work onto a different engine than the one it's supposed to
> > > > > > run in parallel with.  This means we can't dead-code this function or
> > > > > > the bond_execution function pointer and related stuff.
> > > > >
> > > > > Uh that's disappointing, since if I understand your point correctly, the
> > > > > sibling engines should all be singletons, not load balancing virtual ones.
> > > > > So there really should not be any need to pick the right one at execution
> > > > > time.
> > > >
> > > > The media driver itself seems to work fine if I delete all the code.
> > > > It's just an IGT testcase that blows up.  I'll do more digging to see
> > > > if I can better isolate why.
> > >
> > > I did more digging and I figured out why this test hangs.  The test
> > > looks at an engine class where there's more than one of that class
> > > (currently only vcs) and creates a context where engine[0] is all of
> > > the engines of that class bonded together and engine[1-N] is each of
> > > those engines individually.  It then tests that you can submit a batch
> > > to one of the individual engines and then submit with
> > > EXEC_FENCE_SUBMIT to the balanced engine and the kernel will sort it
> > > out.  This doesn't seem like a use-case we care about.
> > >
> > > If we cared about anything, I would expect it to be submitting to two
> > > balanced contexts and expecting "pick any two" behavior.  But that's
> > > not what the test is testing for.
> >
> > Yeah ditch it.
> >
> > Instead make sure that the bonded setparam/ctx validation makes sure that
> > 1) no virtual engines are used
> > 2) no engine used twice
> > 3) anything else stupid you can come up with that we should make sure is
> > blocked.
> 
> I've re-introduced the deletion and I'll add nuking that test to my
> IGT series.  I did it as a separate patch as the FENCE_SUBMIT logic
> and the bonding are somewhat separate concerns.
> 
> As far as validation goes, I don't think we need any more for this
> case.  You used FENCE_SUBMIT and didn't properly isolate things such
> that the two run on different engines.  Not our problem.

Oh I just meant validating the bonded ctx extension thing. Not validating
submit fence, that's rather hopeless since it really allows anything you
can think of, by design.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread
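
The validation Daniel asks for on the bonded-engine extension boils down to
two rules: no virtual engines as bond targets, and no physical engine listed
twice. A hedged sketch, with made-up engine IDs and flags standing in for the
real i915_context_engines_bond handling:

```c
/*
 * Illustrative model of the bond validation rules from the thread above.
 * "struct engine" and the -EINVAL literal are stand-ins; the real checks
 * would live in the setparam path of i915_gem_context.c.
 */
struct engine {
	int id;
	int is_virtual;
};

static int validate_bond(const struct engine *siblings, int n)
{
	int i, j;

	for (i = 0; i < n; i++) {
		/* Rule 1: bonding with virtual engines is not allowed. */
		if (siblings[i].is_virtual)
			return -22; /* -EINVAL */
		/* Rule 2: the same engine must not appear twice. */
		for (j = 0; j < i; j++)
			if (siblings[j].id == siblings[i].id)
				return -22; /* -EINVAL */
	}
	return 0;
}
```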

* Re: [Intel-gfx] [PATCH 14/21] drm/i915/gem: Return an error ptr from context_lookup
  2021-04-29 15:29       ` Jason Ekstrand
@ 2021-04-29 17:16         ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:16 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Mailing list - DRI developers

On Thu, Apr 29, 2021 at 10:29:51AM -0500, Jason Ekstrand wrote:
> On Thu, Apr 29, 2021 at 8:27 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Fri, Apr 23, 2021 at 05:31:24PM -0500, Jason Ekstrand wrote:
> > > We're about to start doing lazy context creation which means contexts
> > > get created in i915_gem_context_lookup and we may start having more
> > > errors than -ENOENT.
> > >
> > > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_context.c    | 12 ++++++------
> > >  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |  4 ++--
> > >  drivers/gpu/drm/i915/i915_drv.h                |  2 +-
> > >  drivers/gpu/drm/i915/i915_perf.c               |  4 ++--
> > >  4 files changed, 11 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index 3e883daab93bf..7929d5a8be449 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -2105,8 +2105,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
> > >       int ret = 0;
> > >
> > >       ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> > > -     if (!ctx)
> > > -             return -ENOENT;
> > > +     if (IS_ERR(ctx))
> > > +             return PTR_ERR(ctx);
> > >
> > >       switch (args->param) {
> > >       case I915_CONTEXT_PARAM_GTT_SIZE:
> > > @@ -2174,8 +2174,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> > >       int ret;
> > >
> > >       ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> > > -     if (!ctx)
> > > -             return -ENOENT;
> > > +     if (IS_ERR(ctx))
> > > +             return PTR_ERR(ctx);
> > >
> > >       ret = ctx_setparam(file_priv, ctx, args);
> > >
> > > @@ -2194,8 +2194,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
> > >               return -EINVAL;
> > >
> > >       ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
> > > -     if (!ctx)
> > > -             return -ENOENT;
> > > +     if (IS_ERR(ctx))
> > > +             return PTR_ERR(ctx);
> > >
> > >       /*
> > >        * We opt for unserialised reads here. This may result in tearing
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > index 7024adcd5cf15..de14b26f3b2d5 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > @@ -739,8 +739,8 @@ static int eb_select_context(struct i915_execbuffer *eb)
> > >       struct i915_gem_context *ctx;
> > >
> > >       ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
> > > -     if (unlikely(!ctx))
> > > -             return -ENOENT;
> > > +     if (unlikely(IS_ERR(ctx)))
> > > +             return PTR_ERR(ctx);
> > >
> > >       eb->gem_context = ctx;
> > >       if (rcu_access_pointer(ctx->vm))
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 8571c5c1509a7..004ed0e59c999 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> >
> > I just realized that I think __i915_gem_context_lookup_rcu doesn't have
> > users anymore. Please make sure it's deleted.
> 
> I deleted it in "drm/i915: Stop manually RCU banging in reset_stats_ioctl"

Indeed that's the case, so looks all fine. The benefits of reviewing
patches one-by-one :-/
-Daniel

> 
> 
> > > @@ -1851,7 +1851,7 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > >               ctx = NULL;
> > >       rcu_read_unlock();
> > >
> > > -     return ctx;
> > > +     return ctx ? ctx : ERR_PTR(-ENOENT);
> > >  }
> > >
> > >  /* i915_gem_evict.c */
> > > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> > > index 85ad62dbabfab..b86ed03f6a705 100644
> > > --- a/drivers/gpu/drm/i915/i915_perf.c
> > > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > > @@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
> > >               struct drm_i915_file_private *file_priv = file->driver_priv;
> > >
> > >               specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
> > > -             if (!specific_ctx) {
> > > +             if (IS_ERR(specific_ctx)) {
> > >                       DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
> > >                                 ctx_handle);
> > > -                     ret = -ENOENT;
> > > +                     ret = PTR_ERR(specific_ctx);
> >
> > Yeah this looks like a nice place to integrate this.
> >
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > One thing we need to make sure in the next patch or thereabouts is that
> > lookup can only return ENOENT or ENOMEM, but never EINVAL. I'll drop some
> > bikesheds on that :-)
> 
> I believe that is the case.  All -EINVAL should be handled in the
> proto-context code.
> 
> --Jason
> 
> > -Daniel
> >
> > >                       goto err;
> > >               }
> > >       }
> > > --
> > > 2.31.1
> > >
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread
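
The NULL-to-ERR_PTR conversion in the patch above follows the standard kernel
pattern: encode the errno in the pointer, test with IS_ERR(), and forward
whatever error the lookup produced instead of a hard-coded -ENOENT. A
self-contained userspace sketch (the helper models ERR_PTR from
include/linux/err.h; context_lookup() and its error values are hypothetical
stand-ins for i915_gem_context_lookup()):

```c
#include <stddef.h>

/* Userspace model of the kernel's ERR_PTR machinery, for illustration. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical stand-in for i915_gem_context_lookup(): with lazy
 * creation, failure modes beyond "not found" become possible. */
static void *context_lookup(int id, int table_full)
{
	static int dummy_ctx;

	if (table_full)
		return ERR_PTR(-12); /* -ENOMEM, now possible */
	if (id != 0)
		return ERR_PTR(-2);  /* -ENOENT */
	return &dummy_ctx;
}

/* Caller-side pattern after the patch: propagate the encoded errno. */
static long getparam(int id)
{
	void *ctx = context_lookup(id, 0);

	if (IS_ERR(ctx))
		return PTR_ERR(ctx);
	return 0;
}
```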

* Re: [Intel-gfx] [PATCH 20/21] i915/gem/selftests: Assign the VM at context creation in igt_shared_ctx_exec
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 17:19     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:19 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:30PM -0500, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

Maybe spend a few words on explaining why in these two selftest patches
instead of letting me guess :-)
-Daniel

> ---
>  drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> index 76029d7143f6c..76dd5cfe11b3c 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
> @@ -813,16 +813,12 @@ static int igt_shared_ctx_exec(void *arg)
>  			struct i915_gem_context *ctx;
>  			struct intel_context *ce;
>  
> -			ctx = kernel_context(i915, NULL);
> +			ctx = kernel_context(i915, ctx_vm(parent));
>  			if (IS_ERR(ctx)) {
>  				err = PTR_ERR(ctx);
>  				goto out_test;
>  			}
>  
> -			mutex_lock(&ctx->mutex);
> -			__assign_ppgtt(ctx, ctx_vm(parent));
> -			mutex_unlock(&ctx->mutex);
> -
>  			ce = i915_gem_context_get_engine(ctx, engine->legacy_idx);
>  			GEM_BUG_ON(IS_ERR(ce));
>  
> -- 
> 2.31.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread
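
The selftest change above replaces a create-then-mutate sequence (create the
context with no VM, then __assign_ppgtt() under the context mutex) with
passing the parent's VM at creation time, matching the series' goal of an
immutable VM. A tiny sketch of the shape of that change, with hypothetical
stand-ins for kernel_context() and the context/VM types:

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real types are i915_gem_context and
 * i915_address_space. */
struct vm { int id; };
struct ctx { struct vm *vm; };

/* After the change: the VM is fixed at creation, so no later
 * lock/assign/unlock dance is needed. */
static struct ctx *kernel_context(struct vm *vm)
{
	static struct ctx c;

	c.vm = vm;
	return &c;
}
```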

* Re: [PATCH 18/21] drm/i915/gem: Don't allow changing the engine set on running contexts
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 17:21     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:21 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:28PM -0500, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

I think with the additions moved in here and a commit message explaining a
bit what's going on, this looks all reasonable.

I think you should minimally explain the audit you've done here and which
userspace still uses this via setparam post CTX_CREATE_EXT. That would be
really good to have recorded for all these changes. And if that explainer
is on the proto-ctx code you're adding, it can even be found again in the
future.
-Daniel

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 301 --------------------
>  1 file changed, 301 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 3238260cffa31..ef23ab4260c24 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1722,303 +1722,6 @@ static int set_sseu(struct i915_gem_context *ctx,
>  	return ret;
>  }
>  
> -struct set_engines {
> -	struct i915_gem_context *ctx;
> -	struct i915_gem_engines *engines;
> -};
> -
* Re: [Intel-gfx] [PATCH 18/21] drm/i915/gem: Don't allow changing the engine set on running contexts
@ 2021-04-29 17:21     ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:21 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:28PM -0500, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

I think with the additions moved in here and a commit message explaining a
bit of what's going on, this all looks reasonable.

At a minimum, I think you should explain the audit you've done here and
which userspace still uses this through setparam after CTX_CREATE_EXT. That
would be really good to have recorded for all of these changes. And if that
explainer lives in the proto-ctx code you're adding, it can even be found
again in the future.
-Daniel
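[Editor's note: the direction discussed here — an engine set that is only
configurable before the context is finalized — can be illustrated in
miniature as follows. This is a hypothetical, userspace-buildable sketch;
plain ints stand in for engine specifiers, and none of these names are the
actual i915 structures.]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENGINES 64

/* Hypothetical stand-in for a proto-context: mutable until finalized. */
struct proto_ctx {
	int engines[MAX_ENGINES];	/* plain ints stand in for engine specs */
	size_t num_engines;
	int finalized;			/* set once anything has executed on it */
};

static int proto_ctx_set_engines(struct proto_ctx *pc,
				 const int *engines, size_t n)
{
	if (pc->finalized)
		return -EBUSY;		/* too late: the set is immutable now */
	if (n > MAX_ENGINES)
		return -EINVAL;
	memcpy(pc->engines, engines, n * sizeof(*engines));
	pc->num_engines = n;
	return 0;
}

/* Called on first use (e.g. execbuf): freezes the engine set for good. */
static void proto_ctx_finalize(struct proto_ctx *pc)
{
	pc->finalized = 1;
}
```

Once finalized, any further set attempt fails with -EBUSY, which is the
behavior this patch moves the real driver toward.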

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 301 --------------------
>  1 file changed, 301 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 3238260cffa31..ef23ab4260c24 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1722,303 +1722,6 @@ static int set_sseu(struct i915_gem_context *ctx,
>  	return ret;
>  }
>  
> -struct set_engines {
> -	struct i915_gem_context *ctx;
> -	struct i915_gem_engines *engines;
> -};
> -
> -static int
> -set_engines__load_balance(struct i915_user_extension __user *base, void *data)
> -{
> -	struct i915_context_engines_load_balance __user *ext =
> -		container_of_user(base, typeof(*ext), base);
> -	const struct set_engines *set = data;
> -	struct drm_i915_private *i915 = set->ctx->i915;
> -	struct intel_engine_cs *stack[16];
> -	struct intel_engine_cs **siblings;
> -	struct intel_context *ce;
> -	u16 num_siblings, idx;
> -	unsigned int n;
> -	int err;
> -
> -	if (!HAS_EXECLISTS(i915))
> -		return -ENODEV;
> -
> -	if (intel_uc_uses_guc_submission(&i915->gt.uc))
> -		return -ENODEV; /* not implement yet */
> -
> -	if (get_user(idx, &ext->engine_index))
> -		return -EFAULT;
> -
> -	if (idx >= set->engines->num_engines) {
> -		drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
> -			idx, set->engines->num_engines);
> -		return -EINVAL;
> -	}
> -
> -	idx = array_index_nospec(idx, set->engines->num_engines);
> -	if (set->engines->engines[idx]) {
> -		drm_dbg(&i915->drm,
> -			"Invalid placement[%d], already occupied\n", idx);
> -		return -EEXIST;
> -	}
> -
> -	if (get_user(num_siblings, &ext->num_siblings))
> -		return -EFAULT;
> -
> -	err = check_user_mbz(&ext->flags);
> -	if (err)
> -		return err;
> -
> -	err = check_user_mbz(&ext->mbz64);
> -	if (err)
> -		return err;
> -
> -	siblings = stack;
> -	if (num_siblings > ARRAY_SIZE(stack)) {
> -		siblings = kmalloc_array(num_siblings,
> -					 sizeof(*siblings),
> -					 GFP_KERNEL);
> -		if (!siblings)
> -			return -ENOMEM;
> -	}
> -
> -	for (n = 0; n < num_siblings; n++) {
> -		struct i915_engine_class_instance ci;
> -
> -		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
> -			err = -EFAULT;
> -			goto out_siblings;
> -		}
> -
> -		siblings[n] = intel_engine_lookup_user(i915,
> -						       ci.engine_class,
> -						       ci.engine_instance);
> -		if (!siblings[n]) {
> -			drm_dbg(&i915->drm,
> -				"Invalid sibling[%d]: { class:%d, inst:%d }\n",
> -				n, ci.engine_class, ci.engine_instance);
> -			err = -EINVAL;
> -			goto out_siblings;
> -		}
> -	}
> -
> -	ce = intel_execlists_create_virtual(siblings, n);
> -	if (IS_ERR(ce)) {
> -		err = PTR_ERR(ce);
> -		goto out_siblings;
> -	}
> -
> -	intel_context_set_gem(ce, set->ctx);
> -
> -	if (cmpxchg(&set->engines->engines[idx], NULL, ce)) {
> -		intel_context_put(ce);
> -		err = -EEXIST;
> -		goto out_siblings;
> -	}
> -
> -out_siblings:
> -	if (siblings != stack)
> -		kfree(siblings);
> -
> -	return err;
> -}
> -
> -static int
> -set_engines__bond(struct i915_user_extension __user *base, void *data)
> -{
> -	struct i915_context_engines_bond __user *ext =
> -		container_of_user(base, typeof(*ext), base);
> -	const struct set_engines *set = data;
> -	struct drm_i915_private *i915 = set->ctx->i915;
> -	struct i915_engine_class_instance ci;
> -	struct intel_engine_cs *virtual;
> -	struct intel_engine_cs *master;
> -	u16 idx, num_bonds;
> -	int err, n;
> -
> -	if (get_user(idx, &ext->virtual_index))
> -		return -EFAULT;
> -
> -	if (idx >= set->engines->num_engines) {
> -		drm_dbg(&i915->drm,
> -			"Invalid index for virtual engine: %d >= %d\n",
> -			idx, set->engines->num_engines);
> -		return -EINVAL;
> -	}
> -
> -	idx = array_index_nospec(idx, set->engines->num_engines);
> -	if (!set->engines->engines[idx]) {
> -		drm_dbg(&i915->drm, "Invalid engine at %d\n", idx);
> -		return -EINVAL;
> -	}
> -	virtual = set->engines->engines[idx]->engine;
> -
> -	if (intel_engine_is_virtual(virtual)) {
> -		drm_dbg(&i915->drm,
> -			"Bonding with virtual engines not allowed\n");
> -		return -EINVAL;
> -	}
> -
> -	err = check_user_mbz(&ext->flags);
> -	if (err)
> -		return err;
> -
> -	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
> -		err = check_user_mbz(&ext->mbz64[n]);
> -		if (err)
> -			return err;
> -	}
> -
> -	if (copy_from_user(&ci, &ext->master, sizeof(ci)))
> -		return -EFAULT;
> -
> -	master = intel_engine_lookup_user(i915,
> -					  ci.engine_class, ci.engine_instance);
> -	if (!master) {
> -		drm_dbg(&i915->drm,
> -			"Unrecognised master engine: { class:%u, instance:%u }\n",
> -			ci.engine_class, ci.engine_instance);
> -		return -EINVAL;
> -	}
> -
> -	if (get_user(num_bonds, &ext->num_bonds))
> -		return -EFAULT;
> -
> -	for (n = 0; n < num_bonds; n++) {
> -		struct intel_engine_cs *bond;
> -
> -		if (copy_from_user(&ci, &ext->engines[n], sizeof(ci)))
> -			return -EFAULT;
> -
> -		bond = intel_engine_lookup_user(i915,
> -						ci.engine_class,
> -						ci.engine_instance);
> -		if (!bond) {
> -			drm_dbg(&i915->drm,
> -				"Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n",
> -				n, ci.engine_class, ci.engine_instance);
> -			return -EINVAL;
> -		}
> -	}
> -
> -	return 0;
> -}
> -
> -static const i915_user_extension_fn set_engines__extensions[] = {
> -	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_engines__load_balance,
> -	[I915_CONTEXT_ENGINES_EXT_BOND] = set_engines__bond,
> -};
> -
> -static int
> -set_engines(struct i915_gem_context *ctx,
> -	    const struct drm_i915_gem_context_param *args)
> -{
> -	struct drm_i915_private *i915 = ctx->i915;
> -	struct i915_context_param_engines __user *user =
> -		u64_to_user_ptr(args->value);
> -	struct set_engines set = { .ctx = ctx };
> -	unsigned int num_engines, n;
> -	u64 extensions;
> -	int err;
> -
> -	if (!args->size) { /* switch back to legacy user_ring_map */
> -		if (!i915_gem_context_user_engines(ctx))
> -			return 0;
> -
> -		set.engines = default_engines(ctx);
> -		if (IS_ERR(set.engines))
> -			return PTR_ERR(set.engines);
> -
> -		goto replace;
> -	}
> -
> -	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
> -	if (args->size < sizeof(*user) ||
> -	    !IS_ALIGNED(args->size, sizeof(*user->engines))) {
> -		drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
> -			args->size);
> -		return -EINVAL;
> -	}
> -
> -	num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> -	if (num_engines > I915_EXEC_RING_MASK + 1)
> -		return -EINVAL;
> -
> -	set.engines = alloc_engines(num_engines);
> -	if (!set.engines)
> -		return -ENOMEM;
> -
> -	for (n = 0; n < num_engines; n++) {
> -		struct i915_engine_class_instance ci;
> -		struct intel_engine_cs *engine;
> -		struct intel_context *ce;
> -
> -		if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
> -			__free_engines(set.engines, n);
> -			return -EFAULT;
> -		}
> -
> -		if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
> -		    ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE) {
> -			set.engines->engines[n] = NULL;
> -			continue;
> -		}
> -
> -		engine = intel_engine_lookup_user(ctx->i915,
> -						  ci.engine_class,
> -						  ci.engine_instance);
> -		if (!engine) {
> -			drm_dbg(&i915->drm,
> -				"Invalid engine[%d]: { class:%d, instance:%d }\n",
> -				n, ci.engine_class, ci.engine_instance);
> -			__free_engines(set.engines, n);
> -			return -ENOENT;
> -		}
> -
> -		ce = intel_context_create(engine);
> -		if (IS_ERR(ce)) {
> -			__free_engines(set.engines, n);
> -			return PTR_ERR(ce);
> -		}
> -
> -		intel_context_set_gem(ce, ctx);
> -
> -		set.engines->engines[n] = ce;
> -	}
> -	set.engines->num_engines = num_engines;
> -
> -	err = -EFAULT;
> -	if (!get_user(extensions, &user->extensions))
> -		err = i915_user_extensions(u64_to_user_ptr(extensions),
> -					   set_engines__extensions,
> -					   ARRAY_SIZE(set_engines__extensions),
> -					   &set);
> -	if (err) {
> -		free_engines(set.engines);
> -		return err;
> -	}
> -
> -replace:
> -	mutex_lock(&ctx->engines_mutex);
> -	if (i915_gem_context_is_closed(ctx)) {
> -		mutex_unlock(&ctx->engines_mutex);
> -		free_engines(set.engines);
> -		return -ENOENT;
> -	}
> -	if (args->size)
> -		i915_gem_context_set_user_engines(ctx);
> -	else
> -		i915_gem_context_clear_user_engines(ctx);
> -	set.engines = rcu_replace_pointer(ctx->engines, set.engines, 1);
> -	mutex_unlock(&ctx->engines_mutex);
> -
> -	/* Keep track of old engine sets for kill_context() */
> -	engines_idle_release(ctx, set.engines);
> -
> -	return 0;
> -}
> -
>  static int
>  set_persistence(struct i915_gem_context *ctx,
>  		const struct drm_i915_gem_context_param *args)
> @@ -2101,10 +1804,6 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
>  		ret = set_sseu(ctx, args);
>  		break;
>  
> -	case I915_CONTEXT_PARAM_ENGINES:
> -		ret = set_engines(ctx, args);
> -		break;
> -
>  	case I915_CONTEXT_PARAM_PERSISTENCE:
>  		ret = set_persistence(ctx, args);
>  		break;
> -- 
> 2.31.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
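[Editor's note: the deleted code above guards every user-supplied engine
index with array_index_nospec() after the architectural bounds check, so a
mispredicted branch cannot speculatively index past the engine array. Below
is a rough userspace sketch of that pattern, using a simplified version of
the kernel's generic mask helper; it assumes arithmetic right shift of
signed values, as the kernel's generic fallback does on gcc/clang.]

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(long) * 8)

/*
 * Simplified version of the kernel's generic array_index_mask_nospec():
 * returns ~0UL when index < size and 0 otherwise, without a branch, so
 * the clamp below holds even under speculative execution.
 */
static unsigned long mask_nospec(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
}

/* The shape used by the deleted set_engines code: bounds check, then clamp. */
static int lookup_engine(const int *engines, size_t num, size_t idx)
{
	if (idx >= num)
		return -1;			/* architectural bounds check */
	idx &= mask_nospec(idx, num);		/* speculation-safe clamp */
	return engines[idx];
}
```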

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [PATCH 21/21] drm/i915/gem: Roll all of context creation together
  2021-04-23 22:31   ` [Intel-gfx] " Jason Ekstrand
@ 2021-04-29 17:25     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 17:25 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Fri, Apr 23, 2021 at 05:31:31PM -0500, Jason Ekstrand wrote:
> Now that we have the whole engine set and VM at context creation time,
> we can just assign those fields instead of creating first and handling
> the VM and engines later.  This lets us avoid creating useless VMs and
> engine sets and lets us get rid of the complex VM setting code.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

I think, for OCD reasons, it would be nice to split this up into the engine
and VM cleanups and sort them in with the corresponding prep patch.

It does all look pretty reasonable though. I'll review the details in
these later patches in the series once we've agreed on the big picture
shuffling.
-Daniel
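[Editor's note: the restructured i915_gem_create_context() in the quoted
patch follows the standard kernel goto-unwind idiom — acquire each resource
in order and, on failure, jump to a label that releases everything acquired
so far, in reverse order. A minimal, self-contained illustration of the
idiom; gadget_create() and its fields are hypothetical, not the i915 code.]

```c
#include <assert.h>
#include <stdlib.h>

struct gadget {
	int *vm;
	int *engines;
};

/*
 * Acquire resources in order; on failure, jump to the label that frees
 * everything acquired so far, in reverse order of acquisition.
 */
static struct gadget *gadget_create(void)
{
	struct gadget *g = calloc(1, sizeof(*g));

	if (!g)
		return NULL;

	g->vm = calloc(16, sizeof(*g->vm));
	if (!g->vm)
		goto err_free;		/* only the gadget itself to undo */

	g->engines = calloc(16, sizeof(*g->engines));
	if (!g->engines)
		goto err_vm;		/* undo the vm, then the gadget */

	return g;

err_vm:
	free(g->vm);
err_free:
	free(g);
	return NULL;
}
```

The labels read bottom-up as the reverse of the allocation order, which is
what makes the err_engines/err_vm unwinding in the patch easy to audit.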

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 159 ++++++------------
>  .../gpu/drm/i915/gem/selftests/mock_context.c |  33 ++--
>  2 files changed, 64 insertions(+), 128 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index ef23ab4260c24..829730d402e8a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1201,56 +1201,6 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
>  	return 0;
>  }
>  
> -static struct i915_gem_context *
> -__create_context(struct drm_i915_private *i915,
> -		 const struct i915_gem_proto_context *pc)
> -{
> -	struct i915_gem_context *ctx;
> -	struct i915_gem_engines *e;
> -	int err;
> -	int i;
> -
> -	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> -	if (!ctx)
> -		return ERR_PTR(-ENOMEM);
> -
> -	kref_init(&ctx->ref);
> -	ctx->i915 = i915;
> -	ctx->sched = pc->sched;
> -	mutex_init(&ctx->mutex);
> -	INIT_LIST_HEAD(&ctx->link);
> -
> -	spin_lock_init(&ctx->stale.lock);
> -	INIT_LIST_HEAD(&ctx->stale.engines);
> -
> -	mutex_init(&ctx->engines_mutex);
> -	e = default_engines(ctx);
> -	if (IS_ERR(e)) {
> -		err = PTR_ERR(e);
> -		goto err_free;
> -	}
> -	RCU_INIT_POINTER(ctx->engines, e);
> -
> -	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
> -	mutex_init(&ctx->lut_mutex);
> -
> -	/* NB: Mark all slices as needing a remap so that when the context first
> -	 * loads it will restore whatever remap state already exists. If there
> -	 * is no remap info, it will be a NOP. */
> -	ctx->remap_slice = ALL_L3_SLICES(i915);
> -
> -	ctx->user_flags = pc->user_flags;
> -
> -	for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
> -		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
> -
> -	return ctx;
> -
> -err_free:
> -	kfree(ctx);
> -	return ERR_PTR(err);
> -}
> -
>  static inline struct i915_gem_engines *
>  __context_engines_await(const struct i915_gem_context *ctx,
>  			bool *user_engines)
> @@ -1294,86 +1244,77 @@ context_apply_all(struct i915_gem_context *ctx,
>  	i915_sw_fence_complete(&e->fence);
>  }
>  
> -static void __apply_ppgtt(struct intel_context *ce, void *vm)
> -{
> -	i915_vm_put(ce->vm);
> -	ce->vm = i915_vm_get(vm);
> -}
> -
> -static struct i915_address_space *
> -__set_ppgtt(struct i915_gem_context *ctx, struct i915_address_space *vm)
> -{
> -	struct i915_address_space *old;
> -
> -	old = rcu_replace_pointer(ctx->vm,
> -				  i915_vm_open(vm),
> -				  lockdep_is_held(&ctx->mutex));
> -	GEM_BUG_ON(old && i915_vm_is_4lvl(vm) != i915_vm_is_4lvl(old));
> -
> -	context_apply_all(ctx, __apply_ppgtt, vm);
> -
> -	return old;
> -}
> -
> -static void __assign_ppgtt(struct i915_gem_context *ctx,
> -			   struct i915_address_space *vm)
> -{
> -	if (vm == rcu_access_pointer(ctx->vm))
> -		return;
> -
> -	vm = __set_ppgtt(ctx, vm);
> -	if (vm)
> -		i915_vm_close(vm);
> -}
> -
>  static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *i915,
>  			const struct i915_gem_proto_context *pc)
>  {
>  	struct i915_gem_context *ctx;
> -	int ret;
> +	struct i915_gem_engines *e;
> +	int err;
> +	int i;
>  
> -	ctx = __create_context(i915, pc);
> -	if (IS_ERR(ctx))
> -		return ctx;
> +	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> +	if (!ctx)
> +		return ERR_PTR(-ENOMEM);
>  
> -	if (pc->vm) {
> -		mutex_lock(&ctx->mutex);
> -		__assign_ppgtt(ctx, pc->vm);
> -		mutex_unlock(&ctx->mutex);
> -	}
> +	kref_init(&ctx->ref);
> +	ctx->i915 = i915;
> +	ctx->sched = pc->sched;
> +	mutex_init(&ctx->mutex);
> +	INIT_LIST_HEAD(&ctx->link);
>  
> -	if (pc->num_user_engines >= 0) {
> -		struct i915_gem_engines *engines;
> +	spin_lock_init(&ctx->stale.lock);
> +	INIT_LIST_HEAD(&ctx->stale.engines);
>  
> -		engines = user_engines(ctx, pc->num_user_engines,
> -				       pc->user_engines);
> -		if (IS_ERR(engines)) {
> -			context_close(ctx);
> -			return ERR_CAST(engines);
> -		}
> +	if (pc->vm)
> +		RCU_INIT_POINTER(ctx->vm, i915_vm_open(pc->vm));
>  
> -		mutex_lock(&ctx->engines_mutex);
> +	mutex_init(&ctx->engines_mutex);
> +	if (pc->num_user_engines >= 0) {
>  		i915_gem_context_set_user_engines(ctx);
> -		engines = rcu_replace_pointer(ctx->engines, engines, 1);
> -		mutex_unlock(&ctx->engines_mutex);
> -
> -		free_engines(engines);
> +		e = user_engines(ctx, pc->num_user_engines, pc->user_engines);
> +	} else {
> +		i915_gem_context_clear_user_engines(ctx);
> +		e = default_engines(ctx);
> +	}
> +	if (IS_ERR(e)) {
> +		err = PTR_ERR(e);
> +		goto err_vm;
>  	}
> +	RCU_INIT_POINTER(ctx->engines, e);
> +
> +	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
> +	mutex_init(&ctx->lut_mutex);
> +
> +	/* NB: Mark all slices as needing a remap so that when the context first
> +	 * loads it will restore whatever remap state already exists. If there
> +	 * is no remap info, it will be a NOP. */
> +	ctx->remap_slice = ALL_L3_SLICES(i915);
> +
> +	ctx->user_flags = pc->user_flags;
> +
> +	for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
> +		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
>  
>  	if (pc->single_timeline) {
> -		ret = drm_syncobj_create(&ctx->syncobj,
> +		err = drm_syncobj_create(&ctx->syncobj,
>  					 DRM_SYNCOBJ_CREATE_SIGNALED,
>  					 NULL);
> -		if (ret) {
> -			context_close(ctx);
> -			return ERR_PTR(ret);
> -		}
> +		if (err)
> +			goto err_engines;
>  	}
>  
>  	trace_i915_context_create(ctx);
>  
>  	return ctx;
> +
> +err_engines:
> +	free_engines(e);
> +err_vm:
> +	if (ctx->vm)
> +		i915_vm_close(ctx->vm);
> +	kfree(ctx);
> +	return ERR_PTR(err);
>  }
>  
>  static void init_contexts(struct i915_gem_contexts *gc)
> diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> index e4aced7eabb72..5ee7e9bb6175d 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> @@ -30,15 +30,6 @@ mock_context(struct drm_i915_private *i915,
>  
>  	i915_gem_context_set_persistence(ctx);
>  
> -	mutex_init(&ctx->engines_mutex);
> -	e = default_engines(ctx);
> -	if (IS_ERR(e))
> -		goto err_free;
> -	RCU_INIT_POINTER(ctx->engines, e);
> -
> -	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
> -	mutex_init(&ctx->lut_mutex);
> -
>  	if (name) {
>  		struct i915_ppgtt *ppgtt;
>  
> @@ -46,25 +37,29 @@ mock_context(struct drm_i915_private *i915,
>  
>  		ppgtt = mock_ppgtt(i915, name);
>  		if (!ppgtt)
> -			goto err_put;
> -
> -		mutex_lock(&ctx->mutex);
> -		__set_ppgtt(ctx, &ppgtt->vm);
> -		mutex_unlock(&ctx->mutex);
> +			goto err_free;
>  
> +		ctx->vm = i915_vm_open(&ppgtt->vm);
>  		i915_vm_put(&ppgtt->vm);
>  	}
>  
> +	mutex_init(&ctx->engines_mutex);
> +	e = default_engines(ctx);
> +	if (IS_ERR(e))
> +		goto err_vm;
> +	RCU_INIT_POINTER(ctx->engines, e);
> +
> +	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
> +	mutex_init(&ctx->lut_mutex);
> +
>  	return ctx;
>  
> +err_vm:
> +	if (ctx->vm)
> +		i915_vm_close(ctx->vm);
>  err_free:
>  	kfree(ctx);
>  	return NULL;
> -
> -err_put:
> -	i915_gem_context_set_closed(ctx);
> -	i915_gem_context_put(ctx);
> -	return NULL;
>  }
>  
>  void mock_context_close(struct i915_gem_context *ctx)
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 226+ messages in thread

> +		i915_vm_close(ctx->vm);
>  err_free:
>  	kfree(ctx);
>  	return NULL;
> -
> -err_put:
> -	i915_gem_context_set_closed(ctx);
> -	i915_gem_context_put(ctx);
> -	return NULL;
>  }
>  
>  void mock_context_close(struct i915_gem_context *ctx)
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-29 15:51     ` Daniel Vetter
@ 2021-04-29 18:16       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 18:16 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> Yeah this needs some text to explain what/why you're doing this, and maybe
> some rough sketch of the locking design.

Yup.  Will add.

>
> On Fri, Apr 23, 2021 at 05:31:26PM -0500, Jason Ekstrand wrote:
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 657 ++++++++++++++++--
> >  drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
> >  .../gpu/drm/i915/gem/i915_gem_context_types.h |  26 +
> >  .../gpu/drm/i915/gem/selftests/mock_context.c |   5 +-
> >  drivers/gpu/drm/i915/i915_drv.h               |  17 +-
> >  5 files changed, 648 insertions(+), 60 deletions(-)
>
> So I think the patch split here is a bit unfortunate, because you're
> adding the new vm/engine validation code for proto context here, but the
> old stuff is only removed in the next patches that make vm/engines
> immutable after first use.

Yes, it's very unfortunate.  I'm reworking things now to have a
different split which I think makes more sense but actually separates
the add from the remove even further. :-(

> I think a better split would be if this patch here only has all the
> scaffolding. You already have the EOPNOTSUPP fallback (which I hope gets
> removed), so moving the conversion entirely to later patches should be all
> fine.
>
> Or do I miss something?
>
> I think the only concern I'm seeing is that bisectability might be a bit
> lost, because we finalize the context in some cases in setparam. And if we
> do the conversion in a different order than the one media uses for its
> setparam, then later setparam might fail because the context is finalized
> already. But also
> - it's just bisectability of media functionality I think
> - just check which order media calls CTX_SETPARAM and use that to do the
>   conversion
>
> And we should be fine ... I think?

Before we go down that path, let's see what you think of my new ordering.

> Some more thoughts below, but the proto ctx stuff itself looks fine.
>
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index db9153e0f85a7..aa8e61211924f 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915,
> >
> >  static void proto_context_close(struct i915_gem_proto_context *pc)
> >  {
> > +     int i;
> > +
> >       if (pc->vm)
> >               i915_vm_put(pc->vm);
> > +     if (pc->user_engines) {
> > +             for (i = 0; i < pc->num_user_engines; i++)
> > +                     kfree(pc->user_engines[i].siblings);
> > +             kfree(pc->user_engines);
> > +     }
> >       kfree(pc);
> >  }
> >
> > @@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
> >       proto_context_set_persistence(i915, pc, true);
> >       pc->sched.priority = I915_PRIORITY_NORMAL;
> >
> > +     pc->num_user_engines = -1;
> > +     pc->user_engines = NULL;
> > +
> >       if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
> >               pc->single_timeline = true;
> >
> >       return pc;
> >  }
> >
> > +static int proto_context_register_locked(struct drm_i915_file_private *fpriv,
> > +                                      struct i915_gem_proto_context *pc,
> > +                                      u32 *id)
> > +{
> > +     int ret;
> > +     void *old;
>
> assert_lock_held just for consistency.

Done.

> > +
> > +     ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL);
> > +     if (ret)
> > +             return ret;
> > +
> > +     old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL);
> > +     if (xa_is_err(old)) {
> > +             xa_erase(&fpriv->context_xa, *id);
> > +             return xa_err(old);
> > +     }
> > +     GEM_BUG_ON(old);
> > +
> > +     return 0;
> > +}
> > +
> > +static int proto_context_register(struct drm_i915_file_private *fpriv,
> > +                               struct i915_gem_proto_context *pc,
> > +                               u32 *id)
> > +{
> > +     int ret;
> > +
> > +     mutex_lock(&fpriv->proto_context_lock);
> > +     ret = proto_context_register_locked(fpriv, pc, id);
> > +     mutex_unlock(&fpriv->proto_context_lock);
> > +
> > +     return ret;
> > +}
> > +
> > +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
> > +                         struct i915_gem_proto_context *pc,
> > +                         const struct drm_i915_gem_context_param *args)
> > +{
> > +     struct i915_address_space *vm;
> > +
> > +     if (args->size)
> > +             return -EINVAL;
> > +
> > +     if (!pc->vm)
> > +             return -ENODEV;
> > +
> > +     if (upper_32_bits(args->value))
> > +             return -ENOENT;
> > +
> > +     rcu_read_lock();
> > +     vm = xa_load(&fpriv->vm_xa, args->value);
> > +     if (vm && !kref_get_unless_zero(&vm->ref))
> > +             vm = NULL;
> > +     rcu_read_unlock();
> > +     if (!vm)
> > +             return -ENOENT;
> > +
> > +     i915_vm_put(pc->vm);
> > +     pc->vm = vm;
> > +
> > +     return 0;
> > +}
> > +
> > +struct set_proto_ctx_engines {
> > +     struct drm_i915_private *i915;
> > +     unsigned num_engines;
> > +     struct i915_gem_proto_engine *engines;
> > +};
> > +
> > +static int
> > +set_proto_ctx_engines_balance(struct i915_user_extension __user *base,
> > +                           void *data)
> > +{
> > +     struct i915_context_engines_load_balance __user *ext =
> > +             container_of_user(base, typeof(*ext), base);
> > +     const struct set_proto_ctx_engines *set = data;
> > +     struct drm_i915_private *i915 = set->i915;
> > +     struct intel_engine_cs **siblings;
> > +     u16 num_siblings, idx;
> > +     unsigned int n;
> > +     int err;
> > +
> > +     if (!HAS_EXECLISTS(i915))
> > +             return -ENODEV;
> > +
> > +     if (intel_uc_uses_guc_submission(&i915->gt.uc))
> > +             return -ENODEV; /* not implemented yet */
> > +
> > +     if (get_user(idx, &ext->engine_index))
> > +             return -EFAULT;
> > +
> > +     if (idx >= set->num_engines) {
> > +             drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
> > +                     idx, set->num_engines);
> > +             return -EINVAL;
> > +     }
> > +
> > +     idx = array_index_nospec(idx, set->num_engines);
> > +     if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) {
> > +             drm_dbg(&i915->drm,
> > +                     "Invalid placement[%d], already occupied\n", idx);
> > +             return -EEXIST;
> > +     }
> > +
> > +     if (get_user(num_siblings, &ext->num_siblings))
> > +             return -EFAULT;
> > +
> > +     err = check_user_mbz(&ext->flags);
> > +     if (err)
> > +             return err;
> > +
> > +     err = check_user_mbz(&ext->mbz64);
> > +     if (err)
> > +             return err;
> > +
> > +     if (num_siblings == 0)
> > +             return 0;
> > +
> > +     siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL);
> > +     if (!siblings)
> > +             return -ENOMEM;
> > +
> > +     for (n = 0; n < num_siblings; n++) {
> > +             struct i915_engine_class_instance ci;
> > +
> > +             if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
> > +                     err = -EFAULT;
> > +                     goto err_siblings;
> > +             }
> > +
> > +             siblings[n] = intel_engine_lookup_user(i915,
> > +                                                    ci.engine_class,
> > +                                                    ci.engine_instance);
> > +             if (!siblings[n]) {
> > +                     drm_dbg(&i915->drm,
> > +                             "Invalid sibling[%d]: { class:%d, inst:%d }\n",
> > +                             n, ci.engine_class, ci.engine_instance);
> > +                     err = -EINVAL;
> > +                     goto err_siblings;
> > +             }
> > +     }
> > +
> > +     if (num_siblings == 1) {
> > +             set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> > +             set->engines[idx].engine = siblings[0];
> > +             kfree(siblings);
> > +     } else {
> > +             set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED;
> > +             set->engines[idx].num_siblings = num_siblings;
> > +             set->engines[idx].siblings = siblings;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_siblings:
> > +     kfree(siblings);
> > +
> > +     return err;
> > +}
> > +
> > +static int
> > +set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data)
> > +{
> > +     struct i915_context_engines_bond __user *ext =
> > +             container_of_user(base, typeof(*ext), base);
> > +     const struct set_proto_ctx_engines *set = data;
> > +     struct drm_i915_private *i915 = set->i915;
> > +     struct i915_engine_class_instance ci;
> > +     struct intel_engine_cs *master;
> > +     u16 idx, num_bonds;
> > +     int err, n;
> > +
> > +     if (get_user(idx, &ext->virtual_index))
> > +             return -EFAULT;
> > +
> > +     if (idx >= set->num_engines) {
> > +             drm_dbg(&i915->drm,
> > +                     "Invalid index for virtual engine: %d >= %d\n",
> > +                     idx, set->num_engines);
> > +             return -EINVAL;
> > +     }
> > +
> > +     idx = array_index_nospec(idx, set->num_engines);
> > +     if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) {
> > +             drm_dbg(&i915->drm, "Invalid engine at %d\n", idx);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) {
> > +             drm_dbg(&i915->drm,
> > +                     "Bonding with virtual engines not allowed\n");
> > +             return -EINVAL;
> > +     }
> > +
> > +     err = check_user_mbz(&ext->flags);
> > +     if (err)
> > +             return err;
> > +
> > +     for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
> > +             err = check_user_mbz(&ext->mbz64[n]);
> > +             if (err)
> > +                     return err;
> > +     }
> > +
> > +     if (copy_from_user(&ci, &ext->master, sizeof(ci)))
> > +             return -EFAULT;
> > +
> > +     master = intel_engine_lookup_user(i915,
> > +                                       ci.engine_class,
> > +                                       ci.engine_instance);
> > +     if (!master) {
> > +             drm_dbg(&i915->drm,
> > +                     "Unrecognised master engine: { class:%u, instance:%u }\n",
> > +                     ci.engine_class, ci.engine_instance);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (get_user(num_bonds, &ext->num_bonds))
> > +             return -EFAULT;
> > +
> > +     for (n = 0; n < num_bonds; n++) {
> > +             struct intel_engine_cs *bond;
> > +
> > +             if (copy_from_user(&ci, &ext->engines[n], sizeof(ci)))
> > +                     return -EFAULT;
> > +
> > +             bond = intel_engine_lookup_user(i915,
> > +                                             ci.engine_class,
> > +                                             ci.engine_instance);
> > +             if (!bond) {
> > +                     drm_dbg(&i915->drm,
> > +                             "Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n",
> > +                             n, ci.engine_class, ci.engine_instance);
> > +                     return -EINVAL;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = {
> > +     [I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance,
> > +     [I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond,
> > +};
> > +
> > +static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv,
> > +                              struct i915_gem_proto_context *pc,
> > +                              const struct drm_i915_gem_context_param *args)
> > +{
> > +     struct drm_i915_private *i915 = fpriv->dev_priv;
> > +     struct set_proto_ctx_engines set = { .i915 = i915 };
> > +     struct i915_context_param_engines __user *user =
> > +             u64_to_user_ptr(args->value);
> > +     unsigned int n;
> > +     u64 extensions;
> > +     int err;
> > +
> > +     if (!args->size) {
> > +             kfree(pc->user_engines);
> > +             pc->num_user_engines = -1;
> > +             pc->user_engines = NULL;
> > +             return 0;
> > +     }
> > +
> > +     BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
> > +     if (args->size < sizeof(*user) ||
> > +         !IS_ALIGNED(args->size, sizeof(*user->engines))) {
> > +             drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
> > +                     args->size);
> > +             return -EINVAL;
> > +     }
> > +
> > +     set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> > +     if (set.num_engines > I915_EXEC_RING_MASK + 1)
> > +             return -EINVAL;
> > +
> > +     set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL);
> > +     if (!set.engines)
> > +             return -ENOMEM;
> > +
> > +     for (n = 0; n < set.num_engines; n++) {
> > +             struct i915_engine_class_instance ci;
> > +             struct intel_engine_cs *engine;
> > +
> > +             if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
> > +                     kfree(set.engines);
> > +                     return -EFAULT;
> > +             }
> > +
> > +             memset(&set.engines[n], 0, sizeof(set.engines[n]));
> > +
> > +             if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
> > +                 ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE)
> > +                     continue;
> > +
> > +             engine = intel_engine_lookup_user(i915,
> > +                                               ci.engine_class,
> > +                                               ci.engine_instance);
> > +             if (!engine) {
> > +                     drm_dbg(&i915->drm,
> > +                             "Invalid engine[%d]: { class:%d, instance:%d }\n",
> > +                             n, ci.engine_class, ci.engine_instance);
> > +                     kfree(set.engines);
> > +                     return -ENOENT;
> > +             }
> > +
> > +             set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> > +             set.engines[n].engine = engine;
> > +     }
> > +
> > +     err = -EFAULT;
> > +     if (!get_user(extensions, &user->extensions))
> > +             err = i915_user_extensions(u64_to_user_ptr(extensions),
> > +                                        set_proto_ctx_engines_extensions,
> > +                                        ARRAY_SIZE(set_proto_ctx_engines_extensions),
> > +                                        &set);
> > +     if (err) {
> > +             kfree(set.engines);
> > +             return err;
> > +     }
> > +
> > +     kfree(pc->user_engines);
> > +     pc->num_user_engines = set.num_engines;
> > +     pc->user_engines = set.engines;
> > +
> > +     return 0;
> > +}
> > +
> > +static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
> > +                            struct i915_gem_proto_context *pc,
> > +                            struct drm_i915_gem_context_param *args)
> > +{
> > +     int ret = 0;
> > +
> > +     switch (args->param) {
> > +     case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
>
> Atomic bitops like in previous patches: Pls no :-)

Yup.  Fixed.

> > +             else
> > +                     clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_BANNABLE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (!capable(CAP_SYS_ADMIN) && !args->value)
> > +                     ret = -EPERM;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> > +             else
> > +                     clear_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_RECOVERABLE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> > +             else
> > +                     clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_PRIORITY:
> > +             ret = validate_priority(fpriv->dev_priv, args);
> > +             if (!ret)
> > +                     pc->sched.priority = args->value;
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_SSEU:
> > +             ret = -ENOTSUPP;
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_VM:
> > +             ret = set_proto_ctx_vm(fpriv, pc, args);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_ENGINES:
> > +             ret = set_proto_ctx_engines(fpriv, pc, args);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_PERSISTENCE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> > +             else
> > +                     clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_NO_ZEROMAP:
> > +     case I915_CONTEXT_PARAM_BAN_PERIOD:
> > +     case I915_CONTEXT_PARAM_RINGSIZE:
> > +     default:
> > +             ret = -EINVAL;
> > +             break;
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> >  static struct i915_address_space *
> >  context_get_vm_rcu(struct i915_gem_context *ctx)
> >  {
> > @@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
> >       return e;
> >  }
> >
> > +static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
> > +                                          unsigned int num_engines,
> > +                                          struct i915_gem_proto_engine *pe)
> > +{
> > +     struct i915_gem_engines *e;
> > +     unsigned int n;
> > +
> > +     e = alloc_engines(num_engines);
> > +     for (n = 0; n < num_engines; n++) {
> > +             struct intel_context *ce;
> > +
> > +             switch (pe[n].type) {
> > +             case I915_GEM_ENGINE_TYPE_PHYSICAL:
> > +                     ce = intel_context_create(pe[n].engine);
> > +                     break;
> > +
> > +             case I915_GEM_ENGINE_TYPE_BALANCED:
> > +                     ce = intel_execlists_create_virtual(pe[n].siblings,
> > +                                                         pe[n].num_siblings);
> > +                     break;
> > +
> > +             case I915_GEM_ENGINE_TYPE_INVALID:
> > +             default:
> > +                     GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID);
> > +                     continue;
> > +             }
> > +
> > +             if (IS_ERR(ce)) {
> > +                     __free_engines(e, n);
> > +                     return ERR_CAST(ce);
> > +             }
> > +
> > +             intel_context_set_gem(ce, ctx);
> > +
> > +             e->engines[n] = ce;
> > +     }
> > +     e->num_engines = num_engines;
> > +
> > +     return e;
> > +}
> > +
> >  void i915_gem_context_release(struct kref *ref)
> >  {
> >       struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
> > @@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915,
> >               mutex_unlock(&ctx->mutex);
> >       }
> >
> > +     if (pc->num_user_engines >= 0) {
> > +             struct i915_gem_engines *engines;
> > +
> > +             engines = user_engines(ctx, pc->num_user_engines,
> > +                                    pc->user_engines);
> > +             if (IS_ERR(engines)) {
> > +                     context_close(ctx);
> > +                     return ERR_CAST(engines);
> > +             }
> > +
> > +             mutex_lock(&ctx->engines_mutex);
> > +             i915_gem_context_set_user_engines(ctx);
> > +             engines = rcu_replace_pointer(ctx->engines, engines, 1);
> > +             mutex_unlock(&ctx->engines_mutex);
> > +
> > +             free_engines(engines);
> > +     }
> > +
> >       if (pc->single_timeline) {
> >               ret = drm_syncobj_create(&ctx->syncobj,
> >                                        DRM_SYNCOBJ_CREATE_SIGNALED,
> > @@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915)
> >       init_contexts(&i915->gem.contexts);
> >  }
> >
> > -static int gem_context_register(struct i915_gem_context *ctx,
> > -                             struct drm_i915_file_private *fpriv,
> > -                             u32 *id)
> > +static void gem_context_register(struct i915_gem_context *ctx,
> > +                              struct drm_i915_file_private *fpriv,
> > +                              u32 id)
> >  {
> >       struct drm_i915_private *i915 = ctx->i915;
> > -     int ret;
> > +     void *old;
> >
> >       ctx->file_priv = fpriv;
> >
> > @@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx,
> >                current->comm, pid_nr(ctx->pid));
> >
> >       /* And finally expose ourselves to userspace via the idr */
> > -     ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
> > -     if (ret)
> > -             goto err_pid;
> > +     old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
> > +     GEM_BUG_ON(old);
> >
> >       spin_lock(&i915->gem.contexts.lock);
> >       list_add_tail(&ctx->link, &i915->gem.contexts.list);
> >       spin_unlock(&i915->gem.contexts.lock);
> > -
> > -     return 0;
> > -
> > -err_pid:
> > -     put_pid(fetch_and_zero(&ctx->pid));
> > -     return ret;
> >  }
> >
> >  int i915_gem_context_open(struct drm_i915_private *i915,
> > @@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >       struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> > -     u32 id;
> >
> > -     xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC);
> > +     mutex_init(&file_priv->proto_context_lock);
> > +     xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC);
> > +
> > +     /* 0 reserved for the default context */
> > +     xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1);
> >
> >       /* 0 reserved for invalid/unassigned ppgtt */
> >       xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
> > @@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >               goto err;
> >       }
> >
> > -     err = gem_context_register(ctx, file_priv, &id);
> > -     if (err < 0)
> > -             goto err_ctx;
> > +     gem_context_register(ctx, file_priv, 0);
> >
> > -     GEM_BUG_ON(id);
> >       return 0;
> >
> > -err_ctx:
> > -     context_close(ctx);
> >  err:
> >       xa_destroy(&file_priv->vm_xa);
> >       xa_destroy(&file_priv->context_xa);
> > +     xa_destroy(&file_priv->proto_context_xa);
> > +     mutex_destroy(&file_priv->proto_context_lock);
> >       return err;
> >  }
> >
> >  void i915_gem_context_close(struct drm_file *file)
> >  {
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_address_space *vm;
> >       struct i915_gem_context *ctx;
> >       unsigned long idx;
> >
> > +     xa_for_each(&file_priv->proto_context_xa, idx, pc)
> > +             proto_context_close(pc);
> > +     xa_destroy(&file_priv->proto_context_xa);
> > +     mutex_destroy(&file_priv->proto_context_lock);
> > +
> >       xa_for_each(&file_priv->context_xa, idx, ctx)
> >               context_close(ctx);
> >       xa_destroy(&file_priv->context_xa);
> > @@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
> >  }
> >
> >  struct create_ext {
> > -     struct i915_gem_context *ctx;
> > +     struct i915_gem_proto_context *pc;
> >       struct drm_i915_file_private *fpriv;
> >  };
> >
> > @@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
> >       if (local.param.ctx_id)
> >               return -EINVAL;
> >
> > -     return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
> > +     return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param);
> >  }
> >
> >  static int invalid_ext(struct i915_user_extension __user *ext, void *data)
> > @@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
> >       return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
> >  }
> >
> > +static inline struct i915_gem_context *
> > +__context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > +{
> > +     struct i915_gem_context *ctx;
> > +
> > +     rcu_read_lock();
> > +     ctx = xa_load(&file_priv->context_xa, id);
> > +     if (ctx && !kref_get_unless_zero(&ctx->ref))
> > +             ctx = NULL;
> > +     rcu_read_unlock();
> > +
> > +     return ctx;
> > +}
> > +
> > +struct i915_gem_context *
> > +lazy_create_context_locked(struct drm_i915_file_private *file_priv,
> > +                        struct i915_gem_proto_context *pc, u32 id)
> > +{
> > +     struct i915_gem_context *ctx;
> > +     void *old;
>
> assert_lock_held is always nice in all _locked functions. It entirely
> compiles out without CONFIG_PROVE_LOCKING enabled.

Done.

> > +
> > +     ctx = i915_gem_create_context(file_priv->dev_priv, pc);
>
> I think we need a prep patch which changes the calling convention of this
> and anything it calls to only return a NULL pointer. Then
> i915_gem_context_lookup below can return the ERR_PTR(-ENOMEM) for
> that case, and we know that we're never returning a wrong error pointer.
>
> > +     if (IS_ERR(ctx))
> > +             return ctx;
> > +
> > +     gem_context_register(ctx, file_priv, id);
> > +
> > +     old = xa_erase(&file_priv->proto_context_xa, id);
> > +     GEM_BUG_ON(old != pc);
> > +     proto_context_close(pc);
> > +
> > +     /* One for the xarray and one for the caller */
> > +     return i915_gem_context_get(ctx);
> > +}
> > +
> > +struct i915_gem_context *
> > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > +{
> > +     struct i915_gem_proto_context *pc;
> > +     struct i915_gem_context *ctx;
> > +
> > +     ctx = __context_lookup(file_priv, id);
> > +     if (ctx)
> > +             return ctx;
> > +
> > +     mutex_lock(&file_priv->proto_context_lock);
> > +     /* Try one more time under the lock */
> > +     ctx = __context_lookup(file_priv, id);
> > +     if (!ctx) {
> > +             pc = xa_load(&file_priv->proto_context_xa, id);
> > +             if (!pc)
> > +                     ctx = ERR_PTR(-ENOENT);
> > +             else
> > +                     ctx = lazy_create_context_locked(file_priv, pc, id);
> > +     }
> > +     mutex_unlock(&file_priv->proto_context_lock);
> > +
> > +     return ctx;
> > +}
> > +
> >  int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >                                 struct drm_file *file)
> >  {
> >       struct drm_i915_private *i915 = to_i915(dev);
> >       struct drm_i915_gem_context_create_ext *args = data;
> > -     struct i915_gem_proto_context *pc;
> >       struct create_ext ext_data;
> >       int ret;
> >       u32 id;
> > @@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >               return -EIO;
> >       }
> >
> > -     pc = proto_context_create(i915, args->flags);
> > -     if (IS_ERR(pc))
> > -             return PTR_ERR(pc);
> > -
> > -     ext_data.ctx = i915_gem_create_context(i915, pc);
> > -     proto_context_close(pc);
> > -     if (IS_ERR(ext_data.ctx))
> > -             return PTR_ERR(ext_data.ctx);
> > +     ext_data.pc = proto_context_create(i915, args->flags);
> > +     if (IS_ERR(ext_data.pc))
> > +             return PTR_ERR(ext_data.pc);
> >
> >       if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) {
> >               ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
> > @@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >                                          ARRAY_SIZE(create_extensions),
> >                                          &ext_data);
> >               if (ret)
> > -                     goto err_ctx;
> > +                     goto err_pc;
> >       }
> >
> > -     ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id);
> > +     ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id);
> >       if (ret < 0)
> > -             goto err_ctx;
> > +             goto err_pc;
> >
> >       args->ctx_id = id;
> >       drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id);
> >
> >       return 0;
> >
> > -err_ctx:
> > -     context_close(ext_data.ctx);
> > +err_pc:
> > +     proto_context_close(ext_data.pc);
> >       return ret;
> >  }
> >
> > @@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >  {
> >       struct drm_i915_gem_context_destroy *args = data;
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >
> >       if (args->pad != 0)
> > @@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >       if (!args->ctx_id)
> >               return -ENOENT;
> >
> > +     mutex_lock(&file_priv->proto_context_lock);
> >       ctx = xa_erase(&file_priv->context_xa, args->ctx_id);
> > -     if (!ctx)
> > +     pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id);
> > +     mutex_unlock(&file_priv->proto_context_lock);
> > +
> > +     if (!ctx && !pc)
> >               return -ENOENT;
> > +     GEM_WARN_ON(ctx && pc);
> > +
> > +     if (pc)
> > +             proto_context_close(pc);
> > +
> > +     if (ctx)
> > +             context_close(ctx);
> >
> > -     context_close(ctx);
> >       return 0;
> >  }
> >
> > @@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> >  {
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> >       struct drm_i915_gem_context_param *args = data;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> > -     int ret;
> > +     int ret = 0;
> >
> > -     ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> > -     if (IS_ERR(ctx))
> > -             return PTR_ERR(ctx);
> > +     ctx = __context_lookup(file_priv, args->ctx_id);
> > +     if (ctx)
> > +             goto set_ctx_param;
> >
> > -     ret = ctx_setparam(file_priv, ctx, args);
> > +     mutex_lock(&file_priv->proto_context_lock);
> > +     ctx = __context_lookup(file_priv, args->ctx_id);
> > +     if (ctx)
> > +             goto unlock;
> > +
> > +     pc = xa_load(&file_priv->proto_context_xa, args->ctx_id);
> > +     if (!pc) {
> > +             ret = -ENOENT;
> > +             goto unlock;
> > +     }
> > +
> > +     ret = set_proto_ctx_param(file_priv, pc, args);
>
> I think we should have a FIXME here of not allowing this on some future
> platforms because just use CTX_CREATE_EXT.

Done.

> > +     if (ret == -ENOTSUPP) {
> > +             /* Some params, specifically SSEU, can only be set on fully
>
> I think this needs a FIXME: that this only holds during the conversion?
> Otherwise we kinda have a bit of a problem methinks ...

I'm not sure what you mean by that.

> > +              * created contexts.
> > +              */
> > +             ret = 0;
> > +             ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id);
> > +             if (IS_ERR(ctx)) {
> > +                     ret = PTR_ERR(ctx);
> > +                     ctx = NULL;
> > +             }
> > +     }
> > +
> > +unlock:
> > +     mutex_unlock(&file_priv->proto_context_lock);
> > +
> > +set_ctx_param:
> > +     if (!ret && ctx)
> > +             ret = ctx_setparam(file_priv, ctx, args);
> > +
> > +     if (ctx)
> > +             i915_gem_context_put(ctx);
> >
> > -     i915_gem_context_put(ctx);
> >       return ret;
> >  }
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > index b5c908f3f4f22..20411db84914a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > @@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> >  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> >                                      struct drm_file *file);
> >
> > +struct i915_gem_context *
> > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
> > +
> >  static inline struct i915_gem_context *
> >  i915_gem_context_get(struct i915_gem_context *ctx)
> >  {
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index a42c429f94577..067ea3030ac91 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -46,6 +46,26 @@ struct i915_gem_engines_iter {
> >       const struct i915_gem_engines *engines;
> >  };
> >
> > +enum i915_gem_engine_type {
> > +     I915_GEM_ENGINE_TYPE_INVALID = 0,
> > +     I915_GEM_ENGINE_TYPE_PHYSICAL,
> > +     I915_GEM_ENGINE_TYPE_BALANCED,
> > +};
> > +
>
> Some kerneldoc missing?

Yup.  Fixed.

> > +struct i915_gem_proto_engine {
> > +     /** @type: Type of this engine */
> > +     enum i915_gem_engine_type type;
> > +
> > +     /** @engine: Engine, for physical */
> > +     struct intel_engine_cs *engine;
> > +
> > +     /** @num_siblings: Number of balanced siblings */
> > +     unsigned int num_siblings;
> > +
> > +     /** @siblings: Balanced siblings */
> > +     struct intel_engine_cs **siblings;
>
> I guess you're stuffing both balanced and siblings into one?

Nope.  Thanks to the patch to disable balance+bonded, we just throw
the bonding info away. :-D
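To illustrate what I mean, here's roughly the shape of it (an untested userspace sketch with a stand-in struct intel_engine_cs, not the real i915 types): @type acts as the tag and selects which of the remaining fields carry the payload, so validated-then-discarded bond info never needs a slot.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the real struct intel_engine_cs, just to show the shape. */
struct intel_engine_cs { int id; };

enum i915_gem_engine_type {
	I915_GEM_ENGINE_TYPE_INVALID = 0,
	I915_GEM_ENGINE_TYPE_PHYSICAL,
	I915_GEM_ENGINE_TYPE_BALANCED,
};

/*
 * Tagged union: @type selects whether @engine (physical) or
 * @siblings/@num_siblings (balanced) is meaningful; INVALID entries
 * carry nothing.
 */
struct i915_gem_proto_engine {
	enum i915_gem_engine_type type;
	struct intel_engine_cs *engine;
	unsigned int num_siblings;
	struct intel_engine_cs **siblings;
};

static void proto_engine_set_physical(struct i915_gem_proto_engine *pe,
				      struct intel_engine_cs *engine)
{
	memset(pe, 0, sizeof(*pe));
	pe->type = I915_GEM_ENGINE_TYPE_PHYSICAL;
	pe->engine = engine;
}

static void proto_engine_set_balanced(struct i915_gem_proto_engine *pe,
				      struct intel_engine_cs **siblings,
				      unsigned int num_siblings)
{
	memset(pe, 0, sizeof(*pe));
	pe->type = I915_GEM_ENGINE_TYPE_BALANCED;
	pe->num_siblings = num_siblings;
	pe->siblings = siblings;
}
```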

> > +};
> > +
> >  /**
> >   * struct i915_gem_proto_context - prototype context
> >   *
> > @@ -64,6 +84,12 @@ struct i915_gem_proto_context {
> >       /** @sched: See i915_gem_context::sched */
> >       struct i915_sched_attr sched;
> >
> > +     /** @num_user_engines: Number of user-specified engines or -1 */
> > +     int num_user_engines;
> > +
> > +     /** @user_engines: User-specified engines */
> > +     struct i915_gem_proto_engine *user_engines;
> > +
> >       bool single_timeline;
> >  };
> >
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > index e0f512ef7f3c6..32cf2103828f9 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > @@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915)
> >  struct i915_gem_context *
> >  live_context(struct drm_i915_private *i915, struct file *file)
> >  {
> > +     struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv;
> >       struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> > @@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file)
> >
> >       i915_gem_context_set_no_error_capture(ctx);
> >
> > -     err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id);
> > +     err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL);
> >       if (err < 0)
> >               goto err_ctx;
> >
> > +     gem_context_register(ctx, fpriv, id);
> > +
> >       return ctx;
> >
> >  err_ctx:
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 004ed0e59c999..365c042529d72 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -200,6 +200,9 @@ struct drm_i915_file_private {
> >               struct rcu_head rcu;
> >       };
> >
> > +     struct mutex proto_context_lock;
> > +     struct xarray proto_context_xa;
>
> Kerneldoc here please. Ideally also for the context_xa below (but maybe
> that's for later).
>
> Also please add a hint to the proto context struct that it's all fully
> protected by proto_context_lock above and is never visible outside of
> that.

Both done.
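For the record, the invariant I'm documenting is roughly this (an untested userspace sketch: a pthread mutex stands in for proto_context_lock, single pointer slots stand in for the xarrays, and lazy_create_context_locked() is collapsed to its essentials): a proto-context only ever lives under the lock, the real-context lookup is lock-free, and the proto slot is consumed at most once.

```c
#include <pthread.h>
#include <stddef.h>

struct proto_ctx { int dummy; };
struct ctx { int from_proto; };

static pthread_mutex_t proto_context_lock = PTHREAD_MUTEX_INITIALIZER;
static struct proto_ctx *proto_slot;	/* protected by proto_context_lock */
static struct ctx *ctx_slot;		/* readable without the lock */

static struct proto_ctx the_pc;

/* Stand-in for the context-create path stashing a proto-context. */
static void open_context(void)
{
	pthread_mutex_lock(&proto_context_lock);
	proto_slot = &the_pc;
	pthread_mutex_unlock(&proto_context_lock);
}

static struct ctx *lazy_create_context_locked(void)
{
	static struct ctx real = { .from_proto = 1 };

	proto_slot = NULL;	/* proto-context is consumed exactly once */
	ctx_slot = &real;
	return &real;
}

static struct ctx *context_lookup(void)
{
	struct ctx *c = ctx_slot;	/* fast path, no lock */

	if (c)
		return c;

	pthread_mutex_lock(&proto_context_lock);
	c = ctx_slot;			/* re-check under the lock */
	if (!c && proto_slot)
		c = lazy_create_context_locked();
	pthread_mutex_unlock(&proto_context_lock);

	return c;			/* NULL maps to -ENOENT in the real code */
}
```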

> > +
> >       struct xarray context_xa;
> >       struct xarray vm_xa;
> >
> > @@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> >
> >  struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
> >
> > -static inline struct i915_gem_context *
> > -i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > -{
> > -     struct i915_gem_context *ctx;
> > -
> > -     rcu_read_lock();
> > -     ctx = xa_load(&file_priv->context_xa, id);
> > -     if (ctx && !kref_get_unless_zero(&ctx->ref))
> > -             ctx = NULL;
> > -     rcu_read_unlock();
> > -
> > -     return ctx ? ctx : ERR_PTR(-ENOENT);
> > -}
> > -
> >  /* i915_gem_evict.c */
> >  int __must_check i915_gem_evict_something(struct i915_address_space *vm,
> >                                         u64 min_size, u64 alignment,
>
> I think I'll check details when I'm not getting distracted by the
> vm/engines validation code that I think shouldn't be here :-)

No worries.  I should be sending out a new version of the series
shortly that's hopefully easier to read.

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
@ 2021-04-29 18:16       ` Jason Ekstrand
  0 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 18:16 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> Yeah this needs some text to explain what/why you're doing this, and maybe
> some rough sketch of the locking design.

Yup.  Will add.

>
> On Fri, Apr 23, 2021 at 05:31:26PM -0500, Jason Ekstrand wrote:
> > Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 657 ++++++++++++++++--
> >  drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
> >  .../gpu/drm/i915/gem/i915_gem_context_types.h |  26 +
> >  .../gpu/drm/i915/gem/selftests/mock_context.c |   5 +-
> >  drivers/gpu/drm/i915/i915_drv.h               |  17 +-
> >  5 files changed, 648 insertions(+), 60 deletions(-)
>
> So I think the patch split here is a bit unfortunate, because you're
> adding the new vm/engine validation code for proto context here, but the
> old stuff is only removed in the next patches that make vm/engines
> immutable after first use.

Yes, it's very unfortunate.  I'm reworking things now to have a
different split which I think makes more sense but actually separates
the add from the remove even further. :-(

> I think a better split would be if this patch here only has all the
> scaffolding. You already have the EOPNOTSUPP fallback (which I hope gets
> removed), so moving the conversion entirely to later patches should be all
> fine.
>
> Or do I miss something?
>
> I think the only concern I'm seeing is that bisectability might be a bit
> lost, because we finalize the context in some cases in setparam. And if we
> do the conversion in a different order than the one media uses for its
> setparam, then later setparam might fail because the context is finalized
> already. But also
> - it's just bisectability of media functionality I think
> - just check which order media calls CTX_SETPARAM and use that to do the
>   conversion
>
> And we should be fine ... I think?

Before we go down that path, let's see what you think of my new ordering.

> Some more thoughts below, but the proto ctx stuff itself looks fine.
>
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index db9153e0f85a7..aa8e61211924f 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915,
> >
> >  static void proto_context_close(struct i915_gem_proto_context *pc)
> >  {
> > +     int i;
> > +
> >       if (pc->vm)
> >               i915_vm_put(pc->vm);
> > +     if (pc->user_engines) {
> > +             for (i = 0; i < pc->num_user_engines; i++)
> > +                     kfree(pc->user_engines[i].siblings);
> > +             kfree(pc->user_engines);
> > +     }
> >       kfree(pc);
> >  }
> >
> > @@ -274,12 +281,417 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
> >       proto_context_set_persistence(i915, pc, true);
> >       pc->sched.priority = I915_PRIORITY_NORMAL;
> >
> > +     pc->num_user_engines = -1;
> > +     pc->user_engines = NULL;
> > +
> >       if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE)
> >               pc->single_timeline = true;
> >
> >       return pc;
> >  }
> >
> > +static int proto_context_register_locked(struct drm_i915_file_private *fpriv,
> > +                                      struct i915_gem_proto_context *pc,
> > +                                      u32 *id)
> > +{
> > +     int ret;
> > +     void *old;
>
> assert_lock_held just for consistency.

Done.
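For anyone following along, the contract the assertion documents is just "the _locked variant must be entered with proto_context_lock held". A rough userspace sketch (a hand-tracked bool stands in for lockdep_assert_held(), which of course only exists in the kernel):

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t proto_context_lock = PTHREAD_MUTEX_INITIALIZER;
static bool proto_context_lock_held;	/* stand-in for lockdep's tracking */

static int register_count;

static int proto_context_register_locked(void)
{
	/*
	 * In the kernel this would be
	 *     lockdep_assert_held(&fpriv->proto_context_lock);
	 * here we just report the violation instead of splatting.
	 */
	if (!proto_context_lock_held)
		return -1;

	register_count++;
	return 0;
}

static int proto_context_register(void)
{
	int ret;

	pthread_mutex_lock(&proto_context_lock);
	proto_context_lock_held = true;
	ret = proto_context_register_locked();
	proto_context_lock_held = false;
	pthread_mutex_unlock(&proto_context_lock);

	return ret;
}
```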

> > +
> > +     ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL);
> > +     if (ret)
> > +             return ret;
> > +
> > +     old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL);
> > +     if (xa_is_err(old)) {
> > +             xa_erase(&fpriv->context_xa, *id);
> > +             return xa_err(old);
> > +     }
> > +     GEM_BUG_ON(old);
> > +
> > +     return 0;
> > +}
> > +
> > +static int proto_context_register(struct drm_i915_file_private *fpriv,
> > +                               struct i915_gem_proto_context *pc,
> > +                               u32 *id)
> > +{
> > +     int ret;
> > +
> > +     mutex_lock(&fpriv->proto_context_lock);
> > +     ret = proto_context_register_locked(fpriv, pc, id);
> > +     mutex_unlock(&fpriv->proto_context_lock);
> > +
> > +     return ret;
> > +}
> > +
> > +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
> > +                         struct i915_gem_proto_context *pc,
> > +                         const struct drm_i915_gem_context_param *args)
> > +{
> > +     struct i915_address_space *vm;
> > +
> > +     if (args->size)
> > +             return -EINVAL;
> > +
> > +     if (!pc->vm)
> > +             return -ENODEV;
> > +
> > +     if (upper_32_bits(args->value))
> > +             return -ENOENT;
> > +
> > +     rcu_read_lock();
> > +     vm = xa_load(&fpriv->vm_xa, args->value);
> > +     if (vm && !kref_get_unless_zero(&vm->ref))
> > +             vm = NULL;
> > +     rcu_read_unlock();
> > +     if (!vm)
> > +             return -ENOENT;
> > +
> > +     i915_vm_put(pc->vm);
> > +     pc->vm = vm;
> > +
> > +     return 0;
> > +}
> > +
> > +struct set_proto_ctx_engines {
> > +     struct drm_i915_private *i915;
> > +     unsigned num_engines;
> > +     struct i915_gem_proto_engine *engines;
> > +};
> > +
> > +static int
> > +set_proto_ctx_engines_balance(struct i915_user_extension __user *base,
> > +                           void *data)
> > +{
> > +     struct i915_context_engines_load_balance __user *ext =
> > +             container_of_user(base, typeof(*ext), base);
> > +     const struct set_proto_ctx_engines *set = data;
> > +     struct drm_i915_private *i915 = set->i915;
> > +     struct intel_engine_cs **siblings;
> > +     u16 num_siblings, idx;
> > +     unsigned int n;
> > +     int err;
> > +
> > +     if (!HAS_EXECLISTS(i915))
> > +             return -ENODEV;
> > +
> > +     if (intel_uc_uses_guc_submission(&i915->gt.uc))
> > +             return -ENODEV; /* not implemented yet */
> > +
> > +     if (get_user(idx, &ext->engine_index))
> > +             return -EFAULT;
> > +
> > +     if (idx >= set->num_engines) {
> > +             drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
> > +                     idx, set->num_engines);
> > +             return -EINVAL;
> > +     }
> > +
> > +     idx = array_index_nospec(idx, set->num_engines);
> > +     if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) {
> > +             drm_dbg(&i915->drm,
> > +                     "Invalid placement[%d], already occupied\n", idx);
> > +             return -EEXIST;
> > +     }
> > +
> > +     if (get_user(num_siblings, &ext->num_siblings))
> > +             return -EFAULT;
> > +
> > +     err = check_user_mbz(&ext->flags);
> > +     if (err)
> > +             return err;
> > +
> > +     err = check_user_mbz(&ext->mbz64);
> > +     if (err)
> > +             return err;
> > +
> > +     if (num_siblings == 0)
> > +             return 0;
> > +
> > +     siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL);
> > +     if (!siblings)
> > +             return -ENOMEM;
> > +
> > +     for (n = 0; n < num_siblings; n++) {
> > +             struct i915_engine_class_instance ci;
> > +
> > +             if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
> > +                     err = -EFAULT;
> > +                     goto err_siblings;
> > +             }
> > +
> > +             siblings[n] = intel_engine_lookup_user(i915,
> > +                                                    ci.engine_class,
> > +                                                    ci.engine_instance);
> > +             if (!siblings[n]) {
> > +                     drm_dbg(&i915->drm,
> > +                             "Invalid sibling[%d]: { class:%d, inst:%d }\n",
> > +                             n, ci.engine_class, ci.engine_instance);
> > +                     err = -EINVAL;
> > +                     goto err_siblings;
> > +             }
> > +     }
> > +
> > +     if (num_siblings == 1) {
> > +             set->engines[idx].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> > +             set->engines[idx].engine = siblings[0];
> > +             kfree(siblings);
> > +     } else {
> > +             set->engines[idx].type = I915_GEM_ENGINE_TYPE_BALANCED;
> > +             set->engines[idx].num_siblings = num_siblings;
> > +             set->engines[idx].siblings = siblings;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_siblings:
> > +     kfree(siblings);
> > +
> > +     return err;
> > +}
> > +
> > +static int
> > +set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data)
> > +{
> > +     struct i915_context_engines_bond __user *ext =
> > +             container_of_user(base, typeof(*ext), base);
> > +     const struct set_proto_ctx_engines *set = data;
> > +     struct drm_i915_private *i915 = set->i915;
> > +     struct i915_engine_class_instance ci;
> > +     struct intel_engine_cs *master;
> > +     u16 idx, num_bonds;
> > +     int err, n;
> > +
> > +     if (get_user(idx, &ext->virtual_index))
> > +             return -EFAULT;
> > +
> > +     if (idx >= set->num_engines) {
> > +             drm_dbg(&i915->drm,
> > +                     "Invalid index for virtual engine: %d >= %d\n",
> > +                     idx, set->num_engines);
> > +             return -EINVAL;
> > +     }
> > +
> > +     idx = array_index_nospec(idx, set->num_engines);
> > +     if (set->engines[idx].type == I915_GEM_ENGINE_TYPE_INVALID) {
> > +             drm_dbg(&i915->drm, "Invalid engine at %d\n", idx);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_PHYSICAL) {
> > +             drm_dbg(&i915->drm,
> > +                     "Bonding with virtual engines not allowed\n");
> > +             return -EINVAL;
> > +     }
> > +
> > +     err = check_user_mbz(&ext->flags);
> > +     if (err)
> > +             return err;
> > +
> > +     for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
> > +             err = check_user_mbz(&ext->mbz64[n]);
> > +             if (err)
> > +                     return err;
> > +     }
> > +
> > +     if (copy_from_user(&ci, &ext->master, sizeof(ci)))
> > +             return -EFAULT;
> > +
> > +     master = intel_engine_lookup_user(i915,
> > +                                       ci.engine_class,
> > +                                       ci.engine_instance);
> > +     if (!master) {
> > +             drm_dbg(&i915->drm,
> > +                     "Unrecognised master engine: { class:%u, instance:%u }\n",
> > +                     ci.engine_class, ci.engine_instance);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (get_user(num_bonds, &ext->num_bonds))
> > +             return -EFAULT;
> > +
> > +     for (n = 0; n < num_bonds; n++) {
> > +             struct intel_engine_cs *bond;
> > +
> > +             if (copy_from_user(&ci, &ext->engines[n], sizeof(ci)))
> > +                     return -EFAULT;
> > +
> > +             bond = intel_engine_lookup_user(i915,
> > +                                             ci.engine_class,
> > +                                             ci.engine_instance);
> > +             if (!bond) {
> > +                     drm_dbg(&i915->drm,
> > +                             "Unrecognised engine[%d] for bonding: { class:%d, instance: %d }\n",
> > +                             n, ci.engine_class, ci.engine_instance);
> > +                     return -EINVAL;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static const i915_user_extension_fn set_proto_ctx_engines_extensions[] = {
> > +     [I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_proto_ctx_engines_balance,
> > +     [I915_CONTEXT_ENGINES_EXT_BOND] = set_proto_ctx_engines_bond,
> > +};
> > +
> > +static int set_proto_ctx_engines(struct drm_i915_file_private *fpriv,
> > +                              struct i915_gem_proto_context *pc,
> > +                              const struct drm_i915_gem_context_param *args)
> > +{
> > +     struct drm_i915_private *i915 = fpriv->dev_priv;
> > +     struct set_proto_ctx_engines set = { .i915 = i915 };
> > +     struct i915_context_param_engines __user *user =
> > +             u64_to_user_ptr(args->value);
> > +     unsigned int n;
> > +     u64 extensions;
> > +     int err;
> > +
> > +     if (!args->size) {
> > +             kfree(pc->user_engines);
> > +             pc->num_user_engines = -1;
> > +             pc->user_engines = NULL;
> > +             return 0;
> > +     }
> > +
> > +     BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
> > +     if (args->size < sizeof(*user) ||
> > +         !IS_ALIGNED(args->size, sizeof(*user->engines))) {
> > +             drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
> > +                     args->size);
> > +             return -EINVAL;
> > +     }
> > +
> > +     set.num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> > +     if (set.num_engines > I915_EXEC_RING_MASK + 1)
> > +             return -EINVAL;
> > +
> > +     set.engines = kmalloc_array(set.num_engines, sizeof(*set.engines), GFP_KERNEL);
> > +     if (!set.engines)
> > +             return -ENOMEM;
> > +
> > +     for (n = 0; n < set.num_engines; n++) {
> > +             struct i915_engine_class_instance ci;
> > +             struct intel_engine_cs *engine;
> > +
> > +             if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
> > +                     kfree(set.engines);
> > +                     return -EFAULT;
> > +             }
> > +
> > +             memset(&set.engines[n], 0, sizeof(set.engines[n]));
> > +
> > +             if (ci.engine_class == (u16)I915_ENGINE_CLASS_INVALID &&
> > +                 ci.engine_instance == (u16)I915_ENGINE_CLASS_INVALID_NONE)
> > +                     continue;
> > +
> > +             engine = intel_engine_lookup_user(i915,
> > +                                               ci.engine_class,
> > +                                               ci.engine_instance);
> > +             if (!engine) {
> > +                     drm_dbg(&i915->drm,
> > +                             "Invalid engine[%d]: { class:%d, instance:%d }\n",
> > +                             n, ci.engine_class, ci.engine_instance);
> > +                     kfree(set.engines);
> > +                     return -ENOENT;
> > +             }
> > +
> > +             set.engines[n].type = I915_GEM_ENGINE_TYPE_PHYSICAL;
> > +             set.engines[n].engine = engine;
> > +     }
> > +
> > +     err = -EFAULT;
> > +     if (!get_user(extensions, &user->extensions))
> > +             err = i915_user_extensions(u64_to_user_ptr(extensions),
> > +                                        set_proto_ctx_engines_extensions,
> > +                                        ARRAY_SIZE(set_proto_ctx_engines_extensions),
> > +                                        &set);
> > +     if (err) {
> > +             kfree(set.engines);
> > +             return err;
> > +     }
> > +
> > +     kfree(pc->user_engines);
> > +     pc->num_user_engines = set.num_engines;
> > +     pc->user_engines = set.engines;
> > +
> > +     return 0;
> > +}
> > +
> > +static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
> > +                            struct i915_gem_proto_context *pc,
> > +                            struct drm_i915_gem_context_param *args)
> > +{
> > +     int ret = 0;
> > +
> > +     switch (args->param) {
> > +     case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
>
> Atomic bitops like in previous patches: Pls no :-)

Yup.  Fixed.
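i.e. since the proto-context is only ever touched with proto_context_lock held, the atomic set_bit()/clear_bit() RMW ops are overkill; plain non-atomic bit ops (__set_bit()/__clear_bit() in the kernel, or just |= / &= on the flags word) are enough. Userspace sketch of the same, with made-up bit numbers:

```c
/* Illustrative bit indices; the real UCONTEXT_* values live in i915. */
#define UCONTEXT_NO_ERROR_CAPTURE	0UL
#define UCONTEXT_BANNABLE		1UL
#define UCONTEXT_RECOVERABLE		2UL

struct pc { unsigned long user_flags; };

static void pc_set_flag(struct pc *pc, unsigned long bit, int value)
{
	if (value)
		pc->user_flags |= 1UL << bit;	/* non-atomic __set_bit() */
	else
		pc->user_flags &= ~(1UL << bit);	/* non-atomic __clear_bit() */
}
```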

> > +             else
> > +                     clear_bit(UCONTEXT_NO_ERROR_CAPTURE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_BANNABLE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (!capable(CAP_SYS_ADMIN) && !args->value)
> > +                     ret = -EPERM;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> > +             else
> > +                     clear_bit(UCONTEXT_BANNABLE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_RECOVERABLE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> > +             else
> > +                     clear_bit(UCONTEXT_RECOVERABLE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_PRIORITY:
> > +             ret = validate_priority(fpriv->dev_priv, args);
> > +             if (!ret)
> > +                     pc->sched.priority = args->value;
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_SSEU:
> > +             ret = -ENOTSUPP;
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_VM:
> > +             ret = set_proto_ctx_vm(fpriv, pc, args);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_ENGINES:
> > +             ret = set_proto_ctx_engines(fpriv, pc, args);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_PERSISTENCE:
> > +             if (args->size)
> > +                     ret = -EINVAL;
> > +             else if (args->value)
> > +                     set_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> > +             else
> > +                     clear_bit(UCONTEXT_PERSISTENCE, &pc->user_flags);
> > +             break;
> > +
> > +     case I915_CONTEXT_PARAM_NO_ZEROMAP:
> > +     case I915_CONTEXT_PARAM_BAN_PERIOD:
> > +     case I915_CONTEXT_PARAM_RINGSIZE:
> > +     default:
> > +             ret = -EINVAL;
> > +             break;
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> >  static struct i915_address_space *
> >  context_get_vm_rcu(struct i915_gem_context *ctx)
> >  {
> > @@ -450,6 +862,47 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
> >       return e;
> >  }
> >
> > +static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
> > +                                          unsigned int num_engines,
> > +                                          struct i915_gem_proto_engine *pe)
> > +{
> > +     struct i915_gem_engines *e;
> > +     unsigned int n;
> > +
> > +     e = alloc_engines(num_engines);
> > +     for (n = 0; n < num_engines; n++) {
> > +             struct intel_context *ce;
> > +
> > +             switch (pe[n].type) {
> > +             case I915_GEM_ENGINE_TYPE_PHYSICAL:
> > +                     ce = intel_context_create(pe[n].engine);
> > +                     break;
> > +
> > +             case I915_GEM_ENGINE_TYPE_BALANCED:
> > +                     ce = intel_execlists_create_virtual(pe[n].siblings,
> > +                                                         pe[n].num_siblings);
> > +                     break;
> > +
> > +             case I915_GEM_ENGINE_TYPE_INVALID:
> > +             default:
> > +                     GEM_WARN_ON(pe[n].type != I915_GEM_ENGINE_TYPE_INVALID);
> > +                     continue;
> > +             }
> > +
> > +             if (IS_ERR(ce)) {
> > +                     __free_engines(e, n);
> > +                     return ERR_CAST(ce);
> > +             }
> > +
> > +             intel_context_set_gem(ce, ctx);
> > +
> > +             e->engines[n] = ce;
> > +     }
> > +     e->num_engines = num_engines;
> > +
> > +     return e;
> > +}
> > +
> >  void i915_gem_context_release(struct kref *ref)
> >  {
> >       struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
> > @@ -890,6 +1343,24 @@ i915_gem_create_context(struct drm_i915_private *i915,
> >               mutex_unlock(&ctx->mutex);
> >       }
> >
> > +     if (pc->num_user_engines >= 0) {
> > +             struct i915_gem_engines *engines;
> > +
> > +             engines = user_engines(ctx, pc->num_user_engines,
> > +                                    pc->user_engines);
> > +             if (IS_ERR(engines)) {
> > +                     context_close(ctx);
> > +                     return ERR_CAST(engines);
> > +             }
> > +
> > +             mutex_lock(&ctx->engines_mutex);
> > +             i915_gem_context_set_user_engines(ctx);
> > +             engines = rcu_replace_pointer(ctx->engines, engines, 1);
> > +             mutex_unlock(&ctx->engines_mutex);
> > +
> > +             free_engines(engines);
> > +     }
> > +
> >       if (pc->single_timeline) {
> >               ret = drm_syncobj_create(&ctx->syncobj,
> >                                        DRM_SYNCOBJ_CREATE_SIGNALED,
> > @@ -916,12 +1387,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915)
> >       init_contexts(&i915->gem.contexts);
> >  }
> >
> > -static int gem_context_register(struct i915_gem_context *ctx,
> > -                             struct drm_i915_file_private *fpriv,
> > -                             u32 *id)
> > +static void gem_context_register(struct i915_gem_context *ctx,
> > +                              struct drm_i915_file_private *fpriv,
> > +                              u32 id)
> >  {
> >       struct drm_i915_private *i915 = ctx->i915;
> > -     int ret;
> > +     void *old;
> >
> >       ctx->file_priv = fpriv;
> >
> > @@ -930,19 +1401,12 @@ static int gem_context_register(struct i915_gem_context *ctx,
> >                current->comm, pid_nr(ctx->pid));
> >
> >       /* And finally expose ourselves to userspace via the idr */
> > -     ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL);
> > -     if (ret)
> > -             goto err_pid;
> > +     old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
> > +     GEM_BUG_ON(old);
> >
> >       spin_lock(&i915->gem.contexts.lock);
> >       list_add_tail(&ctx->link, &i915->gem.contexts.list);
> >       spin_unlock(&i915->gem.contexts.lock);
> > -
> > -     return 0;
> > -
> > -err_pid:
> > -     put_pid(fetch_and_zero(&ctx->pid));
> > -     return ret;
> >  }
> >
> >  int i915_gem_context_open(struct drm_i915_private *i915,
> > @@ -952,9 +1416,12 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >       struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> > -     u32 id;
> >
> > -     xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC);
> > +     mutex_init(&file_priv->proto_context_lock);
> > +     xa_init_flags(&file_priv->proto_context_xa, XA_FLAGS_ALLOC);
> > +
> > +     /* 0 reserved for the default context */
> > +     xa_init_flags(&file_priv->context_xa, XA_FLAGS_ALLOC1);
> >
> >       /* 0 reserved for invalid/unassigned ppgtt */
> >       xa_init_flags(&file_priv->vm_xa, XA_FLAGS_ALLOC1);
> > @@ -972,28 +1439,31 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >               goto err;
> >       }
> >
> > -     err = gem_context_register(ctx, file_priv, &id);
> > -     if (err < 0)
> > -             goto err_ctx;
> > +     gem_context_register(ctx, file_priv, 0);
> >
> > -     GEM_BUG_ON(id);
> >       return 0;
> >
> > -err_ctx:
> > -     context_close(ctx);
> >  err:
> >       xa_destroy(&file_priv->vm_xa);
> >       xa_destroy(&file_priv->context_xa);
> > +     xa_destroy(&file_priv->proto_context_xa);
> > +     mutex_destroy(&file_priv->proto_context_lock);
> >       return err;
> >  }
> >
> >  void i915_gem_context_close(struct drm_file *file)
> >  {
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_address_space *vm;
> >       struct i915_gem_context *ctx;
> >       unsigned long idx;
> >
> > +     xa_for_each(&file_priv->proto_context_xa, idx, pc)
> > +             proto_context_close(pc);
> > +     xa_destroy(&file_priv->proto_context_xa);
> > +     mutex_destroy(&file_priv->proto_context_lock);
> > +
> >       xa_for_each(&file_priv->context_xa, idx, ctx)
> >               context_close(ctx);
> >       xa_destroy(&file_priv->context_xa);
> > @@ -1918,7 +2388,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
> >  }
> >
> >  struct create_ext {
> > -     struct i915_gem_context *ctx;
> > +     struct i915_gem_proto_context *pc;
> >       struct drm_i915_file_private *fpriv;
> >  };
> >
> > @@ -1933,7 +2403,7 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
> >       if (local.param.ctx_id)
> >               return -EINVAL;
> >
> > -     return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
> > +     return set_proto_ctx_param(arg->fpriv, arg->pc, &local.param);
> >  }
> >
> >  static int invalid_ext(struct i915_user_extension __user *ext, void *data)
> > @@ -1951,12 +2421,71 @@ static bool client_is_banned(struct drm_i915_file_private *file_priv)
> >       return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
> >  }
> >
> > +static inline struct i915_gem_context *
> > +__context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > +{
> > +     struct i915_gem_context *ctx;
> > +
> > +     rcu_read_lock();
> > +     ctx = xa_load(&file_priv->context_xa, id);
> > +     if (ctx && !kref_get_unless_zero(&ctx->ref))
> > +             ctx = NULL;
> > +     rcu_read_unlock();
> > +
> > +     return ctx;
> > +}
> > +
> > +struct i915_gem_context *
> > +lazy_create_context_locked(struct drm_i915_file_private *file_priv,
> > +                        struct i915_gem_proto_context *pc, u32 id)
> > +{
> > +     struct i915_gem_context *ctx;
> > +     void *old;
>
> assert_lock_held is always nice in all _locked functions. It entirely
> compiles out without CONFIG_PROVE_LOCKING enabled.

Done.
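(An editorial aside for archive readers: in the kernel this is `lockdep_assert_held()`, which compiles out entirely without CONFIG_PROVE_LOCKING. The sketch below is a rough userspace analogue of the same idea — record the lock owner so `_locked` helpers can verify their calling convention. All names here are hypothetical; this is not kernel code.)

```c
#include <assert.h>
#include <pthread.h>

/*
 * Userspace stand-in for lockdep's held-lock tracking.  The kernel
 * version needs no extra state; this is only to illustrate the idiom.
 */
struct checked_mutex {
	pthread_mutex_t lock;
	pthread_t owner;
	int held;
};

static void cm_lock(struct checked_mutex *m)
{
	pthread_mutex_lock(&m->lock);
	m->owner = pthread_self();
	m->held = 1;
}

static void cm_unlock(struct checked_mutex *m)
{
	m->held = 0;
	pthread_mutex_unlock(&m->lock);
}

/* Analogue of lockdep_assert_held(): fires if the caller forgot the lock. */
static void cm_assert_held(struct checked_mutex *m)
{
	assert(m->held && pthread_equal(m->owner, pthread_self()));
}

static int shared_state;

/* A "_locked" helper: only legal with the lock held, and it checks. */
static void bump_shared_state_locked(struct checked_mutex *m)
{
	cm_assert_held(m);
	shared_state++;
}
```

The payoff is the same as in the kernel: a caller that skips the lock trips the assertion immediately in a debug build, instead of corrupting state intermittently.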

> > +
> > +     ctx = i915_gem_create_context(file_priv->dev_priv, pc);
>
> I think we need a prep patch which changes the calling convention of this
> and anything it calls to only return a NULL pointer. Then
> i915_gem_context_lookup below can return the ERR_PTR(-ENOMEM) below for
> that case, and we know that we're never returning a wrong error pointer.
>
> > +     if (IS_ERR(ctx))
> > +             return ctx;
> > +
> > +     gem_context_register(ctx, file_priv, id);
> > +
> > +     old = xa_erase(&file_priv->proto_context_xa, id);
> > +     GEM_BUG_ON(old != pc);
> > +     proto_context_close(pc);
> > +
> > +     /* One for the xarray and one for the caller */
> > +     return i915_gem_context_get(ctx);
> > +}
> > +
> > +struct i915_gem_context *
> > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > +{
> > +     struct i915_gem_proto_context *pc;
> > +     struct i915_gem_context *ctx;
> > +
> > +     ctx = __context_lookup(file_priv, id);
> > +     if (ctx)
> > +             return ctx;
> > +
> > +     mutex_lock(&file_priv->proto_context_lock);
> > +     /* Try one more time under the lock */
> > +     ctx = __context_lookup(file_priv, id);
> > +     if (!ctx) {
> > +             pc = xa_load(&file_priv->proto_context_xa, id);
> > +             if (!pc)
> > +                     ctx = ERR_PTR(-ENOENT);
> > +             else
> > +                     ctx = lazy_create_context_locked(file_priv, pc, id);
> > +     }
> > +     mutex_unlock(&file_priv->proto_context_lock);
> > +
> > +     return ctx;
> > +}
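
(An editorial aside for archive readers: the lookup above is a double-checked pattern — try a lock-free lookup first (RCU-protected `xa_load()` in the kernel), and only on a miss take the mutex, re-check, and lazily finalize. A minimal userspace analogue of just the control flow, using pthreads; all names are hypothetical and this is not i915 code:)

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static void *slot;       /* stands in for the context xarray entry */
static int create_calls; /* how many times we actually created */

/* In the kernel this is the RCU-protected xa_load() fast path. */
static void *lookup_fast(void)
{
	return slot;
}

static void *make_object(void)
{
	create_calls++;
	return malloc(1); /* stands in for context creation */
}

static void *lookup_or_create(void)
{
	void *obj = lookup_fast();

	if (obj)
		return obj; /* fast path: already finalized */

	pthread_mutex_lock(&table_lock);
	/* Re-check under the lock: another thread may have won the race. */
	obj = lookup_fast();
	if (!obj) {
		obj = make_object();
		slot = obj;
	}
	pthread_mutex_unlock(&table_lock);

	return obj;
}
```

In the patch the slow path additionally erases the proto-context entry, so a given id lives in exactly one of the two xarrays at any time.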
> > +
> >  int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >                                 struct drm_file *file)
> >  {
> >       struct drm_i915_private *i915 = to_i915(dev);
> >       struct drm_i915_gem_context_create_ext *args = data;
> > -     struct i915_gem_proto_context *pc;
> >       struct create_ext ext_data;
> >       int ret;
> >       u32 id;
> > @@ -1979,14 +2508,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >               return -EIO;
> >       }
> >
> > -     pc = proto_context_create(i915, args->flags);
> > -     if (IS_ERR(pc))
> > -             return PTR_ERR(pc);
> > -
> > -     ext_data.ctx = i915_gem_create_context(i915, pc);
> > -     proto_context_close(pc);
> > -     if (IS_ERR(ext_data.ctx))
> > -             return PTR_ERR(ext_data.ctx);
> > +     ext_data.pc = proto_context_create(i915, args->flags);
> > +     if (IS_ERR(ext_data.pc))
> > +             return PTR_ERR(ext_data.pc);
> >
> >       if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) {
> >               ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
> > @@ -1994,20 +2518,20 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >                                          ARRAY_SIZE(create_extensions),
> >                                          &ext_data);
> >               if (ret)
> > -                     goto err_ctx;
> > +                     goto err_pc;
> >       }
> >
> > -     ret = gem_context_register(ext_data.ctx, ext_data.fpriv, &id);
> > +     ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id);
> >       if (ret < 0)
> > -             goto err_ctx;
> > +             goto err_pc;
> >
> >       args->ctx_id = id;
> >       drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id);
> >
> >       return 0;
> >
> > -err_ctx:
> > -     context_close(ext_data.ctx);
> > +err_pc:
> > +     proto_context_close(ext_data.pc);
> >       return ret;
> >  }
> >
> > @@ -2016,6 +2540,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >  {
> >       struct drm_i915_gem_context_destroy *args = data;
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >
> >       if (args->pad != 0)
> > @@ -2024,11 +2549,21 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >       if (!args->ctx_id)
> >               return -ENOENT;
> >
> > +     mutex_lock(&file_priv->proto_context_lock);
> >       ctx = xa_erase(&file_priv->context_xa, args->ctx_id);
> > -     if (!ctx)
> > +     pc = xa_erase(&file_priv->proto_context_xa, args->ctx_id);
> > +     mutex_unlock(&file_priv->proto_context_lock);
> > +
> > +     if (!ctx && !pc)
> >               return -ENOENT;
> > +     GEM_WARN_ON(ctx && pc);
> > +
> > +     if (pc)
> > +             proto_context_close(pc);
> > +
> > +     if (ctx)
> > +             context_close(ctx);
> >
> > -     context_close(ctx);
> >       return 0;
> >  }
> >
> > @@ -2161,16 +2696,48 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> >  {
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> >       struct drm_i915_gem_context_param *args = data;
> > +     struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> > -     int ret;
> > +     int ret = 0;
> >
> > -     ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> > -     if (IS_ERR(ctx))
> > -             return PTR_ERR(ctx);
> > +     ctx = __context_lookup(file_priv, args->ctx_id);
> > +     if (ctx)
> > +             goto set_ctx_param;
> >
> > -     ret = ctx_setparam(file_priv, ctx, args);
> > +     mutex_lock(&file_priv->proto_context_lock);
> > +     ctx = __context_lookup(file_priv, args->ctx_id);
> > +     if (ctx)
> > +             goto unlock;
> > +
> > +     pc = xa_load(&file_priv->proto_context_xa, args->ctx_id);
> > +     if (!pc) {
> > +             ret = -ENOENT;
> > +             goto unlock;
> > +     }
> > +
> > +     ret = set_proto_ctx_param(file_priv, pc, args);
>
> I think we should have a FIXME here of not allowing this on some future
> platforms because just use CTX_CREATE_EXT.

Done.

> > +     if (ret == -ENOTSUPP) {
> > +             /* Some params, specifically SSEU, can only be set on fully
>
> I think this needs a FIXME: that this only holds during the conversion?
> Otherwise we kinda have a bit a problem me thinks ...

I'm not sure what you mean by that.

> > +              * created contexts.
> > +              */
> > +             ret = 0;
> > +             ctx = lazy_create_context_locked(file_priv, pc, args->ctx_id);
> > +             if (IS_ERR(ctx)) {
> > +                     ret = PTR_ERR(ctx);
> > +                     ctx = NULL;
> > +             }
> > +     }
> > +
> > +unlock:
> > +     mutex_unlock(&file_priv->proto_context_lock);
> > +
> > +set_ctx_param:
> > +     if (!ret && ctx)
> > +             ret = ctx_setparam(file_priv, ctx, args);
> > +
> > +     if (ctx)
> > +             i915_gem_context_put(ctx);
> >
> > -     i915_gem_context_put(ctx);
> >       return ret;
> >  }
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > index b5c908f3f4f22..20411db84914a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > @@ -133,6 +133,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> >  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> >                                      struct drm_file *file);
> >
> > +struct i915_gem_context *
> > +i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
> > +
> >  static inline struct i915_gem_context *
> >  i915_gem_context_get(struct i915_gem_context *ctx)
> >  {
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index a42c429f94577..067ea3030ac91 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -46,6 +46,26 @@ struct i915_gem_engines_iter {
> >       const struct i915_gem_engines *engines;
> >  };
> >
> > +enum i915_gem_engine_type {
> > +     I915_GEM_ENGINE_TYPE_INVALID = 0,
> > +     I915_GEM_ENGINE_TYPE_PHYSICAL,
> > +     I915_GEM_ENGINE_TYPE_BALANCED,
> > +};
> > +
>
> Some kerneldoc missing?

Yup.  Fixed.

> > +struct i915_gem_proto_engine {
> > +     /** @type: Type of this engine */
> > +     enum i915_gem_engine_type type;
> > +
> > +     /** @engine: Engine, for physical */
> > +     struct intel_engine_cs *engine;
> > +
> > +     /** @num_siblings: Number of balanced siblings */
> > +     unsigned int num_siblings;
> > +
> > +     /** @siblings: Balanced siblings */
> > +     struct intel_engine_cs **siblings;
>
> I guess you're stuffing both balanced and siblings into one?

Nope.  Thanks to the patch to disable balance+bonded, we just throw
the bonding info away. :-D

> > +};
> > +
> >  /**
> >   * struct i915_gem_proto_context - prototype context
> >   *
> > @@ -64,6 +84,12 @@ struct i915_gem_proto_context {
> >       /** @sched: See i915_gem_context::sched */
> >       struct i915_sched_attr sched;
> >
> > +     /** @num_user_engines: Number of user-specified engines or -1 */
> > +     int num_user_engines;
> > +
> > +     /** @user_engines: User-specified engines */
> > +     struct i915_gem_proto_engine *user_engines;
> > +
> >       bool single_timeline;
> >  };
> >
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > index e0f512ef7f3c6..32cf2103828f9 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
> > @@ -80,6 +80,7 @@ void mock_init_contexts(struct drm_i915_private *i915)
> >  struct i915_gem_context *
> >  live_context(struct drm_i915_private *i915, struct file *file)
> >  {
> > +     struct drm_i915_file_private *fpriv = to_drm_file(file)->driver_priv;
> >       struct i915_gem_proto_context *pc;
> >       struct i915_gem_context *ctx;
> >       int err;
> > @@ -96,10 +97,12 @@ live_context(struct drm_i915_private *i915, struct file *file)
> >
> >       i915_gem_context_set_no_error_capture(ctx);
> >
> > -     err = gem_context_register(ctx, to_drm_file(file)->driver_priv, &id);
> > +     err = xa_alloc(&fpriv->context_xa, &id, NULL, xa_limit_32b, GFP_KERNEL);
> >       if (err < 0)
> >               goto err_ctx;
> >
> > +     gem_context_register(ctx, fpriv, id);
> > +
> >       return ctx;
> >
> >  err_ctx:
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 004ed0e59c999..365c042529d72 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -200,6 +200,9 @@ struct drm_i915_file_private {
> >               struct rcu_head rcu;
> >       };
> >
> > +     struct mutex proto_context_lock;
> > +     struct xarray proto_context_xa;
>
> Kerneldoc here please. Ideally also for the context_xa below (but maybe
> that's for later).
>
> Also please add a hint to the proto context struct that it's all fully
> protected by proto_context_lock above and is never visible outside of
> that.

Both done.
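
(For readers without the follow-up revision at hand, the requested kerneldoc might look roughly like the sketch below. The wording is illustrative only — not what actually landed upstream — and the kernel types are stubbed so the fragment stands alone:)

```c
#include <assert.h>

/* Stand-ins so the sketch compiles outside the kernel tree. */
struct mutex { int dummy; };
struct xarray { int dummy; };

struct file_private_sketch {
	/**
	 * @proto_context_lock: Guards @proto_context_xa and the
	 * proto-contexts stored in it.  A proto-context is only ever
	 * touched under this lock, so it is never visible to other
	 * threads once handed to the xarray.
	 */
	struct mutex proto_context_lock;

	/** @proto_context_xa: id -> not-yet-finalized proto-context */
	struct xarray proto_context_xa;

	/** @context_xa: id -> fully created GEM context */
	struct xarray context_xa;
};
```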

> > +
> >       struct xarray context_xa;
> >       struct xarray vm_xa;
> >
> > @@ -1840,20 +1843,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
> >
> >  struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags);
> >
> > -static inline struct i915_gem_context *
> > -i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
> > -{
> > -     struct i915_gem_context *ctx;
> > -
> > -     rcu_read_lock();
> > -     ctx = xa_load(&file_priv->context_xa, id);
> > -     if (ctx && !kref_get_unless_zero(&ctx->ref))
> > -             ctx = NULL;
> > -     rcu_read_unlock();
> > -
> > -     return ctx ? ctx : ERR_PTR(-ENOENT);
> > -}
> > -
> >  /* i915_gem_evict.c */
> >  int __must_check i915_gem_evict_something(struct i915_address_space *vm,
> >                                         u64 min_size, u64 alignment,
>
> I think I'll check details when I'm not getting distracted by the
> vm/engines validation code that I think shouldn't be here :-)

No worries.  I should be sending out a new version of the series
shortly that's hopefully easier to read.

--Jason
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-29 17:13               ` Daniel Vetter
@ 2021-04-29 18:41                 ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 18:41 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Tvrtko Ursulin, Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 12:13 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Apr 29, 2021 at 07:12:05PM +0200, Daniel Vetter wrote:
> > On Thu, Apr 29, 2021 at 09:54:15AM -0500, Jason Ekstrand wrote:
> > > On Thu, Apr 29, 2021 at 3:04 AM Tvrtko Ursulin
> > > <tvrtko.ursulin@linux.intel.com> wrote:
> > > >
> > > >
> > > > On 28/04/2021 18:24, Jason Ekstrand wrote:
> > > > > On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
> > > > > <tvrtko.ursulin@linux.intel.com> wrote:
> > > > >> On 23/04/2021 23:31, Jason Ekstrand wrote:
> > > > >>> Instead of handling it like a context param, unconditionally set it when
> > > > >>> intel_contexts are created.  This doesn't fix anything but does simplify
> > > > >>> the code a bit.
> > > > >>>
> > > > >>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> > > > >>> ---
> > > > >>>    drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
> > > > >>>    .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
> > > > >>>    drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
> > > > >>>    3 files changed, 6 insertions(+), 44 deletions(-)
> > > > >>>
> > > > >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > >>> index 35bcdeddfbf3f..1091cc04a242a 100644
> > > > >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > >>> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
> > > > >>>            intel_engine_has_timeslices(ce->engine))
> > > > >>>                __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> > > > >>>
> > > > >>> -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> > > > >>> +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> > > > >>> +         ctx->i915->params.request_timeout_ms) {
> > > > >>> +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> > > > >>> +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> > > > >>
> > > > >> Blank line between declarations and code please, or just lose the local.
> > > > >>
> > > > >> Otherwise looks okay. Slight change that same GEM context can now have a
> > > > >> mix of different request expirations isn't interesting I think. At least
> > > > >> the change goes away by the end of the series.
> > > > >
> > > > > In order for that to happen, I think you'd have to have a race between
> > > > > CREATE_CONTEXT and someone smashing the request_timeout_ms param via
> > > > > sysfs.  Or am I missing something?  Given that timeouts are really
> > > > > per-engine anyway, I don't think we need to care too much about that.
> > > >
> > > > We don't care, no.
> > > >
> > > > For completeness only - by the end of the series it is what you say. But
> > > > at _this_ point in the series though it is if modparam changes at any
> > > > point between context create and replacing engines. Which is a change
> > > > compared to before this patch, since modparam was cached in the GEM
> > > > context so far. So one GEM context was a single request_timeout_ms.
> > >
> > > I've added the following to the commit message:
> > >
> > > It also means that sync files exported from different engines on a
> > > SINGLE_TIMELINE context will have different fence contexts.  This is
> > > visible to userspace if it looks at the obj_name field of
> > > sync_fence_info.
> > >
> > > How's that sound?
> >
> > If you add "Which media-driver as the sole user of this doesn't do" then I
> > think it's perfect.
>
> Uh I think you replied to the wrong thread :-)

Indeed!

> This here is about watchdog, not timeline.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-29 18:16       ` Jason Ekstrand
@ 2021-04-29 18:56         ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 18:56 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> >
> > I think we should have a FIXME here of not allowing this on some future
> > platforms because just use CTX_CREATE_EXT.
> 
> Done.
> 
> > > +     if (ret == -ENOTSUPP) {
> > > +             /* Some params, specifically SSEU, can only be set on fully
> >
> > I think this needs a FIXME: that this only holds during the conversion?
> > Otherwise we kinda have a bit a problem me thinks ...
> 
> I'm not sure what you mean by that.

Well I'm at least assuming that we won't have this case anymore, i.e.
there's only two kinds of parameters:
- those which are valid only on proto context
- those which are valid on both (like priority)

This SSEU thing looks like a 3rd parameter, which is only valid on
finalized context. That feels all kinds of wrong. Will it stay? If yes
*ugh* and why?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-29 18:56         ` Daniel Vetter
@ 2021-04-29 19:01           ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 19:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > >
> > > I think we should have a FIXME here of not allowing this on some future
> > > platforms because just use CTX_CREATE_EXT.
> >
> > Done.
> >
> > > > +     if (ret == -ENOTSUPP) {
> > > > +             /* Some params, specifically SSEU, can only be set on fully
> > >
> > > I think this needs a FIXME: that this only holds during the conversion?
> > > Otherwise we kinda have a bit a problem me thinks ...
> >
> > I'm not sure what you mean by that.
>
> Well I'm at least assuming that we won't have this case anymore, i.e.
> there's only two kinds of parameters:
> - those which are valid only on proto context
> - those which are valid on both (like priority)
>
> This SSEU thing looks like a 3rd parameter, which is only valid on
> finalized context. That feels all kinds of wrong. Will it stay? If yes
> *ugh* and why?

Because I was being lazy.  The SSEU stuff is a fairly complex param to
parse and it's always set live.  I can factor out the SSEU parsing
code if you want and it shouldn't be too bad in the end.

--Jason

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-29 19:01           ` Jason Ekstrand
@ 2021-04-29 19:07             ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-29 19:07 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > >
> > > > I think we should have a FIXME here of not allowing this on some future
> > > > platforms because just use CTX_CREATE_EXT.
> > >
> > > Done.
> > >
> > > > > +     if (ret == -ENOTSUPP) {
> > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > >
> > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > Otherwise we kinda have a bit a problem me thinks ...
> > >
> > > I'm not sure what you mean by that.
> >
> > Well, I'm at least assuming that we won't have this case anymore, i.e.
> > there's only two kinds of parameters:
> > - those which are valid only on proto context
> > - those which are valid on both (like priority)
> >
> > This SSEU thing looks like a 3rd parameter, which is only valid on
> > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > *ugh* and why?
> 
> Because I was being lazy.  The SSEU stuff is a fairly complex param to
> parse and it's always set live.  I can factor out the SSEU parsing
> code if you want and it shouldn't be too bad in the end.

Yeah I think the special case here is a bit too jarring.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-29  8:01               ` Tvrtko Ursulin
@ 2021-04-29 19:16                 ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 19:16 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 3:01 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 28/04/2021 18:09, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 9:26 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >> On 28/04/2021 15:02, Daniel Vetter wrote:
> >>> On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
> >>>>
> >>>> On 28/04/2021 11:16, Daniel Vetter wrote:
> >>>>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
> >>>>>> There's no sense in allowing userspace to create more engines than it
> >>>>>> can possibly access via execbuf.
> >>>>>>
> >>>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> >>>>>> ---
> >>>>>>     drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
> >>>>>>     1 file changed, 3 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>> index 5f8d0faf783aa..ecb3bf5369857 100644
> >>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
> >>>>>>                     return -EINVAL;
> >>>>>>             }
> >>>>>> -  /*
> >>>>>> -   * Note that I915_EXEC_RING_MASK limits execbuf to only using the
> >>>>>> -   * first 64 engines defined here.
> >>>>>> -   */
> >>>>>>             num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> >>>>>
> >>>>> Maybe add a comment like /* RING_MASK has no shift, so can be used
> >>>>> directly here */ since I had to check that :-)
> >>>>>
> >>>>> Same story about igt testcases needed, just to be sure.
> >>>>>
> >>>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>
> >>>> I am not sure about the churn vs benefit ratio here. There are also patches
> >>>> which extend the engine selection field in execbuf2 over the unused
> >>>> constants bits (with an explicit flag). So churn upstream and churn in
> >>>> internal (if interesting) for not much benefit.
> >>>
> >>> This isn't churn.
> >>>
> >>> This is "lock down uapi properly".
> >
> > Pretty much.
>
> Still haven't heard what concrete problems it solves.
>
> >> IMO it is a "meh" patch. Doesn't fix any problems and will create work
> >> for other people and man hours spent which no one will ever properly
> >> account against.
> >>
> >> Number of contexts in the engine map should not really be tied to
> >> execbuf2. As is demonstrated by the incoming work to address more than
> >> 63 engines, either as an extension to execbuf2 or future execbuf3.
> >
> > Which userspace driver has requested more than 64 engines in a single context?
>
> No need to artificially limit hardware capabilities in the uapi by
> implementing a policy in the kernel. Which will need to be
> removed/changed shortly anyway. This particular patch is work and
> creates more work (which other people who will get to fix the fallout
> will spend man hours to figure out what and why broke) for no benefit.
> Or you have yet to explain what the benefit is in concrete terms.

You keep complaining about how much work it takes and yet I've spent
more time replying to your e-mails on this patch than I spent writing
the patch and the IGT test.  Also, if it takes so much time to add a
restriction, then why are we spending time figuring out how to modify
the uAPI to allow you to execbuf on a context with more than 64
engines?  If we're worried about engineering man-hours, then limiting
to 64 IS the pragmatic solution.

> Why don't you limit it to number of physical engines then? Why don't you
> filter out duplicates? Why not limit the number of buffer objects per
> client or global based on available RAM + swap relative to minimum
> object size? Reductio ad absurdum, yes, but illustrating, in this
> case, the thin line between "locking down uapi" and adding too much policy
> where it is not appropriate.

All this patch does is say that you're not allowed to create a
context with more engines than the execbuf API will let you use.  We
already have an artificial limit.  All this does is push the error
handling further up the stack.  If someone comes up with a mechanism
to execbuf on engine 65 (they'd better have an open-source user if it
involves changing API), I'm very happy for them to bump this limit at
the same time.  It'll take them 5 minutes and it'll be something they
find while writing the IGT test.

> > Also, for execbuf3, I'd like to get rid of contexts entirely and have
> > engines be their own userspace-visible object.  If we go this
> > direction, you can have UINT32_MAX of them.  Problem solved.
>
> Not the problem I am pointing at though.

You listed two ways that accessing engine 65 can happen: Extending
execbuf2 and adding a new execbuf3.  When/if execbuf3 happens, as I
pointed out above, it'll hopefully be a non-issue.  If someone extends
execbuf2 to support more than 64 engines and does not have a userspace
customer that wants said new API change, I will NAK the patch.  If
you've got a 3rd way that someone can get at engine 65 such that this
is a problem, I'd love to hear about it.

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-29 19:07             ` Daniel Vetter
@ 2021-04-29 21:35               ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-29 21:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > > >
> > > > > I think we should have a FIXME here of not allowing this on some future
> > > > > platforms because just use CTX_CREATE_EXT.
> > > >
> > > > Done.
> > > >
> > > > > > +     if (ret == -ENOTSUPP) {
> > > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > > >
> > > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > > Otherwise we kinda have a bit a problem me thinks ...
> > > >
> > > > I'm not sure what you mean by that.
> > >
> > > Well, I'm at least assuming that we won't have this case anymore, i.e.
> > > there's only two kinds of parameters:
> > > - those which are valid only on proto context
> > > - those which are valid on both (like priority)
> > >
> > > This SSEU thing looks like a 3rd parameter, which is only valid on
> > > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > > *ugh* and why?
> >
> > Because I was being lazy.  The SSEU stuff is a fairly complex param to
> > parse and it's always set live.  I can factor out the SSEU parsing
> > code if you want and it shouldn't be too bad in the end.
>
> Yeah I think the special case here is a bit too jarring.

I rolled a v5 that allows you to set SSEU as a create param.  I'm not
a huge fan of that much code duplication for the SSEU set but I guess
that's what we get for deciding to "unify" our context creation
parameter path with our on-the-fly parameter path....

You can look at it here:

https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588

I'm also going to send it to trybot.

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-29 12:14                   ` Daniel Vetter
@ 2021-04-30  4:03                     ` Matthew Brost
  -1 siblings, 0 replies; 226+ messages in thread
From: Matthew Brost @ 2021-04-30  4:03 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers, Jason Ekstrand

On Thu, Apr 29, 2021 at 02:14:19PM +0200, Daniel Vetter wrote:
> On Wed, Apr 28, 2021 at 01:17:27PM -0500, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > > > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > > Jumping on here mid-thread. For what it is worth, to make execlists work
> > > > > with the upcoming parallel submission extension I leveraged some of the
> > > > > existing bonding code so I wouldn't be too eager to delete this code
> > > > > until that lands.
> > > >
> > > > Mind being a bit more specific about that?  The motivation for this
> > > > patch is that the current bonding handling and uAPI is, well, very odd
> > > > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > > > Instead you create engines and then bond them together after the fact.
> > > > I didn't want to blindly duplicate those oddities with the proto-ctx
> > > > stuff unless they were useful.  With parallel submit, I would expect
> > > > we want a more explicit API where you specify a set of engine
> > > > class/instance pairs to bond together into a single engine similar to
> > > > how the current balancing API works.
> > > >
> > > > Of course, that's all focused on the API and not the internals.  But,
> > > > again, I'm not sure how we want things to look internally.  What we've
> > > > got now doesn't seem great for the GuC submission model but I'm very
> > > > much not the expert there.  I don't want to be working at cross
> > > > purposes to you and I'm happy to leave bits if you think they're
> > > > useful.  But I thought I was clearing things away so that you can put
> > > > in what you actually want for GuC/parallel submit.
> > > >
> > >
> > > Removing all the UAPI things is fine, but I wouldn't delete some of the
> > > internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> > > intel_context_ops, the hook for a submit fence, etc...) as that will
> > > still likely be used for the new parallel submission interface with
> > > execlists. As you say, the new UAPI won't allow crazy configurations,
> > > only simple ones.
> > 
> > I'm fine with leaving some of the internal bits for a little while if
> > it makes pulling the GuC scheduler in easier.  I'm just a bit
> > skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
> > thoughts?
> 
> Yeah I'm also wondering why we need this. Essentially your insight (and
> Tony Ye from media team confirmed) is that media umd never uses bonded on
> virtual engines.
>

Well, you should use virtual engines with the parallel submission interface
if you are using it correctly.

E.g. you want a 2-wide parallel submission and there are 4 engine
instances.

You'd create 2 VEs:

A: 0, 2
B: 1, 3
set_parallel

For GuC submission we just configure the context and the GuC load
balances it.

For execlists we'd need to create bonds.

Also, likely the reason virtual engines weren't used with the old
interface is that we only had 2 instances max per class, so there was no
need for virtual engines. In my example above, if they were using the
interface correctly, they would have to use virtual engines too.
 
> So the only thing we need is the await_fence submit_fence logic to stall
> the subsequent patches just long enough. I think that stays.
>

My implementation of the new parallel submission interface with
execlists used bonds + priority boosts to ensure both are present at
the same time. This was used for both non-virtual and virtual engines.
It was never reviewed, though, and the code died on the list.

> All the additional logic with the cmpxchg lockless trickery and all that
> isn't needed, because we _never_ have to select an engine for bonded
> submission: It's always the single one available.
> 
> This would mean that for execlist parallel submit we can apply a
> limitation (beyond what GuC supports perhaps) and it's all ok. With that
> everything except the submit fence await logic itself can go I think.
> 
> Also one for Matt: We decided to ZBB implementing parallel submit on
> execlist, it's going to be just for GuC. At least until someone starts
> screaming really loudly.

If this is the case, then bonds can be deleted.

Matt

> 
> Cheers, Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-29 21:35               ` Jason Ekstrand
@ 2021-04-30  6:53                 ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-30  6:53 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > > > >
> > > > > > I think we should have a FIXME here of not allowing this on some future
> > > > > > platforms because just use CTX_CREATE_EXT.
> > > > >
> > > > > Done.
> > > > >
> > > > > > > +     if (ret == -ENOTSUPP) {
> > > > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > > > >
> > > > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > > > Otherwise we kinda have a bit of a problem methinks ...
> > > > >
> > > > > I'm not sure what you mean by that.
> > > >
> > > > Well, I'm at least assuming that we won't have this case anymore, i.e.
> > > > there are only two kinds of parameters:
> > > > - those which are valid only on a proto context
> > > > - those which are valid on both (like priority)
> > > >
> > > > This SSEU thing looks like a 3rd parameter, which is only valid on
> > > > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > > > *ugh* and why?
> > >
> > > Because I was being lazy.  The SSEU stuff is a fairly complex param to
> > > parse and it's always set live.  I can factor out the SSEU parsing
> > > code if you want and it shouldn't be too bad in the end.
> >
> > Yeah I think the special case here is a bit too jarring.
>
> I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> a huge fan of that much code duplication for the SSEU set but I guess
> that's what we get for deciding to "unify" our context creation
> parameter path with our on-the-fly parameter path....
>
> You can look at it here:
>
> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588

Hm yeah the duplication of the render engine check is a bit annoying.
What's worse, if you throw another set_engines on top it's probably
all wrong then. The old thing solved that by just throwing that
intel_context away.

You're also not keeping the engine id in the proto ctx for this, so
there are probably some gaps there. We'd need to clear the SSEU if
userspace puts another engine there. But also, no userspace does that.

Plus a cursory review of userspace shows:
- mesa doesn't set this
- compute sets it right before running the batch
- media sets it as the last thing of context creation

So it's kinda not needed. But we're also asking UMDs to switch over to
CTX_CREATE_EXT, and if sseu doesn't work there the media team will be
puzzled. And we've confused them enough already with our uapis.

Another idea: proto_set_sseu just stores the uapi struct and a note
that it's set, and checks nothing. To validate sseu on proto context
we do (but only when an sseu parameter is set):
1. finalize the context
2. call the real set_sseu for validation
3. throw the finalized context away again, it was just for validating
the overall thing

That way we don't have to consider all the interactions of setting
sseu and engines in any order on a proto context, and the validation
code is guaranteed to be shared. The only downside is a slight change
in behaviour: setting SSEU and then setting another engine in that
slot will fail instead of throwing the sseu parameters away. That's
the right thing for CTX_CREATE_EXT anyway, and current userspace
doesn't care.
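The three steps above can be sketched in self-contained C. Every type and function name below is an invented stand-in, not the real i915 code, and "finalize the context" is mocked as a plain allocation; only the shape of the validate-and-discard idea is real:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Invented stand-ins for the i915 proto-context and the SSEU uapi struct. */
struct sseu_param { unsigned int slice_mask; };

struct proto_context {
	struct sseu_param sseu; /* stored verbatim, unchecked */
	bool sseu_set;          /* "a note that it's set" */
};

/* proto_set_sseu just stores the uapi struct and checks nothing. */
static void proto_set_sseu(struct proto_context *pc,
			   const struct sseu_param *arg)
{
	pc->sseu = *arg;
	pc->sseu_set = true;
}

/* Stand-in for the real set_sseu on a finalized context; here it
 * only rejects an empty slice mask. */
static int real_set_sseu(const struct sseu_param *arg)
{
	return arg->slice_mask ? 0 : -EINVAL;
}

/* Validate a proto-context's SSEU by finalizing a throwaway context,
 * running the real checks against it, then discarding it again. */
static int proto_validate_sseu(const struct proto_context *pc)
{
	void *throwaway_ctx;
	int ret;

	if (!pc->sseu_set)
		return 0;               /* nothing to check */

	throwaway_ctx = malloc(1);      /* 1. finalize the context      */
	ret = real_set_sseu(&pc->sseu); /* 2. real set_sseu validates   */
	free(throwaway_ctx);            /* 3. throw the context away    */
	return ret;
}
```

The point of the mock is only that the validation path is the single real one, so proto and finalized contexts can never disagree on what SSEU values are legal.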

Thoughts?

> I'm also going to send it to trybot.

If you resend, please include all my r-b; I think some got lost in v4.
Also, in the kernel at least we expect a minimal commit message with a
bit of context. There's no Part-of: link pointing at the entire MR
with overview and discussion, and the patchwork Link: we add is a
pretty bad substitute. Some of the new patches in v4 are a bit too
terse on that.

And finally I'm still not a big fan of the add/remove split over
patches, but oh well.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-30  4:03                     ` Matthew Brost
@ 2021-04-30 10:11                       ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-30 10:11 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Maling list - DRI developers, Intel GFX, Jason Ekstrand

On Thu, Apr 29, 2021 at 09:03:48PM -0700, Matthew Brost wrote:
> On Thu, Apr 29, 2021 at 02:14:19PM +0200, Daniel Vetter wrote:
> > On Wed, Apr 28, 2021 at 01:17:27PM -0500, Jason Ekstrand wrote:
> > > On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > >
> > > > On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > > > > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > > > Jumping on here mid-thread. For what it is worth, to make execlists work
> > > > > > with the upcoming parallel submission extension I leveraged some of the
> > > > > > existing bonding code, so I wouldn't be too eager to delete this code
> > > > > > until that lands.
> > > > >
> > > > > Mind being a bit more specific about that?  The motivation for this
> > > > > patch is that the current bonding handling and uAPI is, well, very odd
> > > > > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > > > > Instead you create engines and then bond them together after the fact.
> > > > > I didn't want to blindly duplicate those oddities with the proto-ctx
> > > > > stuff unless they were useful.  With parallel submit, I would expect
> > > > > we want a more explicit API where you specify a set of engine
> > > > > class/instance pairs to bond together into a single engine similar to
> > > > > how the current balancing API works.
> > > > >
> > > > > Of course, that's all focused on the API and not the internals.  But,
> > > > > again, I'm not sure how we want things to look internally.  What we've
> > > > > got now doesn't seem great for the GuC submission model but I'm very
> > > > > much not the expert there.  I don't want to be working at cross
> > > > > purposes to you and I'm happy to leave bits if you think they're
> > > > > useful.  But I thought I was clearing things away so that you can put
> > > > > in what you actually want for GuC/parallel submit.
> > > > >
> > > >
> > > > Removing all the UAPI things is fine, but I wouldn't delete some of the
> > > > internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> > > > intel_context_ops, the hook for a submit fence, etc...) as that will
> > > > still likely be used for the new parallel submission interface with
> > > > execlists. As you say, the new UAPI won't allow crazy configurations,
> > > > only simple ones.
> > > 
> > > I'm fine with leaving some of the internal bits for a little while if
> > > it makes pulling the GuC scheduler in easier.  I'm just a bit
> > > skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
> > > thoughts?
> > 
> > Yeah I'm also wondering why we need this. Essentially your insight (and
> > Tony Ye from media team confirmed) is that media umd never uses bonded on
> > virtual engines.
> >
> 
> Well, you should use virtual engines with the parallel submission
> interface if you are using it correctly.
> 
> e.g. You want a 2 wide parallel submission and there are 4 engine
> instances.
> 
> You'd create 2 VEs:
> 
> A: 0, 2
> B: 1, 3
> set_parallel
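[A toy model of the quoted configuration, added for illustration; nothing here is the real i915 uapi. Slot i of a WIDTH-wide parallel context gets a virtual engine spanning instances i, i + WIDTH, ..., so the two halves of a submission never contend for the same physical engine:]

```c
#define WIDTH     2 /* parallel submission width */
#define INSTANCES 4 /* physical engine instances in the class */

struct virtual_engine {
	int instances[INSTANCES]; /* physical instances this VE may use */
	int count;
};

/* Build the virtual engine for one slot of the parallel context:
 * it load-balances over instances slot, slot + WIDTH, ... */
static struct virtual_engine make_slot_ve(int slot)
{
	struct virtual_engine ve = { .count = 0 };
	int inst;

	for (inst = slot; inst < INSTANCES; inst += WIDTH)
		ve.instances[ve.count++] = inst;
	return ve;
}
```

[Here make_slot_ve(0) gives VE A = {0, 2} and make_slot_ve(1) gives VE B = {1, 3}, matching the example; a set_parallel step would then tie the two slots together.]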

So tbh I'm not really liking this part. At least my understanding is that
with GuC this is really one overall virtual engine, backed by a multi-LRC.

So it should fill one engine slot, not fill multiple virtual engines and
then be an awkward thing wrapped on top.

I think (but maybe my understanding of GuC and the parallel submit execbuf
interface is wrong) that the parallel engine should occupy a single VE
slot, not require additional VE just for fun (maybe the execlist backend
would require that internally, but that should not leak into the higher
levels, much less the uapi). And you submit your multi-batch execbuf on
that single parallel VE, which then gets passed to GuC as a multi-LRC.
Internally in the backend there's a bit of fan-out to put the right
MI_BB_START into the right rings and all that, but again I think that
should be backend concerns.

Or am I missing something big here?

> For GuC submission we just configure context and the GuC load balances
> it.
> 
> For execlists we'd need to create bonds.
> 
> Also, likely the reason virtual engines weren't used with the old
> interface was that we only had 2 instances max per class, so there was
> no need for virtual engines. In my example above, if they were using
> the interface correctly they would have to use virtual engines too.

They do actually use virtual engines, it's just the virtual engine only
contains a single one, and internally i915 folds that into the hw engine
directly. So we can take away the entire implementation complexity.

Also, I still think we shouldn't bother trying to enable parallel
submit for execlists. Or at least only as a last resort, if there's no
other reasonable option.

> > So the only thing we need is the await_fence submit_fence logic to stall
> > the subsequent batches just long enough. I think that stays.
> >
> 
> My implementation of the new parallel submission interface for
> execlists used bonds + priority boosts to ensure both are present at
> the same time. This was used for both non-virtual and virtual engines.
> It was never reviewed, though, and the code died on the list.

:-(

> > All the additional logic with the cmpxchg lockless trickery and all that
> > isn't needed, because we _never_ have to select an engine for bonded
> > submission: It's always the single one available.
> > 
> > This would mean that for execlist parallel submit we can apply a
> > limitation (beyond what GuC supports perhaps) and it's all ok. With that
> > everything except the submit fence await logic itself can go I think.
> > 
> > Also one for Matt: We decided to ZBB implementing parallel submit on
> > execlist, it's going to be just for GuC. At least until someone starts
> > screaming really loudly.
> 
> If this is the case, then bonds can be deleted.

Yeah that's the goal we're aiming for.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-29 14:54           ` Jason Ekstrand
@ 2021-04-30 11:18             ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-30 11:18 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers


On 29/04/2021 15:54, Jason Ekstrand wrote:
> On Thu, Apr 29, 2021 at 3:04 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 28/04/2021 18:24, Jason Ekstrand wrote:
>>> On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>> On 23/04/2021 23:31, Jason Ekstrand wrote:
>>>>> Instead of handling it like a context param, unconditionally set it when
>>>>> intel_contexts are created.  This doesn't fix anything but does simplify
>>>>> the code a bit.
>>>>>
>>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
>>>>>     .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
>>>>>     drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
>>>>>     3 files changed, 6 insertions(+), 44 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>> index 35bcdeddfbf3f..1091cc04a242a 100644
>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
>>>>>             intel_engine_has_timeslices(ce->engine))
>>>>>                 __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
>>>>>
>>>>> -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
>>>>> +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
>>>>> +         ctx->i915->params.request_timeout_ms) {
>>>>> +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
>>>>> +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
>>>>
>>>> Blank line between declarations and code please, or just lose the local.
>>>>
>>>> Otherwise looks okay. Slight change that same GEM context can now have a
>>>> mix of different request expirations isn't interesting I think. At least
>>>> the change goes away by the end of the series.
>>>
>>> In order for that to happen, I think you'd have to have a race between
>>> CREATE_CONTEXT and someone smashing the request_timeout_ms param via
>>> sysfs.  Or am I missing something?  Given that timeouts are really
>>> per-engine anyway, I don't think we need to care too much about that.
>>
>> We don't care, no.
>>
>> For completeness only - by the end of the series it is what you say. But
>> at _this_ point in the series though it is if modparam changes at any
>> point between context create and replacing engines. Which is a change
>> compared to before this patch, since modparam was cached in the GEM
>> context so far. So one GEM context was a single request_timeout_ms.
> 
> I've added the following to the commit message:
> 
> It also means that sync files exported from different engines on a
> SINGLE_TIMELINE context will have different fence contexts.  This is
> visible to userspace if it looks at the obj_name field of
> sync_fence_info.
> 
> How's that sound?

Wrong thread but sounds good.

I haven't looked into the fence merge logic, apart from noticing that
the fence context is used there. So I'd suggest a quick look there on
top, just to make sure the merging logic doesn't hold any surprises if
contexts start to differ. Probably it just results in more
inefficiency somewhere, in theory.
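[As a rough illustration of why differing fence contexts mostly cost efficiency when merging; this is a deliberate simplification, not the kernel's dma_fence code. Fences sharing a context collapse to the latest seqno, while fences from distinct contexts must all be kept, so the merged sync file simply grows:]

```c
/* Simplified fence-merge model: deduplicate by context, keeping the
 * highest seqno; fences from distinct contexts always survive. */
struct toy_fence { unsigned int ctx; unsigned int seqno; };

static int toy_merge(const struct toy_fence *in, int n,
		     struct toy_fence *out)
{
	int m = 0, i, j;

	for (i = 0; i < n; i++) {
		for (j = 0; j < m; j++) {
			if (out[j].ctx == in[i].ctx) {
				/* same context: keep only the later fence */
				if (in[i].seqno > out[j].seqno)
					out[j].seqno = in[i].seqno;
				break;
			}
		}
		if (j == m)
			out[m++] = in[i]; /* new context: must be kept */
	}
	return m; /* fences the merged sync file has to track */
}
```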

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-29 19:16                 ` Jason Ekstrand
@ 2021-04-30 11:40                   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-30 11:40 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers


On 29/04/2021 20:16, Jason Ekstrand wrote:
> On Thu, Apr 29, 2021 at 3:01 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> On 28/04/2021 18:09, Jason Ekstrand wrote:
>>> On Wed, Apr 28, 2021 at 9:26 AM Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>> On 28/04/2021 15:02, Daniel Vetter wrote:
>>>>> On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 28/04/2021 11:16, Daniel Vetter wrote:
>>>>>>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
>>>>>>>> There's no sense in allowing userspace to create more engines than it
>>>>>>>> can possibly access via execbuf.
>>>>>>>>
>>>>>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>>>>>>> ---
>>>>>>>>      drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
>>>>>>>>      1 file changed, 3 insertions(+), 4 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>>>>> index 5f8d0faf783aa..ecb3bf5369857 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>>>>>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
>>>>>>>>                      return -EINVAL;
>>>>>>>>              }
>>>>>>>> -  /*
>>>>>>>> -   * Note that I915_EXEC_RING_MASK limits execbuf to only using the
>>>>>>>> -   * first 64 engines defined here.
>>>>>>>> -   */
>>>>>>>>              num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
>>>>>>>
>>>>>>> Maybe add a comment like /* RING_MASK has no shift, so can be used
>>>>>>> directly here */ since I had to check that :-)
>>>>>>>
>>>>>>> Same story about igt testcases needed, just to be sure.
>>>>>>>
>>>>>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
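[The check under discussion, sketched as standalone code with the suggested comment added. I915_EXEC_RING_MASK is 0x3f in the uapi header; since it occupies the low bits of the flags, the mask value itself is the largest addressable engine index. The function name is invented for the sketch:]

```c
#include <errno.h>
#include <stddef.h>

#define I915_EXEC_RING_MASK (0x3f) /* from include/uapi/drm/i915_drm.h */

/* Reject engine maps larger than what execbuf2 can address. */
static int check_num_engines(size_t num_engines)
{
	/* RING_MASK has no shift, so it can be used directly here */
	if (num_engines > I915_EXEC_RING_MASK + 1)
		return -EINVAL;
	return 0;
}
```

[So exactly 64 engines (indices 0..63) pass, and 65 or more are rejected at context creation instead of being silently unreachable from execbuf.]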
>>>>>>
>>>>>> I am not sure about the churn vs benefit ratio here. There are also patches
>>>>>> which extend the engine selection field in execbuf2 over the unused
>>>>>> constant bits (with an explicit flag). So churn upstream and churn
>>>>>> internally (if interesting) for not much benefit.
>>>>>
>>>>> This isn't churn.
>>>>>
>>>>> This is "lock done uapi properly".
>>>
>>> Pretty much.
>>
>> Still haven't heard what concrete problems it solves.
>>
>>>> IMO it is a "meh" patch. Doesn't fix any problems and will create work
>>>> for other people and man hours spent which no one will ever properly
>>>> account against.
>>>>
>>>> Number of engines in the engine map should not really be tied to
>>>> execbuf2. As is demonstrated by the incoming work to address more than
>>>> 63 engines, either as an extension to execbuf2 or future execbuf3.
>>>
>>> Which userspace driver has requested more than 64 engines in a single context?
>>
>> No need to artificially limit hardware capabilities in the uapi by
>> implementing a policy in the kernel. Which will need to be
>> removed/changed shortly anyway. This particular patch is work and
>> creates more work (which other people who will get to fix the fallout
>> will spend man hours to figure out what and why broke) for no benefit.
>> Or you are yet to explain what the benefit is in concrete terms.
> 
> You keep complaining about how much work it takes and yet I've spent
> more time replying to your e-mails on this patch than I spent writing
> the patch and the IGT test.  Also, if it takes so much time to add a
> restriction, then why are we spending time figuring out how to modify
> the uAPI to allow you to execbuf on a context with more than 64
> engines?  If we're worried about engineering man-hours, then limiting
> to 64 IS the pragmatic solution.

a)

The question of what problem the patch fixes is still unanswered.

b)

You miss the point. I'll continue in the next paragraph..

> 
>> Why don't you limit it to number of physical engines then? Why don't you
>> filter out duplicates? Why not limit the number of buffer objects per
>> client or global based on available RAM + swap relative to minimum
>> object size? Reductio ad absurdum, yes, but illustrating the, in this
>> case, thin line between "locking down uapi" and adding too much policy
>> where it is not appropriate.
> 
> All this patch does is say that you're not allowed to create a
> context with more engines than the execbuf API will let you use.  We
> already have an artificial limit.  All this does is push the error
> handling further up the stack.  If someone comes up with a mechanism
> to execbuf on engine 65 (they'd better have an open-source user if it
> involves changing API), I'm very happy for them to bump this limit at
> the same time.  It'll take them 5 minutes and it'll be something they
> find while writing the IGT test.

.. no it won't take five minutes.

If I need to spell everything out - you will put this patch in, which 
fixes nothing, and it will propagate to the internal kernel at some 
point. Then a bunch of tests will start failing in a strange manner. 
Which will result in people triaging them, then assigning them, then 
reserving machines, setting them up, running the repro, then digging 
into the code, and eventually figuring out what happened.

It will take hours not five minutes. And there will likely be multiple 
bug reports which most likely won't be joined, so multiple people will be 
doing multi-hour debug. All for nothing. So it is rather uninteresting 
how small the change is. The interesting part is how much pointless effort 
it will create across the organisation.

Of course you may not care that much about that side of things, or you 
are just not familiar with how it works in practice since you haven't been 
involved in recent years. I don't know really, but I have to raise the 
point that it makes no sense to do this. Cost vs benefit is simply not 
nearly there.

>>> Also, for execbuf3, I'd like to get rid of contexts entirely and have
>>> engines be their own userspace-visible object.  If we go this
>>> direction, you can have UINT32_MAX of them.  Problem solved.
>>
>> Not the problem I am pointing at though.
> 
> You listed two ways that accessing engine 65 can happen: Extending
> execbuf2 and adding a new execbuf3.  When/if execbuf3 happens, as I
> pointed out above, it'll hopefully be a non-issue.  If someone extends
> execbuf2 to support more than 64 engines and does not have a userspace
> customer that wants said new API change, I will NAK the patch.  If
> you've got a 3rd way that someone can get at engine 65 such that this
> is a problem, I'd love to hear about it.

It's ever so easy to take a black and white stance but the world is more 
like shades of grey. I too am totally perplexed why we have to spend 
time arguing over an inconsequential patch.

Context create is not called "create execbuf2 context", so why be so 
wedded to adding execbuf2 restrictions to it I have no idea. If you 
were fixing some vulnerability or something I'd understand, but all I've 
heard so far is along the lines of "This is proper locking down of uapi 
- end of". And an endless waste-of-time discussion follows. We don't have 
to agree on everything anyway and I have raised my concern enough times 
now. Up to you guys to re-figure out the cost benefit on your own then.

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread


* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30  6:53                 ` Daniel Vetter
@ 2021-04-30 11:58                   ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-30 11:58 UTC (permalink / raw)
  To: Daniel Vetter, Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers


On 30/04/2021 07:53, Daniel Vetter wrote:
> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>>
>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>
>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>> +     ret = set_proto_ctx_param(file_priv, pc, args);
>>>>>>>
>>>>>>> I think we should have a FIXME here of not allowing this on some future
>>>>>>> platforms because just use CTX_CREATE_EXT.
>>>>>>
>>>>>> Done.
>>>>>>
>>>>>>>> +     if (ret == -ENOTSUPP) {
>>>>>>>> +             /* Some params, specifically SSEU, can only be set on fully
>>>>>>>
>>>>>>> I think this needs a FIXME: that this only holds during the conversion?
>>>>>>> Otherwise we kinda have a bit a problem me thinks ...
>>>>>>
>>>>>> I'm not sure what you mean by that.
>>>>>
>>>>> Well I'm at least assuming that we won't have this case anymore, i.e.
>>>>> there's only two kinds of parameters:
>>>>> - those which are valid only on proto context
>>>>> - those which are valid on both (like priority)
>>>>>
>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on
>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes
>>>>> *ugh* and why?
>>>>
>>>> Because I was being lazy.  The SSEU stuff is a fairly complex param to
>>>> parse and it's always set live.  I can factor out the SSEU parsing
>>>> code if you want and it shouldn't be too bad in the end.
>>>
>>> Yeah I think the special case here is a bit too jarring.
>>
>> I rolled a v5 that allows you to set SSEU as a create param.  I'm not
>> a huge fan of that much code duplication for the SSEU set but I guess
>> that's what we get for deciding to "unify" our context creation
>> parameter path with our on-the-fly parameter path....
>>
>> You can look at it here:
>>
>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
> 
> Hm yeah the duplication of the render engine check is a bit annoying.
> What's worse, if you throw another set_engines on top it's probably
> all wrong then. The old thing solved that by just throwing that
> intel_context away.
> 
> You're also not keeping the engine id in the proto ctx for this, so
> there's probably some gaps there. We'd need to clear the SSEU if
> userspace puts another context there. But also no userspace does that.
> 
> Plus cursory review of userspace show
> - mesa doesn't set this
> - compute sets its right before running the batch
> - media sets it as the last thing of context creation

Noticed a long sub-thread so looked inside..

SSEU is really an interesting one.

For current userspace limiting to context creation is fine, since it is 
only allowed for the Icelake/VME use case. But if you notice the comment inside:

		/* ABI restriction - VME use case only. */

It is a hint there was, or could be, more to this uapi than that.

And from memory I think limiting to creation time will nip in the bud 
the hopes media had to use this dynamically on other platforms. So not 
that good really. They had convincing numbers showing what gets significantly 
better if we allowed dynamic control of this, just that, as always, open 
source userspace was not there so we never allowed it. However if you 
come up with a new world order where it can only be done at context 
creation, as said already, the possibility for that improvement (aka 
further improving the competitive advantage) is most likely dashed.

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread


* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30 11:58                   ` Tvrtko Ursulin
@ 2021-04-30 12:30                     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-30 12:30 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers, Jason Ekstrand

On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 30/04/2021 07:53, Daniel Vetter wrote:
> > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> >>
> >> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>
> >>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> >>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> >>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>>>>> +     ret = set_proto_ctx_param(file_priv, pc, args);
> >>>>>>>
> >>>>>>> I think we should have a FIXME here of not allowing this on some future
> >>>>>>> platforms because just use CTX_CREATE_EXT.
> >>>>>>
> >>>>>> Done.
> >>>>>>
> >>>>>>>> +     if (ret == -ENOTSUPP) {
> >>>>>>>> +             /* Some params, specifically SSEU, can only be set on fully
> >>>>>>>
> >>>>>>> I think this needs a FIXME: that this only holds during the conversion?
> >>>>>>> Otherwise we kinda have a bit a problem me thinks ...
> >>>>>>
> >>>>>> I'm not sure what you mean by that.
> >>>>>
> >>>>> Well I'm at least assuming that we wont have this case anymore, i.e.
> >>>>> there's only two kinds of parameters:
> >>>>> - those which are valid only on proto context
> >>>>> - those which are valid on both (like priority)
> >>>>>
> >>>>> This SSEU thing looks like a 3rd parameter, which is only valid on
> >>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes
> >>>>> *ugh* and why?
> >>>>
> >>>> Because I was being lazy.  The SSEU stuff is a fairly complex param to
> >>>> parse and it's always set live.  I can factor out the SSEU parsing
> >>>> code if you want and it shouldn't be too bad in the end.
> >>>
> >>> Yeah I think the special case here is a bit too jarring.
> >>
> >> I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> >> a huge fan of that much code duplication for the SSEU set but I guess
> >> that's what we get for deciding to "unify" our context creation
> >> parameter path with our on-the-fly parameter path....
> >>
> >> You can look at it here:
> >>
> >> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
> >
> > Hm yeah the duplication of the render engine check is a bit annoying.
> > What's worse, if you throw another set_engines on top it's probably
> > all wrong then. The old thing solved that by just throwing that
> > intel_context away.
> >
> > You're also not keeping the engine id in the proto ctx for this, so
> > there's probably some gaps there. We'd need to clear the SSEU if
> > userspace puts another context there. But also no userspace does that.
> >
> > Plus cursory review of userspace show
> > - mesa doesn't set this
> > - compute sets its right before running the batch
> > - media sets it as the last thing of context creation
>
> Noticed a long sub-thread so looked inside..
>
> SSEU is really an interesting one.
>
> For current userspace limiting to context creation is fine, since it is
> only allowed for Icelake/VME use case. But if you notice the comment inside:
>
>                 /* ABI restriction - VME use case only. */
>
> It is a hint there was, or could be, more to this uapi than that.
>
> And from memory I think limiting to creation time will nip the hopes
> media had to use this dynamically on other platforms in the bud. So not
> that good really. They had convincing numbers what gets significantly
> better if we allowed dynamic control to this, just that as always, open
> source userspace was not there so we never allowed it. However if you
> come up with a new world order where it can only be done at context
> creation, as said already, the possibility for that improvement (aka
> further improving the competitive advantage) is most likely dashed.

Hm are you sure that this is create-time only? media-driver uses it
like that, but from my checking compute-runtime updates SSEU mode
before every execbuf call. So it very much looked like we have to keep
this dynamic.

Or do you mean this is de facto dead code? By "this" I mean compute
setting it before every batch.
-Daniel




--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread


* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30 12:30                     ` Daniel Vetter
@ 2021-04-30 12:44                       ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-30 12:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers, Jason Ekstrand



On 30/04/2021 13:30, Daniel Vetter wrote:
> On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> On 30/04/2021 07:53, Daniel Vetter wrote:
>>> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>>>>
>>>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>
>>>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
>>>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
>>>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>>>> +     ret = set_proto_ctx_param(file_priv, pc, args);
>>>>>>>>>
>>>>>>>>> I think we should have a FIXME here of not allowing this on some future
>>>>>>>>> platforms because just use CTX_CREATE_EXT.
>>>>>>>>
>>>>>>>> Done.
>>>>>>>>
>>>>>>>>>> +     if (ret == -ENOTSUPP) {
>>>>>>>>>> +             /* Some params, specifically SSEU, can only be set on fully
>>>>>>>>>
>>>>>>>>> I think this needs a FIXME: that this only holds during the conversion?
>>>>>>>>> Otherwise we kinda have a bit a problem me thinks ...
>>>>>>>>
>>>>>>>> I'm not sure what you mean by that.
>>>>>>>
>>>>>>> Well I'm at least assuming that we won't have this case anymore, i.e.
>>>>>>> there's only two kinds of parameters:
>>>>>>> - those which are valid only on proto context
>>>>>>> - those which are valid on both (like priority)
>>>>>>>
>>>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on
>>>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes
>>>>>>> *ugh* and why?
>>>>>>
>>>>>> Because I was being lazy.  The SSEU stuff is a fairly complex param to
>>>>>> parse and it's always set live.  I can factor out the SSEU parsing
>>>>>> code if you want and it shouldn't be too bad in the end.
>>>>>
>>>>> Yeah I think the special case here is a bit too jarring.
>>>>
>>>> I rolled a v5 that allows you to set SSEU as a create param.  I'm not
>>>> a huge fan of that much code duplication for the SSEU set but I guess
>>>> that's what we get for deciding to "unify" our context creation
>>>> parameter path with our on-the-fly parameter path....
>>>>
>>>> You can look at it here:
>>>>
>>>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
>>>
>>> Hm yeah the duplication of the render engine check is a bit annoying.
>>> What's worse, if you throw another set_engines on top it's probably
>>> all wrong then. The old thing solved that by just throwing that
>>> intel_context away.
>>>
>>> You're also not keeping the engine id in the proto ctx for this, so
>>> there's probably some gaps there. We'd need to clear the SSEU if
>>> userspace puts another context there. But also no userspace does that.
>>>
>>> Plus a cursory review of userspace shows:
>>> - mesa doesn't set this
>>> - compute sets it right before running the batch
>>> - media sets it as the last thing of context creation
>>
>> Noticed a long sub-thread so I looked inside...
>>
>> SSEU is really an interesting one.
>>
>> For current userspace limiting to context creation is fine, since it is
>> only allowed for Icelake/VME use case. But if you notice the comment inside:
>>
>>                  /* ABI restriction - VME use case only. */
>>
>> It is a hint that there was, or could be, more to this uapi than that.
>>
>> And from memory I think limiting to creation time will nip in the bud
>> the hopes media had to use this dynamically on other platforms. So not
>> that good really. They had convincing numbers showing what gets
>> significantly better if we allowed dynamic control of this; it is just
>> that, as always, the open source userspace was not there, so we never
>> allowed it. However if you come up with a new world order where it can
>> only be done at context creation then, as said already, the possibility
>> for that improvement (aka further improving the competitive advantage)
>> is most likely dashed.
> 
> Hm are you sure that this is create-time only? media-driver uses it
> like that, but from my checking compute-runtime updates SSEU mode
> before every execbuf call. So it very much looked like we have to keep
> this dynamic.

Ah okay, I assumed it was more of the overall drive to eliminate 
set_param. If the SSEU set_param stays then it's fine for what I had in mind.

> Or do you mean this is de facto dead code? By "this" I mean compute
> setting it before every batch.

No idea, wasn't aware of the compute usage.

Before every execbuf is not very ideal though, since we have to inject a 
foreign context operation to update the context image, which means the 
stream of work belonging to the context cannot be coalesced (assuming it 
could be to start with). There is also a hw cost to reconfigure the 
SSEU, which adds latency on top.

Anyway, I was only aware of the current media usage, which is static as 
you say, and the future/wishlist media usage, which would be dynamic but 
a complicated story to get right (partly because the downsides mentioned 
in the previous paragraph mean balancing the benefit vs cost of dynamic 
SSEU is not easy).

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30 12:44                       ` Tvrtko Ursulin
@ 2021-04-30 13:07                         ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-30 13:07 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers, Jason Ekstrand

On Fri, Apr 30, 2021 at 2:44 PM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
>
> On 30/04/2021 13:30, Daniel Vetter wrote:
> > On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >> On 30/04/2021 07:53, Daniel Vetter wrote:
> >>> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> >>>>
> >>>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>>
> >>>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> >>>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> >>>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>>>>>>> +     ret = set_proto_ctx_param(file_priv, pc, args);
> >>>>>>>>>
> >>>>>>>>> I think we should have a FIXME here of not allowing this on some future
> >>>>>>>>> platforms because just use CTX_CREATE_EXT.
> >>>>>>>>
> >>>>>>>> Done.
> >>>>>>>>
> >>>>>>>>>> +     if (ret == -ENOTSUPP) {
> >>>>>>>>>> +             /* Some params, specifically SSEU, can only be set on fully
> >>>>>>>>>
> >>>>>>>>> I think this needs a FIXME: that this only holds during the conversion?
> >>>>>>>>> Otherwise we kinda have a bit a problem me thinks ...
> >>>>>>>>
> >>>>>>>> I'm not sure what you mean by that.
> >>>>>>>
> >>>>>>> Well I'm at least assuming that we won't have this case anymore, i.e.
> >>>>>>> there's only two kinds of parameters:
> >>>>>>> - those which are valid only on proto context
> >>>>>>> - those which are valid on both (like priority)
> >>>>>>>
> >>>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on
> >>>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes
> >>>>>>> *ugh* and why?
> >>>>>>
> >>>>>> Because I was being lazy.  The SSEU stuff is a fairly complex param to
> >>>>>> parse and it's always set live.  I can factor out the SSEU parsing
> >>>>>> code if you want and it shouldn't be too bad in the end.
> >>>>>
> >>>>> Yeah I think the special case here is a bit too jarring.
> >>>>
> >>>> I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> >>>> a huge fan of that much code duplication for the SSEU set but I guess
> >>>> that's what we get for deciding to "unify" our context creation
> >>>> parameter path with our on-the-fly parameter path....
> >>>>
> >>>> You can look at it here:
> >>>>
> >>>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
> >>>
> >>> Hm yeah the duplication of the render engine check is a bit annoying.
> >>> What's worse, if you throw another set_engines on top it's probably
> >>> all wrong then. The old thing solved that by just throwing that
> >>> intel_context away.
> >>>
> >>> You're also not keeping the engine id in the proto ctx for this, so
> >>> there's probably some gaps there. We'd need to clear the SSEU if
> >>> userspace puts another context there. But also no userspace does that.
> >>>
> >>> Plus a cursory review of userspace shows:
> >>> - mesa doesn't set this
> >>> - compute sets it right before running the batch
> >>> - media sets it as the last thing of context creation
> >>
> >> Noticed a long sub-thread so I looked inside...
> >>
> >> SSEU is really an interesting one.
> >>
> >> For current userspace limiting to context creation is fine, since it is
> >> only allowed for Icelake/VME use case. But if you notice the comment inside:
> >>
> >>                  /* ABI restriction - VME use case only. */
> >>
> >> It is a hint that there was, or could be, more to this uapi than that.
> >>
> >> And from memory I think limiting to creation time will nip in the bud
> >> the hopes media had to use this dynamically on other platforms. So not
> >> that good really. They had convincing numbers showing what gets
> >> significantly better if we allowed dynamic control of this; it is just
> >> that, as always, the open source userspace was not there, so we never
> >> allowed it. However if you come up with a new world order where it can
> >> only be done at context creation then, as said already, the possibility
> >> for that improvement (aka further improving the competitive advantage)
> >> is most likely dashed.
> >
> > Hm are you sure that this is create-time only? media-driver uses it
> > like that, but from my checking compute-runtime updates SSEU mode
> > before every execbuf call. So it very much looked like we have to keep
> > this dynamic.
>
> Ah okay, I assumed it was more of the overall drive to eliminate
> set_param. If the SSEU set_param stays then it's fine for what I had in mind.
>
> > Or do you mean this is de facto dead code? By "this" I mean compute
> > setting it before every batch.
>
> No idea, wasn't aware of the compute usage.
>
> Before every execbuf is not very ideal though, since we have to inject a
> foreign context operation to update the context image, which means the
> stream of work belonging to the context cannot be coalesced (assuming it
> could be to start with). There is also a hw cost to reconfigure the
> SSEU, which adds latency on top.

They filter out no-op changes. I just meant that, from a look at
compute-runtime, it seems like the SSEU can change whenever.
-Daniel

> Anyway, I was only aware of the current media usage, which is static as
> you say, and the future/wishlist media usage, which would be dynamic but
> a complicated story to get right (partly because the downsides mentioned
> in the previous paragraph mean balancing the benefit vs cost of dynamic
> SSEU is not easy).
>
> Regards,
>
> Tvrtko



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30 13:07                         ` Daniel Vetter
@ 2021-04-30 13:15                           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-30 13:15 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers, Jason Ekstrand


On 30/04/2021 14:07, Daniel Vetter wrote:
> On Fri, Apr 30, 2021 at 2:44 PM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> On 30/04/2021 13:30, Daniel Vetter wrote:
>>> On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>> On 30/04/2021 07:53, Daniel Vetter wrote:
>>>>> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>>>>>>
>>>>>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>
>>>>>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
>>>>>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
>>>>>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>>>>>> +     ret = set_proto_ctx_param(file_priv, pc, args);
>>>>>>>>>>>
>>>>>>>>>>> I think we should have a FIXME here of not allowing this on some future
>>>>>>>>>>> platforms because just use CTX_CREATE_EXT.
>>>>>>>>>>
>>>>>>>>>> Done.
>>>>>>>>>>
>>>>>>>>>>>> +     if (ret == -ENOTSUPP) {
>>>>>>>>>>>> +             /* Some params, specifically SSEU, can only be set on fully
>>>>>>>>>>>
>>>>>>>>>>> I think this needs a FIXME: that this only holds during the conversion?
>>>>>>>>>>> Otherwise we kinda have a bit a problem me thinks ...
>>>>>>>>>>
>>>>>>>>>> I'm not sure what you mean by that.
>>>>>>>>>
>>>>>>>>> Well I'm at least assuming that we won't have this case anymore, i.e.
>>>>>>>>> there's only two kinds of parameters:
>>>>>>>>> - those which are valid only on proto context
>>>>>>>>> - those which are valid on both (like priority)
>>>>>>>>>
>>>>>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on
>>>>>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes
>>>>>>>>> *ugh* and why?
>>>>>>>>
>>>>>>>> Because I was being lazy.  The SSEU stuff is a fairly complex param to
>>>>>>>> parse and it's always set live.  I can factor out the SSEU parsing
>>>>>>>> code if you want and it shouldn't be too bad in the end.
>>>>>>>
>>>>>>> Yeah I think the special case here is a bit too jarring.
>>>>>>
>>>>>> I rolled a v5 that allows you to set SSEU as a create param.  I'm not
>>>>>> a huge fan of that much code duplication for the SSEU set but I guess
>>>>>> that's what we get for deciding to "unify" our context creation
>>>>>> parameter path with our on-the-fly parameter path....
>>>>>>
>>>>>> You can look at it here:
>>>>>>
>>>>>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
>>>>>
>>>>> Hm yeah the duplication of the render engine check is a bit annoying.
>>>>> What's worse, if you throw another set_engines on top it's probably
>>>>> all wrong then. The old thing solved that by just throwing that
>>>>> intel_context away.
>>>>>
>>>>> You're also not keeping the engine id in the proto ctx for this, so
>>>>> there's probably some gaps there. We'd need to clear the SSEU if
>>>>> userspace puts another context there. But also no userspace does that.
>>>>>
>>>>> Plus a cursory review of userspace shows:
>>>>> - mesa doesn't set this
>>>>> - compute sets it right before running the batch
>>>>> - media sets it as the last thing of context creation
>>>>
>>>> Noticed a long sub-thread so I looked inside...
>>>>
>>>> SSEU is really an interesting one.
>>>>
>>>> For current userspace limiting to context creation is fine, since it is
>>>> only allowed for Icelake/VME use case. But if you notice the comment inside:
>>>>
>>>>                   /* ABI restriction - VME use case only. */
>>>>
>>>> It is a hint that there was, or could be, more to this uapi than that.
>>>>
>>>> And from memory I think limiting to creation time will nip in the bud
>>>> the hopes media had to use this dynamically on other platforms. So not
>>>> that good really. They had convincing numbers showing what gets
>>>> significantly better if we allowed dynamic control of this; it is just
>>>> that, as always, the open source userspace was not there, so we never
>>>> allowed it. However if you come up with a new world order where it can
>>>> only be done at context creation then, as said already, the possibility
>>>> for that improvement (aka further improving the competitive advantage)
>>>> is most likely dashed.
>>>
>>> Hm are you sure that this is create-time only? media-driver uses it
>>> like that, but from my checking compute-runtime updates SSEU mode
>>> before every execbuf call. So it very much looked like we have to keep
>>> this dynamic.
>>
>> Ah okay, I assumed it was more of the overall drive to eliminate
>> set_param. If the SSEU set_param stays then it's fine for what I had in mind.
>>
>>> Or do you mean this is de facto dead code? By "this" I mean compute
>>> setting it before every batch.
>>
>> No idea, wasn't aware of the compute usage.
>>
>> Before every execbuf is not very ideal though, since we have to inject a
>> foreign context operation to update the context image, which means the
>> stream of work belonging to the context cannot be coalesced (assuming it
>> could be to start with). There is also a hw cost to reconfigure the
>> SSEU, which adds latency on top.
> 
> They filter out no-op changes. I just meant that, from a look at
> compute-runtime, it seems like the SSEU can change whenever.

i915 filters them as well, for good measure - since the penalty is 
global we have to. So I guess they don't know when the VME block will be 
used, i.e. it is per batch.

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
@ 2021-04-30 13:15                           ` Tvrtko Ursulin
  0 siblings, 0 replies; 226+ messages in thread
From: Tvrtko Ursulin @ 2021-04-30 13:15 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers


On 30/04/2021 14:07, Daniel Vetter wrote:
> On Fri, Apr 30, 2021 at 2:44 PM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> On 30/04/2021 13:30, Daniel Vetter wrote:
>>> On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>> On 30/04/2021 07:53, Daniel Vetter wrote:
>>>>> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>>>>>>
>>>>>> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>
>>>>>>> On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
>>>>>>>> On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>>> On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
>>>>>>>>>> On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>>>>>> +     ret = set_proto_ctx_param(file_priv, pc, args);
>>>>>>>>>>>
>>>>>>>>>>> I think we should have a FIXME here of not allowing this on some future
>>>>>>>>>>> platforms because just use CTX_CREATE_EXT.
>>>>>>>>>>
>>>>>>>>>> Done.
>>>>>>>>>>
>>>>>>>>>>>> +     if (ret == -ENOTSUPP) {
>>>>>>>>>>>> +             /* Some params, specifically SSEU, can only be set on fully
>>>>>>>>>>>
>>>>>>>>>>> I think this needs a FIXME: that this only holds during the conversion?
>>>>>>>>>>> Otherwise we kinda have a bit a problem me thinks ...
>>>>>>>>>>
>>>>>>>>>> I'm not sure what you mean by that.
>>>>>>>>>
>>>>>>>>> Well I'm at least assuming that we wont have this case anymore, i.e.
>>>>>>>>> there's only two kinds of parameters:
>>>>>>>>> - those which are valid only on proto context
>>>>>>>>> - those which are valid on both (like priority)
>>>>>>>>>
>>>>>>>>> This SSEU thing looks like a 3rd parameter, which is only valid on
>>>>>>>>> finalized context. That feels all kinds of wrong. Will it stay? If yes
>>>>>>>>> *ugh* and why?
>>>>>>>>
>>>>>>>> Because I was being lazy.  The SSEU stuff is a fairly complex param to
>>>>>>>> parse and it's always set live.  I can factor out the SSEU parsing
>>>>>>>> code if you want and it shouldn't be too bad in the end.
>>>>>>>
>>>>>>> Yeah I think the special case here is a bit too jarring.
>>>>>>
>>>>>> I rolled a v5 that allows you to set SSEU as a create param.  I'm not
>>>>>> a huge fan of that much code duplication for the SSEU set but I guess
>>>>>> that's what we get for deciding to "unify" our context creation
>>>>>> parameter path with our on-the-fly parameter path....
>>>>>>
>>>>>> You can look at it here:
>>>>>>
>>>>>> https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
>>>>>
>>>>> Hm yeah the duplication of the render engine check is a bit annoying.
>>>>> What's worse, if you throw another set_engines on top it's probably
>>>>> all wrong then. The old thing solved that by just throwing that
>>>>> intel_context away.
>>>>>
>>>>> You're also not keeping the engine id in the proto ctx for this, so
>>>>> there's probably some gaps there. We'd need to clear the SSEU if
>>>>> userspace puts another context there. But also no userspace does that.
>>>>>
>>>>> Plus a cursory review of userspace shows:
>>>>> - mesa doesn't set this
>>>>> - compute sets its right before running the batch
>>>>> - media sets it as the last thing of context creation
>>>>
>>>> Noticed a long sub-thread so looked inside..
>>>>
>>>> SSEU is really an interesting one.
>>>>
>>>> For current userspace limiting to context creation is fine, since it is
>>>> only allowed for Icelake/VME use case. But if you notice the comment inside:
>>>>
>>>>                   /* ABI restriction - VME use case only. */
>>>>
>>>> It is a hint there was, or could be, more to this uapi than that.
>>>>
>>>> And from memory I think limiting to creation time will nip in the bud the
>>>> hopes media had of using this dynamically on other platforms. So not
>>>> that good really. They had convincing numbers what gets significantly
>>>> better if we allowed dynamic control to this, just that as always, open
>>>> source userspace was not there so we never allowed it. However if you
>>>> come up with a new world order where it can only be done at context
>>>> creation, as said already, the possibility for that improvement (aka
>>>> further improving the competitive advantage) is most likely dashed.
>>>
>>> Hm are you sure that this is create-time only? media-driver uses it
>>> like that, but from my checking compute-runtime updates SSEU mode
>>> before every execbuf call. So it very much looked like we have to keep
>>> this dynamic.
>>
>> Ah okay, I assumed it's more of the overall drive to eliminate
>> set_param. If sseu set_param stays then it's fine for what I had in mind.
>>
>>> Or do you mean this is defacto dead code? this = compute setting it
>>> before every batch I mean here.
>>
>> No idea, wasn't aware of the compute usage.
>>
>> Doing it before every execbuf is not ideal though, since we have to inject a
>> foreign context operation to update context image, which means stream of
>> work belonging to the context cannot be coalesced (assuming it could to
>> start with). There is also a hw cost to reconfigure the sseu which adds
>> latency on top.
> 
> They filter out no-op changes. I just meant that from looking at
> compute-runtime, it seems like sseu can change whenever.

i915 does it as well for good measure - since the penalty is global we 
have to. So I guess they don't know when the VME block will be used, i.e. it is 
per batch.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 03/21] drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem
  2021-04-30 11:18             ` Tvrtko Ursulin
@ 2021-04-30 15:35               ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-30 15:35 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Fri, Apr 30, 2021 at 6:18 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 29/04/2021 15:54, Jason Ekstrand wrote:
> > On Thu, Apr 29, 2021 at 3:04 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 28/04/2021 18:24, Jason Ekstrand wrote:
> >>> On Wed, Apr 28, 2021 at 10:55 AM Tvrtko Ursulin
> >>> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>> On 23/04/2021 23:31, Jason Ekstrand wrote:
> >>>>> Instead of handling it like a context param, unconditionally set it when
> >>>>> intel_contexts are created.  This doesn't fix anything but does simplify
> >>>>> the code a bit.
> >>>>>
> >>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> >>>>> ---
> >>>>>     drivers/gpu/drm/i915/gem/i915_gem_context.c   | 43 +++----------------
> >>>>>     .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 --
> >>>>>     drivers/gpu/drm/i915/gt/intel_context_param.h |  3 +-
> >>>>>     3 files changed, 6 insertions(+), 44 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>> index 35bcdeddfbf3f..1091cc04a242a 100644
> >>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>> @@ -233,7 +233,11 @@ static void intel_context_set_gem(struct intel_context *ce,
> >>>>>             intel_engine_has_timeslices(ce->engine))
> >>>>>                 __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
> >>>>>
> >>>>> -     intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
> >>>>> +     if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
> >>>>> +         ctx->i915->params.request_timeout_ms) {
> >>>>> +             unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
> >>>>> +             intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
> >>>>
> >>>> Blank line between declarations and code please, or just lose the local.
> >>>>
> >>>> Otherwise looks okay. The slight change that the same GEM context can now
> >>>> have a mix of different request expirations isn't interesting, I think. At
> >>>> least the change goes away by the end of the series.
> >>>
> >>> In order for that to happen, I think you'd have to have a race between
> >>> CREATE_CONTEXT and someone smashing the request_timeout_ms param via
> >>> sysfs.  Or am I missing something?  Given that timeouts are really
> >>> per-engine anyway, I don't think we need to care too much about that.
> >>
> >> We don't care, no.
> >>
> >> For completeness only - by the end of the series it is what you say. But
> >> at _this_ point in the series though it is if modparam changes at any
> >> point between context create and replacing engines. Which is a change
> >> compared to before this patch, since modparam was cached in the GEM
> >> context so far. So one GEM context was a single request_timeout_ms.
> >
> > I've added the following to the commit message:
> >
> > It also means that sync files exported from different engines on a
> > SINGLE_TIMELINE context will have different fence contexts.  This is
> > visible to userspace if it looks at the obj_name field of
> > sync_fence_info.
> >
> > How's that sound?
>
> Wrong thread but sounds good.
>
> I haven't looked into the fence merge logic apart from noticing context
> is used there. So I'd suggest a quick look there on top, just to make
> sure merging logic does not hold any surprises if contexts start to
> differ. Probably just results in more inefficiency somewhere, in theory.

Looked at it yesterday.  It really does just create a fence array with
all the fences. :-)

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines
  2021-04-30 11:40                   ` Tvrtko Ursulin
@ 2021-04-30 15:54                     ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-30 15:54 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel GFX, Maling list - DRI developers

On Fri, Apr 30, 2021 at 6:40 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 29/04/2021 20:16, Jason Ekstrand wrote:
> > On Thu, Apr 29, 2021 at 3:01 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >> On 28/04/2021 18:09, Jason Ekstrand wrote:
> >>> On Wed, Apr 28, 2021 at 9:26 AM Tvrtko Ursulin
> >>> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>> On 28/04/2021 15:02, Daniel Vetter wrote:
> >>>>> On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
> >>>>>>
> >>>>>> On 28/04/2021 11:16, Daniel Vetter wrote:
> >>>>>>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
> >>>>>>>> There's no sense in allowing userspace to create more engines than it
> >>>>>>>> can possibly access via execbuf.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> >>>>>>>> ---
> >>>>>>>>      drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
> >>>>>>>>      1 file changed, 3 insertions(+), 4 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>>>> index 5f8d0faf783aa..ecb3bf5369857 100644
> >>>>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
> >>>>>>>>                      return -EINVAL;
> >>>>>>>>              }
> >>>>>>>> -  /*
> >>>>>>>> -   * Note that I915_EXEC_RING_MASK limits execbuf to only using the
> >>>>>>>> -   * first 64 engines defined here.
> >>>>>>>> -   */
> >>>>>>>>              num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> >>>>>>>
> >>>>>>> Maybe add a comment like /* RING_MASK has no shift, so can be used
> >>>>>>> directly here */ since I had to check that :-)
> >>>>>>>
> >>>>>>> Same story about igt testcases needed, just to be sure.
> >>>>>>>
> >>>>>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>>>
> >>>>>> I am not sure about the churn vs benefit ratio here. There are also patches
> >>>>>> which extend the engine selection field in execbuf2 over the unused
> >>>>>> constants bits (with an explicit flag). So churn upstream and churn in
> >>>>>> internal (if interesting) for not much benefit.
> >>>>>
> >>>>> This isn't churn.
> >>>>>
> >>>>> This is "lock down uapi properly".
> >>>
> >>> Pretty much.
> >>
> >> Still haven't heard what concrete problems it solves.
> >>
> >>>> IMO it is a "meh" patch. Doesn't fix any problems and will create work
> >>>> for other people and man hours spent which no one will ever properly
> >>>> account against.
> >>>>
> >>>> Number of engines in the engine map should not really be tied to
> >>>> execbuf2. As is demonstrated by the incoming work to address more than
> >>>> 63 engines, either as an extension to execbuf2 or future execbuf3.
> >>>
> >>> Which userspace driver has requested more than 64 engines in a single context?
> >>
> >> No need to artificially limit hardware capabilities in the uapi by
> >> implementing a policy in the kernel. Which will need to be
> >> removed/changed shortly anyway. This particular patch is work and
> >> creates more work (which other people who will get to fix the fallout
> >> will spend man hours to figure out what and why broke) for no benefit.
> >> Or you are yet to explain what the benefit is in concrete terms.
> >
> > You keep complaining about how much work it takes and yet I've spent
> > more time replying to your e-mails on this patch than I spent writing
> > the patch and the IGT test.  Also, if it takes so much time to add a
> > restriction, then why are we spending time figuring out how to modify
> > the uAPI to allow you to execbuf on a context with more than 64
> > engines?  If we're worried about engineering man-hours, then limiting
> > to 64 IS the pragmatic solution.
>
> a)
>
> Question of what problem does the patch fix is still unanswered.
>
> b)
>
> You miss the point. I'll continue in the next paragraph..
>
> >
> >> Why don't you limit it to number of physical engines then? Why don't you
> >> filter out duplicates? Why not limit the number of buffer objects per
> >> client or global based on available RAM + swap relative to minimum
> >> object size? Reductio ad absurdum yes, but illustrating the, in this
> >> case, a thin line between "locking down uapi" and adding too much policy
> >> where it is not appropriate.
> >
> > All this patch does is say that you're not allowed to create a
> > context with more engines than the execbuf API will let you use.  We
> > already have an artificial limit.  All this does is push the error
> > handling further up the stack.  If someone comes up with a mechanism
> > to execbuf on engine 65 (they'd better have an open-source user if it
> > involves changing API), I'm very happy for them to bump this limit at
> > the same time.  It'll take them 5 minutes and it'll be something they
> > find while writing the IGT test.
>
> .. no it won't take five minutes.
>
> If I need to spell everything out - you will put this patch in, which
> fixes nothing, and it will propagate to the internal kernel at some
> point. Then a bunch of tests will start failing in a strange manner.
> Which will result in people triaging them, then assigning them, then
> reserving machines, setting them up, running the repro, then digging
> into the code, and eventually figuring out what happened.

So we have internal patches for more than 64 engines and corresponding
tests?  If so, I repeat the question I asked 3-4 e-mails ago, "What
userspace is requesting this?"  If it's some super-secret thing, feel
free to tell me via internal e-mail but I doubt it is.  If there is no
userspace requesting this and it's just kernel people saying "Ah!  We
should improve this API!" then the correct answer is that those
patches and corresponding tests should be deleted from DII.  It's
extra delta from upstream for no point.

> It will take hours not five minutes. And there will likely be multiple
> bug reports which most likely won't be joined so mutliple people will be
> doing multi hour debug. All for nothing. So it is rather uninteresting
> how small the change is. Interesting part is how much pointless effort
> it will create across the organisation.

Yes, "5 minutes" was a bit glib.  In practice, if this runs through
the usual triage process, it'll take someone somewhere a lot more
time.  However, if someone tries to pull this patch series into DII
and isn't pulling in the IGT changes ahead of time and carefully
looking at every patch and looking out for these issues, this is the
smallest of the problems it will cause.  Doesn't mean that this patch
won't cause additional work but in the grand scheme of things, it's
small.

> Of course you may not care that much about that side of things, or you
> are just not familiar in how it works in practice since you haven't been
> involved in the past years. I don't know really, but I have to raise the
> point it makes no sense to do this. Cost vs benefit is simply not nearly
> there.

I do care.  But, to a certain extent, some of that is just a cost we
have to pay.  For the last 2-3 years we've been off architecting in
the dark and building a giant internal tree with hundreds of patches
on top of upstream.  Some of that delta is necessary for new hardware.
Some of it could have been avoided had we done TTM earlier.  Some of
it is likely cases where someone did something just because it seemed
like a good idea and never bothered to try and upstream it.  Upstream
needs to be allowed to move forward, as unfettered as possible.  If
there wasn't a good reason to put it in DII in the first place, then
it existing in DII isn't a good reason to block upstream.

Again, if you can give me a use-case or a user, this whole
conversation ends.  If not, delete the patch from DII and we move on.

--Jason

> >>> Also, for execbuf3, I'd like to get rid of contexts entirely and have
> >>> engines be their own userspace-visible object.  If we go this
> >>> direction, you can have UINT32_MAX of them.  Problem solved.
> >>
> >> Not the problem I am pointing at though.
> >
> > You listed two ways that accessing engine 65 can happen: Extending
> > execbuf2 and adding a new execbuf3.  When/if execbuf3 happens, as I
> > pointed out above, it'll hopefully be a non-issue.  If someone extends
> > execbuf2 to support more than 64 engines and does not have a userspace
> > customer that wants said new API change, I will NAK the patch.  If
> > you've got a 3rd way that someone can get at engine 65 such that this
> > is a problem, I'd love to hear about it.
>
> It's ever so easy to take a black and white stance but the world is more
> like shades of grey. I too am totally perplexed why we have to spend
> time arguing about an inconsequential patch.
>
> Context create is not called "create execbuf2 context" so why be so
> wedded to adding execbuf2 restrictions into it I have no idea. If you
> were fixing some vulnerability or something I'd understand but all I've
> heard so far is along the lines of "This is proper locking down of uapi
> - end of". And endless waste of time discussion follows

To me, it's not just locking down the API.  It's defensive design and
moving the error condition further to the front.  If a client tries to
create a context with 65 engines and use engine 64, they will fail
today.  They won't fail when they create the context, they'll fail
when they try to execbuf on it.  Assuming, that is, that they actually
get a failure and not just a wrap-around to the wrong context.  By
moving the error earlier, we're doing the user a service by preventing
them from getting into a bad situation.

No, it's not called "execbuf2 context" but, given that execbuf2 is the
only way to do any work on a context, it might as well be.

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30  6:53                 ` Daniel Vetter
@ 2021-04-30 16:27                   ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-30 16:27 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> >
> > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > > > > >
> > > > > > > I think we should have a FIXME here of not allowing this on some future
> > > > > > > platforms because just use CTX_CREATE_EXT.
> > > > > >
> > > > > > Done.
> > > > > >
> > > > > > > > +     if (ret == -ENOTSUPP) {
> > > > > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > > > > >
> > > > > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > > > > Otherwise we kinda have a bit a problem me thinks ...
> > > > > >
> > > > > > I'm not sure what you mean by that.
> > > > >
> > > > > Well I'm at least assuming that we wont have this case anymore, i.e.
> > > > > there's only two kinds of parameters:
> > > > > - those which are valid only on proto context
> > > > > - those which are valid on both (like priority)
> > > > >
> > > > > This SSEU thing looks like a 3rd parameter, which is only valid on
> > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > > > > *ugh* and why?
> > > >
> > > > Because I was being lazy.  The SSEU stuff is a fairly complex param to
> > > > parse and it's always set live.  I can factor out the SSEU parsing
> > > > code if you want and it shouldn't be too bad in the end.
> > >
> > > Yeah I think the special case here is a bit too jarring.
> >
> > I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> > a huge fan of that much code duplication for the SSEU set but I guess
> > that's what we get for deciding to "unify" our context creation
> > parameter path with our on-the-fly parameter path....
> >
> > You can look at it here:
> >
> > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
>
> Hm yeah the duplication of the render engine check is a bit annoying.
> What's worse, if you throw another set_engines on top it's probably
> all wrong then. The old thing solved that by just throwing that
> intel_context away.

I think that's already mostly taken care of.  When set_engines
happens, we throw away the old array of engines and start with a new
one where everything has been memset to 0.  The one remaining problem
is that, if userspace resets the engine set, we need to memset
legacy_rcs_sseu to 0.  I've added that.

> You're also not keeping the engine id in the proto ctx for this, so
> there's probably some gaps there. We'd need to clear the SSEU if
> userspace puts another context there. But also no userspace does that.

Again, I think that's handled.  See above.

> Plus a cursory review of userspace shows
> - mesa doesn't set this
> - compute sets it right before running the batch
> - media sets it as the last thing of context creation
>
> So it's kinda not needed. But also we're asking umd to switch over to
> CTX_CREATE_EXT, and if sseu doesn't work for that media team will be
> puzzled. And we've confused them enough already with our uapis.
>
> Another idea: proto_set_sseu just stores the uapi struct and a note
> that it's set, and checks nothing. To validate sseu on proto context
> we do (but only when an sseu parameter is set):
> 1. finalize the context
> 2. call the real set_sseu for validation
> 3. throw the finalized context away again, it was just for validating
> the overall thing
>
> That way we don't have to consider all the interactions of setting
> sseu and engines in any order on proto context, validation code is
> guaranteed shared. Only downside is that there's a slight change in
> behaviour: SSEU, then setting another engine in that slot will fail
> instead of throwing the sseu parameters away. That's the right thing
> for CTX_CREATE_EXT anyway, and current userspace doesn't care.
>
> Thoughts?

I thought about that.  The problem is that they can set_sseu multiple
times on different engines.  This means we'd have to effectively build
up an arbitrary list of SSEU set operations and replay it.  I'm not
sure how I feel about building up a big data structure.

> > I'm also going to send it to trybot.
>
> If you resend pls include all my r-b, I think some got lost in v4.

I'll try and dig those up.

> Also, in the kernel at least we expect minimal commit message with a
> bit of context, there's no Part-of: link pointing at the entire MR
> with overview and discussion, the patchwork Link: we add is a pretty
> bad substitute. Some of the new patches in v4 are a bit too terse on
> that.

Yup.  I can try to expand things a bit more.

> And finally I'm still not a big fan of the add/remove split over
> patches, but oh well.

I'm not either, but working through all this reminded me of why I
didn't do it more gradually.  The problem is ordering.  If we add and
remove at the same time and do it one param at a time, we'll end up
with a situation in the middle where some params will only be allowed
to be set on the proto-ctx and others will force a proto-ctx ->
context conversion.  If, for instance, one UMD sets engines first and
then VMs and another sets VMs first and then engines, there's no way
to do a gradual transition without breaking one of them.  Also, we
need to handle basically all the setparam complexity in order to
handle creation structs and, again, those can come in any order.

I hate it, I just don't see another way. :-(

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30 16:27                   ` Jason Ekstrand
@ 2021-04-30 16:33                     ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-30 16:33 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > >
> > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > > > > > >
> > > > > > > > I think we should have a FIXME here of not allowing this on some future
> > > > > > > > platforms because just use CTX_CREATE_EXT.
> > > > > > >
> > > > > > > Done.
> > > > > > >
> > > > > > > > > +     if (ret == -ENOTSUPP) {
> > > > > > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > > > > > >
> > > > > > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > > > > > Otherwise we kinda have a bit a problem me thinks ...
> > > > > > >
> > > > > > > I'm not sure what you mean by that.
> > > > > >
> > > > > > Well I'm at least assuming that we wont have this case anymore, i.e.
> > > > > > there's only two kinds of parameters:
> > > > > > - those which are valid only on proto context
> > > > > > - those which are valid on both (like priority)
> > > > > >
> > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on
> > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > > > > > *ugh* and why?
> > > > >
> > > > > Because I was being lazy.  The SSEU stuff is a fairly complex param to
> > > > > parse and it's always set live.  I can factor out the SSEU parsing
> > > > > code if you want and it shouldn't be too bad in the end.
> > > >
> > > > Yeah I think the special case here is a bit too jarring.
> > >
> > > I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> > > a huge fan of that much code duplication for the SSEU set but I guess
> > > that's what we get for deciding to "unify" our context creation
> > > parameter path with our on-the-fly parameter path....
> > >
> > > You can look at it here:
> > >
> > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
> >
> > Hm yeah the duplication of the render engine check is a bit annoying.
> > What's worse, if you throw another set_engines on top it's probably
> > all wrong then. The old thing solved that by just throwing that
> > intel_context away.
>
> I think that's already mostly taken care of.  When set_engines
> happens, we throw away the old array of engines and start with a new
> one where everything has been memset to 0.  The one remaining problem
> is that, if userspace resets the engine set, we need to memset
> legacy_rcs_sseu to 0.  I've added that.
>
> > You're also not keeping the engine id in the proto ctx for this, so
> > there's probably some gaps there. We'd need to clear the SSEU if
> > userspace puts another context there. But also no userspace does that.
>
> Again, I think that's handled.  See above.
>
> > Plus a cursory review of userspace shows
> > - mesa doesn't set this
> > - compute sets it right before running the batch
> > - media sets it as the last thing of context creation
> >
> > So it's kinda not needed. But also we're asking umd to switch over to
> > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be
> > puzzled. And we've confused them enough already with our uapis.
> >
> > Another idea: proto_set_sseu just stores the uapi struct and a note
> > that it's set, and checks nothing. To validate sseu on proto context
> > we do (but only when an sseu parameter is set):
> > 1. finalize the context
> > 2. call the real set_sseu for validation
> > 3. throw the finalized context away again, it was just for validating
> > the overall thing
> >
> > That way we don't have to consider all the interactions of setting
> > sseu and engines in any order on proto context, validation code is
> > guaranteed shared. Only downside is that there's a slight change in
> > behaviour: SSEU, then setting another engine in that slot will fail
> > instead of throwing the sseu parameters away. That's the right thing
> > for CTX_CREATE_EXT anyway, and current userspace doesn't care.
> >
> > Thoughts?
>
> I thought about that.  The problem is that they can set_sseu multiple
> times on different engines.  This means we'd have to effectively build
> up an arbitrary list of SSEU set operations and replay it.  I'm not
> sure how I feel about building up a big data structure.

Hm, but how does this work with proto ctx then? I've only seen a
single sseu param set in the patch you linked.

> > > I'm also going to send it to trybot.
> >
> > If you resend pls include all my r-b, I think some got lost in v4.
>
> I'll try and dig those up.
>
> > Also, in the kernel at least we expect minimal commit message with a
> > bit of context, there's no Part-of: link pointing at the entire MR
> > with overview and discussion, the patchwork Link: we add is a pretty
> > bad substitute. Some of the new patches in v4 are a bit too terse on
> > that.
>
> Yup.  I can try to expand things a bit more.
>
> > And finally I'm still not a big fan of the add/remove split over
> > patches, but oh well.
>
> I'm not either but working through all this reminded me of why I
> didn't do it more gradual.  The problem is ordering.  If add and
> remove at the same time and do it one param at a time, we'll end up
> with a situation in the middle where some params will only be allowed
> to be set on the proto-ctx and others will force a proto-ctx ->
> context conversion.  If, for instance, one UMD sets engines first and
> then VMs and another sets VMs first and then engines, there's no way
> to do a gradual transition without breaking one of them.  Also, we
> need to handle basically all the setparam complexity in order to
> handle creation structs and, again, those can come in any order.

Yeah I know, but I considered that.  I think compute-runtime uses
CTX_CREATE_EXT; it's only media.  So we need to order the patches in
exactly the order media calls setparam, and then we're good.

Worst case it's exactly as useful in bisecting as your approach here
(you add dead code first, then use it, so might as well just squash it
all down to one), but if we get the ordering right it's substantially
better.

But maybe "clever ordering of the conversion" is too clever. End
result is the same anyway.
-Daniel

> I hate it, I just don't see another way. :-(
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30 16:33                     ` Daniel Vetter
@ 2021-04-30 16:57                       ` Jason Ekstrand
  -1 siblings, 0 replies; 226+ messages in thread
From: Jason Ekstrand @ 2021-04-30 16:57 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Fri, Apr 30, 2021 at 11:33 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> >
> > On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > >
> > > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> > > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > > > > > > >
> > > > > > > > > I think we should have a FIXME here of not allowing this on some future
> > > > > > > > > platforms because just use CTX_CREATE_EXT.
> > > > > > > >
> > > > > > > > Done.
> > > > > > > >
> > > > > > > > > > +     if (ret == -ENOTSUPP) {
> > > > > > > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > > > > > > >
> > > > > > > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > > > > > > Otherwise we kinda have a bit a problem me thinks ...
> > > > > > > >
> > > > > > > > I'm not sure what you mean by that.
> > > > > > >
> > > > > > > Well I'm at least assuming that we wont have this case anymore, i.e.
> > > > > > > there's only two kinds of parameters:
> > > > > > > - those which are valid only on proto context
> > > > > > > - those which are valid on both (like priority)
> > > > > > >
> > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on
> > > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > > > > > > *ugh* and why?
> > > > > >
> > > > > > Because I was being lazy.  The SSEU stuff is a fairly complex param to
> > > > > > parse and it's always set live.  I can factor out the SSEU parsing
> > > > > > code if you want and it shouldn't be too bad in the end.
> > > > >
> > > > > Yeah I think the special case here is a bit too jarring.
> > > >
> > > > I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> > > > a huge fan of that much code duplication for the SSEU set but I guess
> > > > that's what we get for deciding to "unify" our context creation
> > > > parameter path with our on-the-fly parameter path....
> > > >
> > > > You can look at it here:
> > > >
> > > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
> > >
> > > Hm yeah the duplication of the render engine check is a bit annoying.
> > > What's worse, if you throw another set_engines on top it's probably
> > > all wrong then. The old thing solved that by just throwing that
> > > intel_context away.
> >
> > I think that's already mostly taken care of.  When set_engines
> > happens, we throw away the old array of engines and start with a new
> > one where everything has been memset to 0.  The one remaining problem
> > is that, if userspace resets the engine set, we need to memset
> > legacy_rcs_sseu to 0.  I've added that.
> >
> > > You're also not keeping the engine id in the proto ctx for this, so
> > > there's probably some gaps there. We'd need to clear the SSEU if
> > > userspace puts another context there. But also no userspace does that.
> >
> > Again, I think that's handled.  See above.
> >
> > > Plus a cursory review of userspace shows
> > > - mesa doesn't set this
> > > - compute sets it right before running the batch
> > > - media sets it as the last thing of context creation
> > >
> > > So it's kinda not needed. But also we're asking umd to switch over to
> > > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be
> > > puzzled. And we've confused them enough already with our uapis.
> > >
> > > Another idea: proto_set_sseu just stores the uapi struct and a note
> > > that it's set, and checks nothing. To validate sseu on proto context
> > > we do (but only when an sseu parameter is set):
> > > 1. finalize the context
> > > 2. call the real set_sseu for validation
> > > 3. throw the finalized context away again, it was just for validating
> > > the overall thing
> > >
> > > That way we don't have to consider all the interactions of setting
> > > sseu and engines in any order on proto context, validation code is
> > > guaranteed shared. Only downside is that there's a slight change in
> > > behaviour: SSEU, then setting another engine in that slot will fail
> > > instead of throwing the sseu parameters away. That's the right thing
> > > for CTX_CREATE_EXT anyway, and current userspace doesn't care.
> > >
> > > Thoughts?
> >
> > I thought about that.  The problem is that they can set_sseu multiple
> > times on different engines.  This means we'd have to effectively build
> > up an arbitrary list of SSEU set operations and replay it.  I'm not
> > sure how I feel about building up a big data structure.
>
> Hm, but how does this work with proto ctx then? I've only seen a
> single sseu param set in the patch you linked.

It works roughly the same as it works now:

 - If set_sseu is called, it always overwrites whatever was there
before.  If it's called for a legacy (no user-specified engines)
context, it overwrites legacy_rcs_sseu.  If it's called on a user
engine context, it overwrites the sseu on the given engine.
 - When set_engines is called, it throws away all the user engine data
(if any) and memsets legacy_rcs_sseu to 0.  The end result is that
everything gets reset.
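
To make those two rules concrete, here's a minimal C sketch of the
bookkeeping (the struct layout and names like proto_ctx and
user_engine_sseu are illustrative stand-ins for this discussion, not the
actual i915 definitions):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENGINES 8

struct sseu {
	uint8_t slice_mask;
	uint8_t subslice_mask;
};

struct proto_ctx {
	int num_user_engines;                    /* -1 means legacy engine set */
	struct sseu user_engine_sseu[MAX_ENGINES];
	struct sseu legacy_rcs_sseu;
};

/* set_sseu always overwrites whatever was stored before */
static void proto_set_sseu(struct proto_ctx *pc, int engine, struct sseu sseu)
{
	if (pc->num_user_engines < 0)
		pc->legacy_rcs_sseu = sseu;      /* legacy (rcs) context */
	else
		pc->user_engine_sseu[engine] = sseu;
}

/* set_engines throws away all user engine data and resets legacy_rcs_sseu */
static void proto_set_engines(struct proto_ctx *pc, int num_engines)
{
	memset(pc->user_engine_sseu, 0, sizeof(pc->user_engine_sseu));
	memset(&pc->legacy_rcs_sseu, 0, sizeof(pc->legacy_rcs_sseu));
	pc->num_user_engines = num_engines;
}
```

So regardless of the order userspace mixes set_sseu and set_engines,
set_engines acts as a full reset and the last set_sseu for a slot wins.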

> > > > I'm also going to send it to trybot.
> > >
> > > If you resend pls include all my r-b, I think some got lost in v4.
> >
> > I'll try and dig those up.
> >
> > > Also, in the kernel at least we expect minimal commit message with a
> > > bit of context, there's no Part-of: link pointing at the entire MR
> > > with overview and discussion, the patchwork Link: we add is a pretty
> > > bad substitute. Some of the new patches in v4 are a bit too terse on
> > > that.
> >
> > Yup.  I can try to expand things a bit more.
> >
> > > And finally I'm still not a big fan of the add/remove split over
> > > patches, but oh well.
> >
> > I'm not either but working through all this reminded me of why I
> > didn't do it more gradually.  The problem is ordering.  If we add and
> > remove at the same time and do it one param at a time, we'll end up
> > with a situation in the middle where some params will only be allowed
> > to be set on the proto-ctx and others will force a proto-ctx ->
> > context conversion.  If, for instance, one UMD sets engines first and
> > then VMs and another sets VMs first and then engines, there's no way
> > to do a gradual transition without breaking one of them.  Also, we
> > need to handle basically all the setparam complexity in order to
> > handle creation structs and, again, those can come in any order.
>
> Yeah I know, but I considered that. I think compute-runtime uses
> CTX_CREATE_EXT, it's only media.

That doesn't really matter because both go through the same path.
Anything that uses CONTEXT_CREATE_EXT is identical to something which
creates the context and then calls SET_CONTEXT_PARAM in the same order
as the structs in the extension chain.
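
A minimal sketch of that equivalence: the CREATE_EXT path just walks the
i915_user_extension chain and hands each entry to the same setparam
handler the standalone ioctl uses.  The struct shapes below are
simplified stand-ins for the real uAPI, and the param values are
illustrative; only the walk order matters:

```c
#include <stdint.h>

struct user_extension {
	uint64_t next_extension;   /* user pointer to the next extension */
	uint32_t name;
	uint32_t flags;
};

struct context_param {
	uint32_t ctx_id;
	uint32_t size;
	uint64_t param;            /* e.g. a VM or engines param id */
	uint64_t value;
};

struct create_ext_setparam {
	struct user_extension base;
	struct context_param param;
};

static uint64_t applied_params[8];
static int num_applied;

/* Stand-in for the shared setparam handler; just records the order. */
static int record_setparam(const struct context_param *p)
{
	applied_params[num_applied++] = p->param;
	return 0;
}

/* Walk the chain exactly as the CREATE_EXT path would. */
static int apply_create_extensions(uint64_t chain,
				   int (*set_param)(const struct context_param *))
{
	while (chain) {
		const struct create_ext_setparam *ext =
			(const struct create_ext_setparam *)(uintptr_t)chain;
		int ret = set_param(&ext->param);
		if (ret)
			return ret;
		chain = ext->base.next_extension;
	}
	return 0;
}
```

Because the same handler runs in chain order, a create-time extension
chain and a sequence of post-create SET_CONTEXT_PARAM calls are
indistinguishable to the kernel.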

Incidentally, this also means that if we do it gradually, we have to
handle finalizing the proto-ctx mid-way through handling the chain of
create extensions.  That should be possible to handle if a bit tricky.
It'll also mean we'll have a (small) range of kernels where the
CONTEXT_CREATE_EXT method is broken if you get it in the wrong order.

> So we need to order the patches in
> exactly the order media calls setparam. And then we're good.

Mesa only ever sets engines.  Upstream compute only ever sets the VM.
Media always sets the VM first.  So, if we handle VM first, we should
be good-to-go, I think.

> Worst case it's exactly as useful in bisecting as your approach here
> (you add dead code first, then use it,

It's not dead.  At the time it's added, it's used for all
CONTEXT_CREATE_EXT.  Then, later, it becomes used for everything.

> so might as well just squash it
> all down to one), but if we get the ordering right it's substantially
> better.

I can try to spin a v5 and see how bad it ends up being.  I don't
really like breaking CONTEXT_CREATE_EXT in the middle, though.

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
  2021-04-30 16:57                       ` Jason Ekstrand
@ 2021-04-30 17:08                         ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-30 17:08 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Mailing list - DRI developers

On Fri, Apr 30, 2021 at 6:57 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Fri, Apr 30, 2021 at 11:33 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > >
> > > On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > >
> > > > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > >
> > > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> > > > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > > > > > > > >
> > > > > > > > > > I think we should have a FIXME here of not allowing this on some future
> > > > > > > > > > platforms because just use CTX_CREATE_EXT.
> > > > > > > > >
> > > > > > > > > Done.
> > > > > > > > >
> > > > > > > > > > > +     if (ret == -ENOTSUPP) {
> > > > > > > > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > > > > > > > >
> > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > > > > > > > Otherwise we kinda have a bit of a problem, methinks ...
> > > > > > > > >
> > > > > > > > > I'm not sure what you mean by that.
> > > > > > > >
> > > > > > > > Well I'm at least assuming that we won't have this case anymore, i.e.
> > > > > > > > there's only two kinds of parameters:
> > > > > > > > - those which are valid only on proto context
> > > > > > > > - those which are valid on both (like priority)
> > > > > > > >
> > > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on
> > > > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > > > > > > > *ugh* and why?
> > > > > > >
> > > > > > > Because I was being lazy.  The SSEU stuff is a fairly complex param to
> > > > > > > parse and it's always set live.  I can factor out the SSEU parsing
> > > > > > > code if you want and it shouldn't be too bad in the end.
> > > > > >
> > > > > > Yeah I think the special case here is a bit too jarring.
> > > > >
> > > > > I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> > > > > a huge fan of that much code duplication for the SSEU set but I guess
> > > > > that's what we get for deciding to "unify" our context creation
> > > > > parameter path with our on-the-fly parameter path....
> > > > >
> > > > > You can look at it here:
> > > > >
> > > > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
> > > >
> > > > Hm yeah the duplication of the render engine check is a bit annoying.
> > > > What's worse, if you throw another set_engines on top it's probably
> > > > all wrong then. The old thing solved that by just throwing that
> > > > intel_context away.
> > >
> > > I think that's already mostly taken care of.  When set_engines
> > > happens, we throw away the old array of engines and start with a new
> > > one where everything has been memset to 0.  The one remaining problem
> > > is that, if userspace resets the engine set, we need to memset
> > > legacy_rcs_sseu to 0.  I've added that.
> > >
> > > > You're also not keeping the engine id in the proto ctx for this, so
> > > > there's probably some gaps there. We'd need to clear the SSEU if
> > > > userspace puts another context there. But also no userspace does that.
> > >
> > > Again, I think that's handled.  See above.
> > >
> > > > Plus a cursory review of userspace shows
> > > > - mesa doesn't set this
> > > > - compute sets it right before running the batch
> > > > - media sets it as the last thing of context creation
> > > >
> > > > So it's kinda not needed. But also we're asking umd to switch over to
> > > > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be
> > > > puzzled. And we've confused them enough already with our uapis.
> > > >
> > > > Another idea: proto_set_sseu just stores the uapi struct and a note
> > > > that it's set, and checks nothing. To validate sseu on proto context
> > > > we do (but only when an sseu parameter is set):
> > > > 1. finalize the context
> > > > 2. call the real set_sseu for validation
> > > > 3. throw the finalized context away again, it was just for validating
> > > > the overall thing
> > > >
> > > > That way we don't have to consider all the interactions of setting
> > > > sseu and engines in any order on proto context, validation code is
> > > > guaranteed shared. Only downside is that there's a slight change in
> > > > behaviour: SSEU, then setting another engine in that slot will fail
> > > > instead of throwing the sseu parameters away. That's the right thing
> > > > for CTX_CREATE_EXT anyway, and current userspace doesn't care.
> > > >
> > > > Thoughts?
> > >
> > > I thought about that.  The problem is that they can set_sseu multiple
> > > times on different engines.  This means we'd have to effectively build
> > > up an arbitrary list of SSEU set operations and replay it.  I'm not
> > > sure how I feel about building up a big data structure.
> >
> > Hm, but how does this work with proto ctx then? I've only seen a
> > single sseu param set in the patch you linked.
>
> It works roughly the same as it works now:
>
>  - If set_sseu is called, it always overwrites whatever was there
> before.  If it's called for a legacy (no user-specified engines)
> context, it overwrites legacy_rcs_sseu.  If it's called on a user
> engine context, it overwrites the sseu on the given engine.
>  - When set_engines is called, it throws away all the user engine data
> (if any) and memsets legacy_rcs_sseu to 0.  The end result is that
> everything gets reset.

I think I need to review this carefully in the new version. Definitely
too much w/e here already for tricky stuff :-)

> > > > > I'm also going to send it to trybot.
> > > >
> > > > If you resend pls include all my r-b, I think some got lost in v4.
> > >
> > > I'll try and dig those up.
> > >
> > > > Also, in the kernel at least we expect minimal commit message with a
> > > > bit of context, there's no Part-of: link pointing at the entire MR
> > > > with overview and discussion, the patchwork Link: we add is a pretty
> > > > bad substitute. Some of the new patches in v4 are a bit too terse on
> > > > that.
> > >
> > > Yup.  I can try to expand things a bit more.
> > >
> > > > And finally I'm still not a big fan of the add/remove split over
> > > > patches, but oh well.
> > >
> > > I'm not either but working through all this reminded me of why I
> > > didn't do it more gradually.  The problem is ordering.  If we add and
> > > remove at the same time and do it one param at a time, we'll end up
> > > with a situation in the middle where some params will only be allowed
> > > to be set on the proto-ctx and others will force a proto-ctx ->
> > > context conversion.  If, for instance, one UMD sets engines first and
> > > then VMs and another sets VMs first and then engines, there's no way
> > > to do a gradual transition without breaking one of them.  Also, we
> > > need to handle basically all the setparam complexity in order to
> > > handle creation structs and, again, those can come in any order.
> >
> > Yeah I know, but I considered that. I think compute-runtime uses
> > CTX_CREATE_EXT, it's only media.
>
> That doesn't really matter because both go through the same path.
> Anything that uses CONTEXT_CREATE_EXT is identical to something which
> creates the context and then calls SET_CONTEXT_PARAM in the same order
> as the structs in the extension chain.
>
> Incidentally, this also means that if we do it gradually, we have to
> handle finalizing the proto-ctx mid-way through handling the chain of
> create extensions.  That should be possible to handle if a bit tricky.
> It'll also mean we'll have a (small) range of kernels where the
> CONTEXT_CREATE_EXT method is broken if you get it in the wrong order.
>
> > So we need to order the patches in
> > exactly the order media calls setparam. And then we're good.
>
> Mesa only ever sets engines.  Upstream compute only ever sets the VM.
> Media always sets the VM first.  So, if we handle VM first, we should
> be good-to-go, I think.
>
> > Worst case it's exactly as useful in bisecting as your approach here
> > (you add dead code first, then use it,
>
> It's not dead.  At the time it's added, it's used for all
> CONTEXT_CREATE_EXT.  Then, later, it becomes used for everything.
>
> > so might as well just squash it
> > all down to one), but if we get the ordering right it's substantially
> > better.
>
> I can try to spin a v5 and see how bad it ends up being.  I don't
> really like breaking CONTEXT_CREATE_EXT in the middle, though.

Hm right, I forgot that we also de-proto in the middle of
CONTEXT_CREATE_EXT while the conversion is going on. This really is
annoying.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 16/21] drm/i915/gem: Delay context creation
@ 2021-04-30 17:08                         ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-04-30 17:08 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Mailing list - DRI developers

On Fri, Apr 30, 2021 at 6:57 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Fri, Apr 30, 2021 at 11:33 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > >
> > > On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
> > > > >
> > > > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > >
> > > > > > On Thu, Apr 29, 2021 at 02:01:16PM -0500, Jason Ekstrand wrote:
> > > > > > > On Thu, Apr 29, 2021 at 1:56 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > > On Thu, Apr 29, 2021 at 01:16:04PM -0500, Jason Ekstrand wrote:
> > > > > > > > > On Thu, Apr 29, 2021 at 10:51 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > > > > > +     ret = set_proto_ctx_param(file_priv, pc, args);
> > > > > > > > > >
> > > > > > > > > > I think we should have a FIXME here of not allowing this on some future
> > > > > > > > > > platforms because just use CTX_CREATE_EXT.
> > > > > > > > >
> > > > > > > > > Done.
> > > > > > > > >
> > > > > > > > > > > +     if (ret == -ENOTSUPP) {
> > > > > > > > > > > +             /* Some params, specifically SSEU, can only be set on fully
> > > > > > > > > >
> > > > > > > > > > I think this needs a FIXME: that this only holds during the conversion?
> > > > > > > > > > Otherwise we kinda have a bit of a problem, methinks ...
> > > > > > > > >
> > > > > > > > > I'm not sure what you mean by that.
> > > > > > > >
> > > > > > > > Well I'm at least assuming that we won't have this case anymore, i.e.
> > > > > > > > there's only two kinds of parameters:
> > > > > > > > - those which are valid only on proto context
> > > > > > > > - those which are valid on both (like priority)
> > > > > > > >
> > > > > > > > This SSEU thing looks like a 3rd parameter, which is only valid on
> > > > > > > > finalized context. That feels all kinds of wrong. Will it stay? If yes
> > > > > > > > *ugh* and why?
> > > > > > >
> > > > > > > Because I was being lazy.  The SSEU stuff is a fairly complex param to
> > > > > > > parse and it's always set live.  I can factor out the SSEU parsing
> > > > > > > code if you want and it shouldn't be too bad in the end.
> > > > > >
> > > > > > Yeah I think the special case here is a bit too jarring.
> > > > >
> > > > > I rolled a v5 that allows you to set SSEU as a create param.  I'm not
> > > > > a huge fan of that much code duplication for the SSEU set but I guess
> > > > > that's what we get for deciding to "unify" our context creation
> > > > > parameter path with our on-the-fly parameter path....
> > > > >
> > > > > You can look at it here:
> > > > >
> > > > > https://gitlab.freedesktop.org/jekstrand/linux/-/commit/c805f424a3374b2de405b7fc651eab551df2cdaf#474deb1194892a272db022ff175872d42004dfda_283_588
> > > >
> > > > Hm yeah the duplication of the render engine check is a bit annoying.
> > > > What's worse, if you throw another set_engines on top it's probably
> > > > all wrong then. The old thing solved that by just throwing that
> > > > intel_context away.
> > >
> > > I think that's already mostly taken care of.  When set_engines
> > > happens, we throw away the old array of engines and start with a new
> > > one where everything has been memset to 0.  The one remaining problem
> > > is that, if userspace resets the engine set, we need to memset
> > > legacy_rcs_sseu to 0.  I've added that.
> > >
> > > > You're also not keeping the engine id in the proto ctx for this, so
> > > > there's probably some gaps there. We'd need to clear the SSEU if
> > > > userspace puts another context there. But also no userspace does that.
> > >
> > > Again, I think that's handled.  See above.
> > >
> > > > Plus a cursory review of userspace shows
> > > > - mesa doesn't set this
> > > > - compute sets it right before running the batch
> > > > - media sets it as the last thing of context creation
> > > >
> > > > So it's kinda not needed. But also we're asking umd to switch over to
> > > > CTX_CREATE_EXT, and if sseu doesn't work for that media team will be
> > > > puzzled. And we've confused them enough already with our uapis.
> > > >
> > > > Another idea: proto_set_sseu just stores the uapi struct and a note
> > > > that it's set, and checks nothing. To validate sseu on proto context
> > > > we do (but only when an sseu parameter is set):
> > > > 1. finalize the context
> > > > 2. call the real set_sseu for validation
> > > > 3. throw the finalized context away again, it was just for validating
> > > > the overall thing
> > > >
> > > > That way we don't have to consider all the interactions of setting
> > > > sseu and engines in any order on proto context, validation code is
> > > > guaranteed shared. Only downside is that there's a slight change in
> > > > behaviour: SSEU, then setting another engine in that slot will fail
> > > > instead of throwing the sseu parameters away. That's the right thing
> > > > for CTX_CREATE_EXT anyway, and current userspace doesn't care.
> > > >
> > > > Thoughts?
> > >
> > > I thought about that.  The problem is that they can set_sseu multiple
> > > times on different engines.  This means we'd have to effectively build
> > > up an arbitrary list of SSEU set operations and replay it.  I'm not
> > > sure how I feel about building up a big data structure.
> >
> > Hm, but how does this work with proto ctx then? I've only seen a
> > single sseu param set in the patch you linked.
>
> It works roughly the same as it works now:
>
>  - If set_sseu is called, it always overwrites whatever was there
> before.  If it's called for a legacy (no user-specified engines)
> context, it overwrites legacy_rcs_sseu.  If it's called on a user
> engine context, it overwrites the sseu on the given engine.
>  - When set_engines is called, it throws away all the user engine data
> (if any) and memsets legacy_rcs_sseu to 0.  The end result is that
> everything gets reset.

I think I need to review this carefully in the new version. Definitely
too much w/e here already for tricky stuff :-)

> > > > > I'm also going to send it to trybot.
> > > >
> > > > If you resend pls include all my r-b, I think some got lost in v4.
> > >
> > > I'll try and dig those up.
> > >
> > > > Also, in the kernel at least we expect minimal commit message with a
> > > > bit of context, there's no Part-of: link pointing at the entire MR
> > > > with overview and discussion, the patchwork Link: we add is a pretty
> > > > bad substitute. Some of the new patches in v4 are a bit too terse on
> > > > that.
> > >
> > > Yup.  I can try to expand things a bit more.
> > >
> > > > And finally I'm still not a big fan of the add/remove split over
> > > > patches, but oh well.
> > >
> > > I'm not either but working through all this reminded me of why I
> > > didn't do it more gradually.  The problem is ordering.  If we add and
> > > remove at the same time and do it one param at a time, we'll end up
> > > with a situation in the middle where some params will only be allowed
> > > to be set on the proto-ctx and others will force a proto-ctx ->
> > > context conversion.  If, for instance, one UMD sets engines first and
> > > then VMs and another sets VMs first and then engines, there's no way
> > > to do a gradual transition without breaking one of them.  Also, we
> > > need to handle basically all the setparam complexity in order to
> > > handle creation structs and, again, those can come in any order.
> >
> > Yeah I know, but I considered that. I think compute-runtime uses
> > CTX_CREATE_EXT, it's only media.
>
> That doesn't really matter because both go through the same path.
> Anything that uses CONTEXT_CREATE_EXT is identical to something which
> creates the context and then calls SET_CONTEXT_PARAM in the same order
> as the structs in the extension chain.
>
> Incidentally, this also means that if we do it gradually, we have to
> handle finalizing the proto-ctx mid-way through handling the chain of
> create extensions.  That should be possible to handle if a bit tricky.
> It'll also mean we'll have a (small) range of kernels where the
> CONTEXT_CREATE_EXT method is broken if you get it in the wrong order.
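The equivalence Jason describes, that CONTEXT_CREATE_EXT behaves like creating the context and then applying SET_CONTEXT_PARAM once per extension in chain order, can be modeled like this (all names here are illustrative, not the real uAPI):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative: each creation extension carries one param/value pair. */
struct create_ext {
	int param;
	long value;
	struct create_ext *next;
};

struct ctx { long params[8]; };

/* The single setparam handler both paths share. */
void ctx_set_param(struct ctx *c, int param, long value)
{
	c->params[param] = value;
}

/* CONTEXT_CREATE_EXT: walking the chain in order is the same as
 * calling SET_CONTEXT_PARAM once per extension, in that order. */
void ctx_create_with_ext(struct ctx *c, const struct create_ext *chain)
{
	const struct create_ext *e;

	for (e = chain; e; e = e->next)
		ctx_set_param(c, e->param, e->value);
}
```

So any ordering constraint on SET_CONTEXT_PARAM is inherited verbatim by the extension chain.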
>
> > So we need to order the patches in
> > exactly the order media calls setparam. And then we're good.
>
> Mesa only ever sets engines.  Upstream compute only ever sets the VM.
> Media always sets the VM first.  So, if we handle VM first, we should
> be good-to-go, I think.
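Why the setparam ordering matters during a gradual transition can be sketched with a hypothetical two-param model (VM already proto-only, engines still forcing a proto-ctx to context conversion; names invented for the sketch):

```c
#include <assert.h>

enum param { PARAM_VM, PARAM_ENGINES };

/* Toy context: mid-transition, setting engines finalizes the proto-ctx,
 * after which the VM can no longer be set. */
struct xctx { int finalized; int vm; int engines; };

int transitional_set_param(struct xctx *c, enum param p, int v)
{
	switch (p) {
	case PARAM_VM:
		if (c->finalized)
			return -1;      /* VM became immutable first */
		c->vm = v;
		return 0;
	case PARAM_ENGINES:
		c->finalized = 1;       /* forces proto-ctx -> ctx conversion */
		c->engines = v;
		return 0;
	}
	return -1;
}
```

With this split, a UMD that sets the VM first and then engines (media's order) keeps working, while one that sets engines first and then the VM breaks mid-transition, which is why the conversion order has to match what shipping userspace actually does.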
>
> > Worst case it's exactly as useful in bisecting as your approach here
> > (you add dead code first, then use it,
>
> It's not dead.  At the time it's added, it's used for all
> CONTEXT_CREATE_EXT.  Then, later, it becomes used for everything.
>
> > so might as well just squash it
> > all down to one), but if we get the ordering right it's substantially
> > better.
>
> I can try to spin a v5 and see how bad it ends up being.  I don't
> really like breaking CONTEXT_CREATE_EXT in the middle, though.

Hm right, I forgot that we also de-proto in the middle of
CONTEXT_CREATE_EXT while the conversion is going on. This really is
annoying.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-04-30 10:11                       ` Daniel Vetter
@ 2021-05-01 17:17                         ` Matthew Brost
  -1 siblings, 0 replies; 226+ messages in thread
From: Matthew Brost @ 2021-05-01 17:17 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers, Jason Ekstrand

On Fri, Apr 30, 2021 at 12:11:07PM +0200, Daniel Vetter wrote:
> On Thu, Apr 29, 2021 at 09:03:48PM -0700, Matthew Brost wrote:
> > On Thu, Apr 29, 2021 at 02:14:19PM +0200, Daniel Vetter wrote:
> > > On Wed, Apr 28, 2021 at 01:17:27PM -0500, Jason Ekstrand wrote:
> > > > On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > >
> > > > > On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > > > > > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > > > > Jumping on here mid-thread. For what it is worth, to make execlists work
> > > > > > > with the upcoming parallel submission extension I leveraged some of the
> > > > > > > existing bonding code so I wouldn't be too eager to delete this code
> > > > > > > until that lands.
> > > > > >
> > > > > > Mind being a bit more specific about that?  The motivation for this
> > > > > > patch is that the current bonding handling and uAPI is, well, very odd
> > > > > > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > > > > > Instead you create engines and then bond them together after the fact.
> > > > > > I didn't want to blindly duplicate those oddities with the proto-ctx
> > > > > > stuff unless they were useful.  With parallel submit, I would expect
> > > > > > we want a more explicit API where you specify a set of engine
> > > > > > class/instance pairs to bond together into a single engine similar to
> > > > > > how the current balancing API works.
> > > > > >
> > > > > > Of course, that's all focused on the API and not the internals.  But,
> > > > > > again, I'm not sure how we want things to look internally.  What we've
> > > > > > got now doesn't seem great for the GuC submission model but I'm very
> > > > > > much not the expert there.  I don't want to be working at cross
> > > > > > purposes to you and I'm happy to leave bits if you think they're
> > > > > > useful.  But I thought I was clearing things away so that you can put
> > > > > > in what you actually want for GuC/parallel submit.
> > > > > >
> > > > >
> > > > > Removing all the UAPI things is fine but I wouldn't delete some of the
> > > > > internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> > > > > intel_context_ops, the hook for a submit fence, etc...) as that will
> > > > > still likely be used for the new parallel submission interface with
> > > > > execlists. As you say, the new UAPI won't allow crazy configurations,
> > > > > only simple ones.
> > > > 
> > > > I'm fine with leaving some of the internal bits for a little while if
> > > > it makes pulling the GuC scheduler in easier.  I'm just a bit
> > > > skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
> > > > thoughts?
> > > 
> > > Yeah I'm also wondering why we need this. Essentially your insight (and
> > > Tony Ye from the media team confirmed) is that the media UMD never uses
> > > bonding on virtual engines.
> > >
> > 
> > Well, you should use virtual engines with the parallel submission
> > interface if you are using it correctly.
> > 
> > e.g. You want a 2 wide parallel submission and there are 4 engine
> > instances.
> > 
> > You'd create 2 VEs:
> > 
> > A: 0, 2
> > B: 1, 3
> > set_parallel
> 
> So tbh I'm not really liking this part. At least my understanding is that
> with GuC this is really one overall virtual engine, backed by a multi-lrc.
> 
> So it should fill one engine slot, not fill multiple virtual engines and
> then be an awkward thing wrapped on top.
> 
> I think (but maybe my understanding of GuC and the parallel submit execbuf
> interface is wrong) that the parallel engine should occupy a single VE
> slot, not require additional VE just for fun (maybe the execlist backend
> would require that internally, but that should not leak into the higher
> levels, much less the uapi). And you submit your multi-batch execbuf on
> that single parallel VE, which then gets passed to GuC as a multi-LRC.
> Internally in the backend there's a bit of fan-out to put the right
> MI_BB_START into the right rings and all that, but again I think that
> should be backend concerns.
> 
> Or am I missing something big here?

Unfortunately that is not how the interface works. The user must
configure the engine set with either physical or virtual engines, which
determine the valid placements of each BB (LRC, ring, whatever we want
to call it), and call the set_parallel extension, which validates the
engine layout. After that the engines are ready to be used for multi-BB
submission in a single IOCTL.
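For the 2-wide example earlier in the thread (VE A on instances {0, 2}, VE B on {1, 3}), the configure-then-validate flow looks roughly like this. The validation rule and every name below are assumptions for the sketch, not the real uAPI:

```c
#include <assert.h>

#define MAX_WIDTH       4
#define MAX_PLACEMENTS  4

/* One slot per BB in the parallel submission; each slot is either a
 * physical engine (one placement) or a virtual engine (several). */
struct engine_slot {
	int num_placements;
	int placements[MAX_PLACEMENTS]; /* physical instance numbers */
};

/* Stand-in for the set_parallel extension's layout validation: every
 * slot must have at least one placement, and (assumed here) all slots
 * must offer the same number of placements. */
int set_parallel(const struct engine_slot *slots, int width)
{
	int i;

	if (width < 1 || width > MAX_WIDTH)
		return -1;
	for (i = 0; i < width; i++) {
		if (slots[i].num_placements < 1)
			return -1;
		if (slots[i].num_placements != slots[0].num_placements)
			return -1;
	}
	return 0; /* ready for multi-BB submission in a single ioctl */
}
```

Once validation passes, each submitted BB i may land on any of `slots[i].placements`, which is the load balancing the GuC (or a backend) performs.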

We discussed this internally with the i915 developers and the media
team for about six months, and this is where we landed after some very
contentious discussions. One of the proposals was pretty similar to your
understanding but got NACK'd as it was too specific to what our HW can
do / what the UMDs need rather than being able to do tons of wild things
our HW / UMDs will never support (sounds familiar, right?). 

What we landed on is still simpler than most of the other proposals - we
almost really went off the deep end but voices of reason thankfully won
out.

> 
> > For GuC submission we just configure context and the GuC load balances
> > it.
> > 
> > For execlists we'd need to create bonds.
> > 
> > Also, likely the reason virtual engines weren't used with the old
> > interface is that we only had 2 instances max per class, so there was
> > no need for virtual engines. In my example above, if they were using
> > the interface correctly, they would have to use virtual engines too.
> 
> They do actually use virtual engines, it's just the virtual engine only
> contains a single one, and internally i915 folds that into the hw engine
> directly. So we can take away the entire implementation complexity.
> 
> Also I still think for execlist we shouldn't bother with trying to enable
> parallel submit. Or at least only way down if there's no other reasonable
> option.
>

Agreed, but honestly, if we have to, it isn't going to be that painful.
I think my patch to enable this was a couple hundred lines.

Matt
 
> > > So the only thing we need is the await_fence submit_fence logic to stall
> > > the subsequent batches just long enough. I think that stays.
> > >
> > 
> > My implementation of the new parallel submission interface for
> > execlists used bonds + priority boosts to ensure both are present at
> > the same time. This was used for both non-virtual and virtual engines.
> > This was never reviewed though and the code died on the list.
> 
> :-(
> 
> > > All the additional logic with the cmpxchg lockless trickery and all that
> > > isn't needed, because we _never_ have to select an engine for bonded
> > > submission: It's always the single one available.
> > > 
> > > This would mean that for execlist parallel submit we can apply a
> > > limitation (beyond what GuC supports perhaps) and it's all ok. With that
> > > everything except the submit fence await logic itself can go I think.
> > > 
> > > Also one for Matt: We decided to ZBB implementing parallel submit on
> > > execlist, it's going to be just for GuC. At least until someone starts
> > > screaming really loudly.
> > 
> > If this is the case, then bonds can be deleted.
> 
> Yeah that's the goal we're aiming for.
> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
  2021-05-01 17:17                         ` Matthew Brost
@ 2021-05-04  7:36                           ` Daniel Vetter
  -1 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-05-04  7:36 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Maling list - DRI developers, Intel GFX, Jason Ekstrand

On Sat, May 01, 2021 at 10:17:46AM -0700, Matthew Brost wrote:
> On Fri, Apr 30, 2021 at 12:11:07PM +0200, Daniel Vetter wrote:
> > On Thu, Apr 29, 2021 at 09:03:48PM -0700, Matthew Brost wrote:
> > > On Thu, Apr 29, 2021 at 02:14:19PM +0200, Daniel Vetter wrote:
> > > > On Wed, Apr 28, 2021 at 01:17:27PM -0500, Jason Ekstrand wrote:
> > > > > On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > > >
> > > > > > On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > > > > > > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > > > > > > Jumping on here mid-thread. For what it is worth, to make execlists work
> > > > > > > > with the upcoming parallel submission extension I leveraged some of the
> > > > > > > > existing bonding code so I wouldn't be too eager to delete this code
> > > > > > > > until that lands.
> > > > > > >
> > > > > > > Mind being a bit more specific about that?  The motivation for this
> > > > > > > patch is that the current bonding handling and uAPI is, well, very odd
> > > > > > > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > > > > > > Instead you create engines and then bond them together after the fact.
> > > > > > > I didn't want to blindly duplicate those oddities with the proto-ctx
> > > > > > > stuff unless they were useful.  With parallel submit, I would expect
> > > > > > > we want a more explicit API where you specify a set of engine
> > > > > > > class/instance pairs to bond together into a single engine similar to
> > > > > > > how the current balancing API works.
> > > > > > >
> > > > > > > Of course, that's all focused on the API and not the internals.  But,
> > > > > > > again, I'm not sure how we want things to look internally.  What we've
> > > > > > > got now doesn't seem great for the GuC submission model but I'm very
> > > > > > > much not the expert there.  I don't want to be working at cross
> > > > > > > purposes to you and I'm happy to leave bits if you think they're
> > > > > > > useful.  But I thought I was clearing things away so that you can put
> > > > > > > in what you actually want for GuC/parallel submit.
> > > > > > >
> > > > > >
> > > > > > Removing all the UAPI things is fine but I wouldn't delete some of the
> > > > > > internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> > > > > > intel_context_ops, the hook for a submit fence, etc...) as that will
> > > > > > still likely be used for the new parallel submission interface with
> > > > > > execlists. As you say, the new UAPI won't allow crazy configurations,
> > > > > > only simple ones.
> > > > > 
> > > > > I'm fine with leaving some of the internal bits for a little while if
> > > > > it makes pulling the GuC scheduler in easier.  I'm just a bit
> > > > > skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
> > > > > thoughts?
> > > > 
> > > > Yeah I'm also wondering why we need this. Essentially your insight (and
> > > > Tony Ye from the media team confirmed) is that the media UMD never uses
> > > > bonding on virtual engines.
> > > >
> > > 
> > > Well, you should use virtual engines with the parallel submission
> > > interface if you are using it correctly.
> > > 
> > > e.g. You want a 2 wide parallel submission and there are 4 engine
> > > instances.
> > > 
> > > You'd create 2 VEs:
> > > 
> > > A: 0, 2
> > > B: 1, 3
> > > set_parallel
> > 
> > So tbh I'm not really liking this part. At least my understanding is that
> > with GuC this is really one overall virtual engine, backed by a multi-lrc.
> > 
> > So it should fill one engine slot, not fill multiple virtual engines and
> > then be an awkward thing wrapped on top.
> > 
> > I think (but maybe my understanding of GuC and the parallel submit execbuf
> > interface is wrong) that the parallel engine should occupy a single VE
> > slot, not require additional VE just for fun (maybe the execlist backend
> > would require that internally, but that should not leak into the higher
> > levels, much less the uapi). And you submit your multi-batch execbuf on
> > that single parallel VE, which then gets passed to GuC as a multi-LRC.
> > Internally in the backend there's a bit of fan-out to put the right
> > MI_BB_START into the right rings and all that, but again I think that
> > should be backend concerns.
> > 
> > Or am I missing something big here?
> 
> Unfortunately that is not how the interface works. The user must
> configure the engine set with either physical or virtual engines, which
> determine the valid placements of each BB (LRC, ring, whatever we want
> to call it), and call the set_parallel extension, which validates the
> engine layout. After that the engines are ready to be used for multi-BB
> submission in a single IOCTL.
> 
> We discussed this internally with the i915 developers and the media
> team for about six months, and this is where we landed after some very
> contentious discussions. One of the proposals was pretty similar to your
> understanding but got NACK'd as it was too specific to what our HW can
> do / what the UMDs need rather than being able to do tons of wild things
> our HW / UMDs will never support (sounds familiar, right?). 
> 
> What we landed on is still simpler than most of the other proposals - we
> almost really went off the deep end but voices of reason thankfully won
> out.

Yeah I know some of the story here. But the thing is, we're ripping out
tons of these design decisions because they're just plain bogus.

And this very much looks like one, and since it's new uapi, it's better to
correct it before we finalize it in upstream for 10+ years. Or we're just
right back to where we are right now, and this hole is too deep for my
taste.

Btw, these kinds of discussions are what the RFC patch documenting our
new uAPI and plans is meant for.

> > > For GuC submission we just configure context and the GuC load balances
> > > it.
> > > 
> > > For execlists we'd need to create bonds.
> > > 
> > > Also, likely the reason virtual engines weren't used with the old
> > > interface is that we only had 2 instances max per class, so there was
> > > no need for virtual engines. In my example above, if they were using
> > > the interface correctly, they would have to use virtual engines too.
> > 
> > They do actually use virtual engines, it's just the virtual engine only
> > contains a single one, and internally i915 folds that into the hw engine
> > directly. So we can take away the entire implementation complexity.
> > 
> > Also I still think for execlist we shouldn't bother with trying to enable
> > parallel submit. Or at least only way down if there's no other reasonable
> > option.
> >
> 
> Agreed, but honestly, if we have to, it isn't going to be that painful.
> I think my patch to enable this was a couple hundred lines.

Ah that sounds good at least, as a fallback.
-Daniel

> 
> Matt
>  
> > > > So the only thing we need is the await_fence submit_fence logic to stall
> > > > the subsequent batches just long enough. I think that stays.
> > > >
> > > 
> > > My implementation of the new parallel submission interface for
> > > execlists used bonds + priority boosts to ensure both are present at
> > > the same time. This was used for both non-virtual and virtual engines.
> > > This was never reviewed though and the code died on the list.
> > 
> > :-(
> > 
> > > > All the additional logic with the cmpxchg lockless trickery and all that
> > > > isn't needed, because we _never_ have to select an engine for bonded
> > > > submission: It's always the single one available.
> > > > 
> > > > This would mean that for execlist parallel submit we can apply a
> > > > limitation (beyond what GuC supports perhaps) and it's all ok. With that
> > > > everything except the submit fence await logic itself can go I think.
> > > > 
> > > > Also one for Matt: We decided to ZBB implementing parallel submit on
> > > > execlist, it's going to be just for GuC. At least until someone starts
> > > > screaming really loudly.
> > > 
> > > If this is the case, then bonds can be deleted.
> > 
> > Yeah that's the goal we're aiming for.
> > -Daniel
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 226+ messages in thread

* Re: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Disallow bonding of virtual engines
@ 2021-05-04  7:36                           ` Daniel Vetter
  0 siblings, 0 replies; 226+ messages in thread
From: Daniel Vetter @ 2021-05-04  7:36 UTC (permalink / raw)
  To: Matthew Brost; +Cc: Maling list - DRI developers, Intel GFX

On Sat, May 01, 2021 at 10:17:46AM -0700, Matthew Brost wrote:
> On Fri, Apr 30, 2021 at 12:11:07PM +0200, Daniel Vetter wrote:
> > On Thu, Apr 29, 2021 at 09:03:48PM -0700, Matthew Brost wrote:
> > > On Thu, Apr 29, 2021 at 02:14:19PM +0200, Daniel Vetter wrote:
> > > > On Wed, Apr 28, 2021 at 01:17:27PM -0500, Jason Ekstrand wrote:
> > > > > On Wed, Apr 28, 2021 at 1:02 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > > >
> > > > > > On Wed, Apr 28, 2021 at 12:46:07PM -0500, Jason Ekstrand wrote:
> > > > > > > On Wed, Apr 28, 2021 at 12:26 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > > > > > > > Jumping on here mid-thread. For what is is worth to make execlists work
> > > > > > > > with the upcoming parallel submission extension I leveraged some of the
> > > > > > > > existing bonding code so I wouldn't be too eager to delete this code
> > > > > > > > until that lands.
> > > > > > >
> > > > > > > Mind being a bit more specific about that?  The motivation for this
> > > > > > > patch is that the current bonding handling and uAPI is, well, very odd
> > > > > > > and confusing IMO.  It doesn't let you create sets of bonded engines.
> > > > > > > Instead you create engines and then bond them together after the fact.
> > > > > > > I didn't want to blindly duplicate those oddities with the proto-ctx
> > > > > > > stuff unless they were useful.  With parallel submit, I would expect
> > > > > > > we want a more explicit API where you specify a set of engine
> > > > > > > class/instance pairs to bond together into a single engine similar to
> > > > > > > how the current balancing API works.
> > > > > > >
> > > > > > > Of course, that's all focused on the API and not the internals.  But,
> > > > > > > again, I'm not sure how we want things to look internally.  What we've
> > > > > > > got now doesn't seem great for the GuC submission model but I'm very
> > > > > > > much not the expert there.  I don't want to be working at cross
> > > > > > > purposes to you and I'm happy to leave bits if you think they're
> > > > > > > useful.  But I thought I was clearing things away so that you can put
> > > > > > > in what you actually want for GuC/parallel submit.
> > > > > > >
> > > > > >
> > > > > > Removing all the UAPI things are fine but I wouldn't delete some of the
> > > > > > internal stuff (e.g. intel_virtual_engine_attach_bond, bond
> > > > > > intel_context_ops, the hook for a submit fence, etc...) as that will
> > > > > > still likely be used for the new parallel submission interface with
> > > > > > execlists. As you say the new UAPI wont allow crazy configurations,
> > > > > > only simple ones.
> > > > > 
> > > > > I'm fine with leaving some of the internal bits for a little while if
> > > > > it makes pulling the GuC scheduler in easier.  I'm just a bit
> > > > > skeptical of why you'd care about SUBMIT_FENCE. :-)  Daniel, any
> > > > > thoughts?
> > > > 
> > > > Yeah I'm also wondering why we need this. Essentially your insight (and
> > > > Tony Ye from media team confirmed) is that media umd never uses bonded on
> > > > virtual engines.
> > > >
> > > 
> > > Well you should use virtual engines with parallel submission interface 
> > > if are you using it correctly.
> > > 
> > > e.g. You want a 2 wide parallel submission and there are 4 engine
> > > instances.
> > > 
> > > You'd create 2 VEs:
> > > 
> > > A: 0, 2
> > > B: 1, 3
> > > set_parallel
> > 
> > So tbh I'm not really liking this part. At least my understanding is that
> > with GuC this is really one overall virtual engine, backed by a multi-lrc.
> > 
> > So it should fill one engine slot, not fill multiple virtual engines and
> > then be an awkward thing wrapped on top.
> > 
> > I think (but maybe my understanding of GuC and the parallel submit execbuf
> > interface is wrong) that the parallel engine should occupy a single VE
> > slot, not require additional VE just for fun (maybe the execlist backend
> > would require that internally, but that should not leak into the higher
> > levels, much less the uapi). And you submit your multi-batch execbuf on
> > that single parallel VE, which then gets passed to GuC as a multi-LRC.
> > Internally in the backend there's a bit of fan-out to put the right
> > MI_BB_START into the right rings and all that, but again I think that
> > should be backend concerns.
> > 
> > Or am I missing something big here?
> 
> Unfortunately that is not how the interface works. The user must
> configure the engine set with either physical or virtual engines, which
> determine the valid placements of each BB (LRC, ring, whatever we want
> to call it), and call the set_parallel extension, which validates the
> engine layout. After that the engines are ready to be used for multi-BB
> submission in a single IOCTL.
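[Editor's note: a guess at the kind of checks "validates engine layout" could mean, as a toy model; the actual i915 validation rules are not spelled out here, so both checks below are assumptions.]

```python
def validate_parallel_layout(slots):
    """Toy check mirroring what a set_parallel extension might validate:
    every slot in the parallel set offers the same number of placement
    choices, and no physical instance appears in more than one slot."""
    widths = {len(s) for s in slots}
    if len(widths) != 1:
        return False  # ragged layout: slots disagree on sibling count
    seen = set()
    for s in slots:
        if seen & s:
            return False  # an instance shared between slots
        seen |= s
    return True

print(validate_parallel_layout([{0, 2}, {1, 3}]))  # the example layout
print(validate_parallel_layout([{0, 2}, {0, 3}]))  # instance 0 reused
```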
> 
> We discussed this internally with the i915 developers + the media team
> for like 6 months and this is where we landed after some very
> contentious discussions. One of the proposals was pretty similar to your
> understanding but got NACK'd as it was too specific to what our HW can
> do / what the UMDs need rather than being able to do tons of wild things
> our HW / UMDs will never support (sounds familiar, right?). 
> 
> What we landed on is still simpler than most of the other proposals - we
> almost really went off the deep end but voices of reason thankfully won
> out.

Yeah I know some of the story here. But the thing is, we're ripping out
tons of these design decisions because they're just plain bogus.

And this very much looks like one, and since it's new uapi, it's better to
correct it before we finalize it in upstream for 10+ years. Or we're just
right back to where we are right now, and this hole is too deep for my
taste.

btw these kinds of discussions are what the RFC patch documenting our
new uapi & plans is meant for.

> > > For GuC submission we just configure context and the GuC load balances
> > > it.
> > > 
> > > For execlists we'd need to create bonds.
> > > 
> > > Also, likely the reason virtual engines weren't used with the old
> > > interface was that we only had 2 instances max per class, so no need
> > > for virtual engines. If they used it for my above example, then to use
> > > the interface correctly they would have to use virtual engines too.
> > 
> > They do actually use virtual engines, it's just the virtual engine only
> > contains a single one, and internally i915 folds that into the hw engine
> > directly. So we can take away the entire implementation complexity.
> > 
> > Also I still think for execlist we shouldn't bother with trying to enable
> > parallel submit. Or at least only way down if there's no other reasonable
> > option.
> >
> 
> Agree, but honestly if we have to it isn't going to be that painful. I
> think my patch to enable this was a couple hundred lines.

Ah that sounds good at least, as a fallback.
-Daniel

> 
> Matt
>  
> > > > So the only thing we need is the await_fence submit_fence logic to stall
> > > > the subsequent batches just long enough. I think that stays.
> > > >
> > > 
> > > My implementation, for the new parallel submission interface, with
> > > execlists used bonds + priority boosts to ensure both are present at
> > > the same time. This was used for both non-virtual and virtual engines.
> > > This was never reviewed though and the code died on the list.
> > 
> > :-(
> > 
> > > > All the additional logic with the cmpxchg lockless trickery and all that
> > > > isn't needed, because we _never_ have to select an engine for bonded
> > > > submission: It's always the single one available.
> > > > 
> > > > This would mean that for execlist parallel submit we can apply a
> > > > limitation (beyond what GuC supports perhaps) and it's all ok. With that
> > > > everything except the submit fence await logic itself can go I think.
> > > > 
> > > > Also one for Matt: We decided to ZBB implementing parallel submit on
> > > > execlist, it's going to be just for GuC. At least until someone starts
> > > > screaming really loudly.
> > > 
> > > If this is the case, then bonds can be deleted.
> > 
> > Yeah that's the goal we're aiming for.
> > -Daniel
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

