From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Subject: Re: [Intel-gfx] [PATCH 10/24] drm/i915: Drop no-semaphore boosting
Date: Wed, 13 May 2020 20:04:08 +0300
Message-ID: <87a72bvkaf.fsf@gaia.fi.intel.com>
In-Reply-To: <20200513074809.18194-10-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Now that we have fast timeslicing on semaphores, we no longer need to
> prioritise non-semaphore work as we will yield any work blocked on a
> sempahore to the next in the queue. Previously with no timeslicing,

The 'sempahore' typo is back at blocking again :)

> blocking on the semaphore caused extremely bad scheduling with multiple
> clients utilising multiple rings. Now, there is no impact and we can
> remove the complication.

No small feat to accomplish.
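
For anyone skimming the diff below: the boost being dropped amounted to
the following. This is a minimal userspace sketch of the old logic, not
the kernel code itself — the macro names and the condition mirror the
hunks below, but the types are stand-ins:

#include <stdio.h>

#define BIT(n)                         (1u << (n))
#define I915_SCHED_HAS_SEMAPHORE_CHAIN BIT(0)
#define I915_USER_PRIORITY_SHIFT       1  /* the pre-patch value; now 0 */
#define I915_USER_PRIORITY(x)          ((x) << I915_USER_PRIORITY_SHIFT)
#define I915_PRIORITY_NOSEMAPHORE      BIT(0)

struct sketch_request {
	unsigned int sched_flags;
	int priority;
};

/* Old i915_request_add()/eb_request_add() behaviour, simplified: bump
 * any request whose dependency chain never touches a semaphore, so it
 * sorts ahead of requests spinning on semaphores. */
static void old_nosemaphore_boost(struct sketch_request *rq)
{
	if (!(rq->sched_flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
		rq->priority |= I915_PRIORITY_NOSEMAPHORE;
}

int main(void)
{
	struct sketch_request plain  = { 0, I915_USER_PRIORITY(0) };
	struct sketch_request waiter = { I915_SCHED_HAS_SEMAPHORE_CHAIN,
					 I915_USER_PRIORITY(0) };

	old_nosemaphore_boost(&plain);
	old_nosemaphore_boost(&waiter);

	/* plain work sorted ahead of semaphore waiters at equal user prio */
	printf("plain=%d waiter=%d\n", plain.priority, waiter.priority);
	return 0;
}

With timeslicing yielding blocked semaphore waits anyway, the extra
priority bit buys nothing, hence the whole mechanism can go.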

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
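
The emit_semaphore_wait() rework reads nicely too: instead of returning
early on the semaphore path and chaining a second await later in
i915_request_await_request(), both paths now funnel into a single
i915_sw_fence_await_dma_fence() call; only the target fence differs.
Roughly, as a toy control-flow model with stand-in types (not the
kernel code):

#include <stdio.h>

struct fence { const char *name; };

static int await_dma_fence(struct fence *f)
{
	printf("await chained onto %s fence\n", f->name);
	return 0;
}

static int emit_wait(int can_use_semaphore)
{
	struct fence submit = { "submit" }, semaphore = { "semaphore" };
	struct fence *wait = &submit;  /* default: block at submission */

	if (can_use_semaphore)
		wait = &semaphore;     /* semaphore path picks the other fence */

	return await_dma_fence(wait); /* single exit for both paths */
}

int main(void)
{
	emit_wait(0);
	emit_wait(1);
	return 0;
}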

>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 15 -------
>  drivers/gpu/drm/i915/gt/intel_lrc.c           |  9 -----
>  drivers/gpu/drm/i915/gt/selftest_context.c    |  1 +
>  drivers/gpu/drm/i915/i915_priolist_types.h    |  4 +-
>  drivers/gpu/drm/i915/i915_request.c           | 40 ++-----------------
>  drivers/gpu/drm/i915/i915_request.h           |  1 -
>  drivers/gpu/drm/i915/i915_scheduler.c         | 12 +++---
>  drivers/gpu/drm/i915/i915_scheduler_types.h   |  3 +-
>  8 files changed, 12 insertions(+), 73 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 2067557e277b..0a4606faf966 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2603,21 +2603,6 @@ static void eb_request_add(struct i915_execbuffer *eb)
>  	/* Check that the context wasn't destroyed before submission */
>  	if (likely(!intel_context_is_closed(eb->context))) {
>  		attr = eb->gem_context->sched;
> -
> -		/*
> -		 * Boost actual workloads past semaphores!
> -		 *
> -		 * With semaphores we spin on one engine waiting for another,
> -		 * simply to reduce the latency of starting our work when
> -		 * the signaler completes. However, if there is any other
> -		 * work that we could be doing on this engine instead, that
> -		 * is better utilisation and will reduce the overall duration
> -		 * of the current work. To avoid PI boosting a semaphore
> -		 * far in the distance past over useful work, we keep a history
> -		 * of any semaphore use along our dependency chain.
> -		 */
> -		if (!(rq->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
> -			attr.priority |= I915_PRIORITY_NOSEMAPHORE;
>  	} else {
>  		/* Serialise with context_close via the add_to_timeline */
>  		i915_request_set_error_once(rq, -ENOENT);
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 9141b3afa2c5..c7d7438b5d55 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -429,15 +429,6 @@ static int effective_prio(const struct i915_request *rq)
>  	if (i915_request_has_nopreempt(rq))
>  		prio = I915_PRIORITY_UNPREEMPTABLE;
>  
> -	/*
> -	 * On unwinding the active request, we give it a priority bump
> -	 * if it has completed waiting on any semaphore. If we know that
> -	 * the request has already started, we can prevent an unwanted
> -	 * preempt-to-idle cycle by taking that into account now.
> -	 */
> -	if (__i915_request_has_started(rq))
> -		prio |= I915_PRIORITY_NOSEMAPHORE;
> -
>  	return prio;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> index a56dff3b157a..52af1cee9a94 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> @@ -24,6 +24,7 @@ static int request_sync(struct i915_request *rq)
>  
>  	/* Opencode i915_request_add() so we can keep the timeline locked. */
>  	__i915_request_commit(rq);
> +	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
>  	__i915_request_queue(rq, NULL);
>  
>  	timeout = i915_request_wait(rq, 0, HZ / 10);
> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
> index e18723d8df86..5003a71113cb 100644
> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
> @@ -24,14 +24,12 @@ enum {
>  	I915_PRIORITY_DISPLAY,
>  };
>  
> -#define I915_USER_PRIORITY_SHIFT 1
> +#define I915_USER_PRIORITY_SHIFT 0
>  #define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
>  
>  #define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
>  #define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
>  
> -#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(0))
> -
>  /* Smallest priority value that cannot be bumped. */
>  #define I915_PRIORITY_INVALID (INT_MIN | (u8)I915_PRIORITY_MASK)
>  
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 9ad1e6761492..9738dab5a9f6 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -369,8 +369,6 @@ __await_execution(struct i915_request *rq,
>  	}
>  	spin_unlock_irq(&signal->lock);
>  
> -	/* Copy across semaphore status as we need the same behaviour */
> -	rq->sched.flags |= signal->sched.flags;
>  	return 0;
>  }
>  
> @@ -539,10 +537,8 @@ void __i915_request_unsubmit(struct i915_request *request)
>  	spin_unlock(&request->lock);
>  
>  	/* We've already spun, don't charge on resubmitting. */
> -	if (request->sched.semaphores && i915_request_started(request)) {
> -		request->sched.attr.priority |= I915_PRIORITY_NOSEMAPHORE;
> +	if (request->sched.semaphores && i915_request_started(request))
>  		request->sched.semaphores = 0;
> -	}
>  
>  	/*
>  	 * We don't need to wake_up any waiters on request->execute, they
> @@ -600,15 +596,6 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>  	return NOTIFY_DONE;
>  }
>  
> -static void irq_semaphore_cb(struct irq_work *wrk)
> -{
> -	struct i915_request *rq =
> -		container_of(wrk, typeof(*rq), semaphore_work);
> -
> -	i915_schedule_bump_priority(rq, I915_PRIORITY_NOSEMAPHORE);
> -	i915_request_put(rq);
> -}
> -
>  static int __i915_sw_fence_call
>  semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>  {
> @@ -616,11 +603,6 @@ semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
>  
>  	switch (state) {
>  	case FENCE_COMPLETE:
> -		if (!(READ_ONCE(rq->sched.attr.priority) & I915_PRIORITY_NOSEMAPHORE)) {
> -			i915_request_get(rq);
> -			init_irq_work(&rq->semaphore_work, irq_semaphore_cb);
> -			irq_work_queue(&rq->semaphore_work);
> -		}
>  		break;
>  
>  	case FENCE_FREE:
> @@ -999,6 +981,7 @@ emit_semaphore_wait(struct i915_request *to,
>  		    gfp_t gfp)
>  {
>  	const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
> +	struct i915_sw_fence *wait = &to->submit;
>  
>  	if (!intel_context_use_semaphores(to->context))
>  		goto await_fence;
> @@ -1033,11 +1016,10 @@ emit_semaphore_wait(struct i915_request *to,
>  		goto await_fence;
>  
>  	to->sched.semaphores |= mask;
> -	to->sched.flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> -	return 0;
> +	wait = &to->semaphore;
>  
>  await_fence:
> -	return i915_sw_fence_await_dma_fence(&to->submit,
> +	return i915_sw_fence_await_dma_fence(wait,
>  					     &from->fence, 0,
>  					     I915_FENCE_GFP);
>  }
> @@ -1072,17 +1054,6 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
>  	if (ret < 0)
>  		return ret;
>  
> -	if (to->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN) {
> -		ret = i915_sw_fence_await_dma_fence(&to->semaphore,
> -						    &from->fence, 0,
> -						    I915_FENCE_GFP);
> -		if (ret < 0)
> -			return ret;
> -	}
> -
> -	if (from->sched.flags & I915_SCHED_HAS_EXTERNAL_CHAIN)
> -		to->sched.flags |= I915_SCHED_HAS_EXTERNAL_CHAIN;
> -
>  	return 0;
>  }
>  
> @@ -1706,9 +1677,6 @@ void i915_request_add(struct i915_request *rq)
>  		attr = ctx->sched;
>  	rcu_read_unlock();
>  
> -	if (!(rq->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
> -		attr.priority |= I915_PRIORITY_NOSEMAPHORE;
> -
>  	__i915_request_queue(rq, &attr);
>  
>  	mutex_unlock(&tl->mutex);
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 98ae2dc82371..8ec7ee4dbadc 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -216,7 +216,6 @@ struct i915_request {
>  	};
>  	struct list_head execute_cb;
>  	struct i915_sw_fence semaphore;
> -	struct irq_work semaphore_work;
>  
>  	/*
>  	 * A list of everyone we wait upon, and everyone who waits upon us.
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index f8e797a7eee9..56defe78ae54 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -51,11 +51,11 @@ static void assert_priolists(struct intel_engine_execlists * const execlists)
>  	GEM_BUG_ON(rb_first_cached(&execlists->queue) !=
>  		   rb_first(&execlists->queue.rb_root));
>  
> -	last_prio = (INT_MAX >> I915_USER_PRIORITY_SHIFT) + 1;
> +	last_prio = INT_MAX;
>  	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
>  		const struct i915_priolist *p = to_priolist(rb);
>  
> -		GEM_BUG_ON(p->priority >= last_prio);
> +		GEM_BUG_ON(p->priority > last_prio);
>  		last_prio = p->priority;
>  
>  		GEM_BUG_ON(!p->used);
> @@ -434,15 +434,13 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
>  		dep->waiter = node;
>  		dep->flags = flags;
>  
> -		/* Keep track of whether anyone on this chain has a semaphore */
> -		if (signal->flags & I915_SCHED_HAS_SEMAPHORE_CHAIN &&
> -		    !node_started(signal))
> -			node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
> -
>  		/* All set, now publish. Beware the lockless walkers. */
>  		list_add_rcu(&dep->signal_link, &node->signalers_list);
>  		list_add_rcu(&dep->wait_link, &signal->waiters_list);
>  
> +		/* Propagate the chains */
> +		node->flags |= signal->flags;
> +
>  		ret = true;
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> index 6ab2c5289bed..f72e6c397b08 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler_types.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -65,8 +65,7 @@ struct i915_sched_node {
>  	struct list_head link;
>  	struct i915_sched_attr attr;
>  	unsigned int flags;
> -#define I915_SCHED_HAS_SEMAPHORE_CHAIN	BIT(0)
> -#define I915_SCHED_HAS_EXTERNAL_CHAIN	BIT(1)
> +#define I915_SCHED_HAS_EXTERNAL_CHAIN	BIT(0)
>  	intel_engine_mask_t semaphores;
>  };
>  
> -- 
> 2.20.1
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Thread overview: 32+ messages
2020-05-13  7:47 [Intel-gfx] [PATCH 01/24] drm/i915/gt: Transfer old virtual breadcrumbs to irq_worker Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 02/24] dma-buf: Use atomic_fetch_add() for the context id Chris Wilson
2020-05-13 12:37   ` Patelczyk, Maciej
2020-05-13  7:47 ` [Intel-gfx] [PATCH 03/24] drm/i915: Mark the addition of the initial-breadcrumb in the request Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 04/24] dma-buf: Proxy fence, an unsignaled fence placeholder Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 05/24] drm/syncobj: Allow use of dma-fence-proxy Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 06/24] drm/i915/gem: Teach execbuf how to wait on future syncobj Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 07/24] drm/i915/gem: Allow combining submit-fences with syncobj Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 08/24] drm/i915/gt: Declare when we enabled timeslicing Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 09/24] drm/i915/gem: Remove redundant exec_fence Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 10/24] drm/i915: Drop no-semaphore boosting Chris Wilson
2020-05-13 17:04   ` Mika Kuoppala [this message]
2020-05-13  7:47 ` [Intel-gfx] [PATCH 11/24] drm/i915: Move saturated workload detection back to the context Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 12/24] drm/i915: Remove the saturation backoff for HW semaphores Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 13/24] drm/i915/gt: Use built-in active intel_context reference Chris Wilson
2020-05-13  7:47 ` [Intel-gfx] [PATCH 14/24] drm/i915: Drop I915_RESET_TIMEOUT and friends Chris Wilson
2020-05-13 12:48   ` Patelczyk, Maciej
2020-05-13  7:48 ` [Intel-gfx] [PATCH 15/24] drm/i915: Drop I915_IDLE_ENGINES_TIMEOUT Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 16/24] drm/i915/selftests: Always call the provided engine->emit_init_breadcrumb Chris Wilson
2020-05-14  7:55   ` Mika Kuoppala
2020-05-13  7:48 ` [Intel-gfx] [PATCH 17/24] drm/i915/gt: Do not suspend bonded requests if one hangs Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 18/24] drm/i915: Always defer fenced work to the worker Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 19/24] drm/i915/gem: Assign context id for async work Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 20/24] drm/i915: Export a preallocate variant of i915_active_acquire() Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 21/24] drm/i915/gem: Separate the ww_mutex walker into its own list Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 22/24] drm/i915/gem: Asynchronous GTT unbinding Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 23/24] drm/i915/gem: Bind the fence async for execbuf Chris Wilson
2020-05-13  7:48 ` [Intel-gfx] [PATCH 24/24] drm/i915: Show per-engine default property values in sysfs Chris Wilson
2020-05-13 13:40   ` Patelczyk, Maciej
2020-05-13  8:01 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/24] drm/i915/gt: Transfer old virtual breadcrumbs to irq_worker Patchwork
2020-05-13  8:28 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-05-14  8:55 ` [Intel-gfx] [PATCH 01/24] " Tvrtko Ursulin
