From: Chris Wilson <chris@chris-wilson.co.uk>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [PATCH i-g-t 05/24] i915/gem_exec_schedule: Verify that using HW semaphores doesn't block
Date: Tue, 26 Mar 2019 10:03:50 +0000
Message-ID: <155359462997.15930.8399218264806708761@skylake-alporthouse-com>
In-Reply-To: <b6552cbd-e7dc-f1db-2497-9c9e030fb70b@linux.intel.com>

Quoting Tvrtko Ursulin (2019-03-26 09:19:33)
> 
> 
> On 22/03/2019 09:21, Chris Wilson wrote:
> > We may use HW semaphores to schedule nearly-ready work such that they
> > are already spinning on the GPU waiting for the completion on another
> > engine. However, we don't want for that spinning task to actually block
> > any real work should it be scheduled.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   tests/i915/gem_exec_schedule.c | 87 ++++++++++++++++++++++++++++++++++
> >   1 file changed, 87 insertions(+)
> > 
> > diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
> > index 4f0577b4e..ae850c4a3 100644
> > --- a/tests/i915/gem_exec_schedule.c
> > +++ b/tests/i915/gem_exec_schedule.c
> > @@ -48,6 +48,10 @@
> >   
> >   #define MAX_CONTEXTS 1024
> >   
> > +#define LOCAL_I915_EXEC_BSD_SHIFT      (13)
> > +#define LOCAL_I915_EXEC_BSD_MASK       (3 << LOCAL_I915_EXEC_BSD_SHIFT)
> > +#define ENGINE_MASK  (I915_EXEC_RING_MASK | LOCAL_I915_EXEC_BSD_MASK)
> > +
> >   IGT_TEST_DESCRIPTION("Check that we can control the order of execution");
> >   
> >   static inline
> > @@ -320,6 +324,86 @@ static void smoketest(int fd, unsigned ring, unsigned timeout)
> >       }
> >   }
> >   
> > +static uint32_t __batch_create(int i915, uint32_t offset)
> > +{
> > +     const uint32_t bbe = MI_BATCH_BUFFER_END;
> > +     uint32_t handle;
> > +
> > +     handle = gem_create(i915, ALIGN(offset + 4, 4096));
> > +     gem_write(i915, handle, offset, &bbe, sizeof(bbe));
> > +
> > +     return handle;
> > +}
> > +
> > +static uint32_t batch_create(int i915)
> > +{
> > +     return __batch_create(i915, 0);
> > +}
> > +
> > +static void semaphore_userlock(int i915)
> > +{
> > +     struct drm_i915_gem_exec_object2 obj = {
> > +             .handle = batch_create(i915),
> > +     };
> > +     igt_spin_t *spin = NULL;
> > +     unsigned int engine;
> > +     uint32_t scratch;
> > +
> > +     igt_require(gem_scheduler_has_preemption(i915));
> > +
> > +     /*
> > +      * Given the use of semaphores to govern parallel submission
> > +      * of nearly-ready work to HW, we still want to run actually
> > +      * ready work immediately. Without semaphores, the dependent
> > +      * work wouldn't be submitted so our ready work will run.
> > +      */
> > +
> > +     scratch = gem_create(i915, 4096);
> > +     for_each_physical_engine(i915, engine) {
> > +             if (!spin) {
> > +                     spin = igt_spin_batch_new(i915,
> > +                                               .dependency = scratch,
> > +                                               .engine = engine);
> > +             } else {
> > +                     typeof(spin->execbuf.flags) saved = spin->execbuf.flags;
> 
> u64 reads better and struct eb won't change anyway.

If it were only u64!

> > +                     spin->execbuf.flags &= ~ENGINE_MASK;
> > +                     spin->execbuf.flags |= engine;
> > +
> > +                     gem_execbuf(i915, &spin->execbuf);
> 
> Do you need to wait for spinner to be running before submitting these 
> ones, to make sure the logic emits a semaphore poll for them and submits 
> them straight away?

Not required, the semaphores are emitted based on completion status.

> > +                     spin->execbuf.flags = saved;
> > +             }
> > +     }
> > +     igt_require(spin);
> > +     gem_close(i915, scratch);
> > +
> > +     /*
> > +      * On all dependent engines, the request may be executing (busywaiting
> > +      * on a HW semaphore) but it should not prevent any real work from
> > +      * taking precedence.
> > +      */
> > +     scratch = gem_context_create(i915);
> > +     for_each_physical_engine(i915, engine) {
> > +             struct drm_i915_gem_execbuffer2 execbuf = {
> > +                     .buffers_ptr = to_user_pointer(&obj),
> > +                     .buffer_count = 1,
> > +                     .flags = engine,
> > +                     .rsvd1 = scratch,
> > +             };
> > +
> > +             if (engine == (spin->execbuf.flags & ENGINE_MASK))
> > +                     continue;
> 
> Ugh saving and restoring eb flags to find the spinning engine here I 
> feel will be a land mine for the upcoming for_each_physical_engine 
> conversion but what can we do.

It will be fine. Unless you plan to randomise discovery just to make
things interesting. :)

We can make the reuse of engines explicit if we use ctx->engines[] instead
of for_each_physical_engine().
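
For anyone following along, the save/retarget/restore of execbuf.flags
being discussed comes down to swapping only the engine-selector bits of a
64-bit flags word. A minimal standalone sketch follows. The sketch_* names
are hypothetical, and the constant values are my reading of i915_drm.h, so
treat them as assumptions. Unlike the patch, which ORs the engine in
directly, this version also masks the incoming selector.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror I915_EXEC_RING_MASK and the BSD selector in i915_drm.h. */
#define SKETCH_EXEC_RING_MASK	0x3fULL
#define SKETCH_EXEC_BSD_SHIFT	13
#define SKETCH_EXEC_BSD_MASK	(3ULL << SKETCH_EXEC_BSD_SHIFT)
#define SKETCH_ENGINE_MASK	(SKETCH_EXEC_RING_MASK | SKETCH_EXEC_BSD_MASK)

/* The two selector fields must occupy disjoint bits. */
static_assert((SKETCH_EXEC_RING_MASK & SKETCH_EXEC_BSD_MASK) == 0,
	      "engine selector fields overlap");

/* Return flags with the engine selector replaced, all other bits kept. */
static inline uint64_t sketch_retarget(uint64_t flags, unsigned int engine)
{
	return (flags & ~SKETCH_ENGINE_MASK) |
	       ((uint64_t)engine & SKETCH_ENGINE_MASK);
}

/* Extract the engine selector, as the "skip the spinner's engine" check does. */
static inline unsigned int sketch_engine_of(uint64_t flags)
{
	return (unsigned int)(flags & SKETCH_ENGINE_MASK);
}
```

Because the helper returns a new value rather than modifying the flags in
place, the caller keeps the original word and nothing needs restoring
afterwards.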

> > +             gem_execbuf(i915, &execbuf);
> > +     }
> > +     gem_context_destroy(i915, scratch);
> > +     gem_sync(i915, obj.handle); /* to hang unless we can preempt */
> 
> I got lost - how does this work if the spinner is still keeping the 
> obj.handle busy?

obj.handle and the spinner are separate, and on different contexts. So we
fill all engines with the spinner + semaphores, then submit a new batch that
has no dependencies; our expectation is that it is able to run ahead
of the semaphore busywait. Is that a reasonable expectation for
userspace? (Note that this demonstrates a subtle change in the ABI with
the introduction of plain semaphores, as without preemption that patch
causes this test to hang. So whether or not it is a reasonable
expectation, the change in behaviour is unwanted, but may have gone
unnoticed.)
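
To make the expectation concrete, here is a toy model of a single engine
queue. It is only an illustration of the behaviour the test checks, not
how i915 actually schedules, and every toy_* name is hypothetical. The
head of the queue is a request busywaiting on a semaphore. An independent
batch queued behind it runs only if the engine is allowed to preempt the
busywait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A queued request: either ready to run, or polling an unsignalled semaphore. */
struct toy_request {
	bool busywaiting;
};

/*
 * Return the index of the first request that can make forward progress,
 * or -1 if the engine is stuck behind a busywait it may not preempt.
 */
static int toy_pick_next(const struct toy_request *queue, size_t count,
			 bool can_preempt)
{
	assert(queue != NULL);

	for (size_t i = 0; i < count; i++) {
		if (!queue[i].busywaiting)
			return (int)i;	/* ready work, run it */
		if (!can_preempt)
			return -1;	/* the busywait holds the engine */
		/* with preemption, step over the busywait and keep looking */
	}
	return -1;
}

/* The scenario from the test: semaphore wait first, independent batch second. */
static bool toy_independent_batch_runs(bool can_preempt)
{
	const struct toy_request queue[] = {
		{ .busywaiting = true },	/* dependent on the spinner */
		{ .busywaiting = false },	/* no dependencies */
	};

	return toy_pick_next(queue, 2, can_preempt) == 1;
}
```

In this model toy_independent_batch_runs(true) reports that the batch
runs, while toy_independent_batch_runs(false) reports the stuck case,
which corresponds to the hang described above.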
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
