All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [PATCH i-g-t] i915/gem_workarounds: Verify regs directly
Date: Wed, 29 May 2019 11:24:49 +0100	[thread overview]
Message-ID: <155912548956.13891.16600713179238619259@skylake-alporthouse-com> (raw)
In-Reply-To: <87ef4ho9bh.fsf@gaia.fi.intel.com>

Quoting Mika Kuoppala (2019-05-29 11:15:46)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > It seems like the HW validator is getting better at preventing our
> > snooping of system registers from non-privileged batches! If we can't
> > use SRM, let's probe the register directly through mmio, making sure we
> > have the context spinning on the GPU first.
> >
> > v2: Hold forcewake just in case the spinning batch isn't enough to
> > justify our register access.
> >
> 
> If I recall correctly, either of them separately didn't
> work. And there was delay after grabbing the fw before
> the register contents appeared. Don't remember the gen tho.

That would be a kernel bug / HW bug. Either we fail in our ack sequence
(maybe let the read overtake the fw ack or something equally impossible),
or the HW fails in its.

If you can think of a way of spotting it, add it to
selftests/intel_uncore -- it's exactly the type of test we need in bring
up, verifying our mmio is accurate.

> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110544
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Matthew Auld <matthew.william.auld@gmail.com>
> > ---
> >  tests/i915/gem_workarounds.c | 88 +++++++-----------------------------
> >  1 file changed, 17 insertions(+), 71 deletions(-)
> >
> > diff --git a/tests/i915/gem_workarounds.c b/tests/i915/gem_workarounds.c
> > index 44e3dce8a..2767b04d7 100644
> > --- a/tests/i915/gem_workarounds.c
> > +++ b/tests/i915/gem_workarounds.c
> > @@ -80,70 +80,27 @@ static bool write_only(const uint32_t addr)
> >       return false;
> >  }
> >  
> > -#define MI_STORE_REGISTER_MEM (0x24 << 23)
> > -
> > -static int workaround_fail_count(int fd, uint32_t ctx)
> > +static int workaround_fail_count(int i915, uint32_t ctx)
> >  {
> > -     struct drm_i915_gem_exec_object2 obj[2];
> > -     struct drm_i915_gem_relocation_entry *reloc;
> > -     struct drm_i915_gem_execbuffer2 execbuf;
> > -     uint32_t result_sz, batch_sz;
> > -     uint32_t *base, *out;
> > -     int fail_count = 0;
> > -
> > -     reloc = calloc(num_wa_regs, sizeof(*reloc));
> > -     igt_assert(reloc);
> > -
> > -     result_sz = 4 * num_wa_regs;
> > -     result_sz = PAGE_ALIGN(result_sz);
> > -
> > -     batch_sz = 16 * num_wa_regs + 4;
> > -     batch_sz = PAGE_ALIGN(batch_sz);
> > -
> > -     memset(obj, 0, sizeof(obj));
> > -     obj[0].handle = gem_create(fd, result_sz);
> > -     gem_set_caching(fd, obj[0].handle, I915_CACHING_CACHED);
> > -     obj[1].handle = gem_create(fd, batch_sz);
> > -     obj[1].relocs_ptr = to_user_pointer(reloc);
> > -     obj[1].relocation_count = num_wa_regs;
> > -
> > -     out = base = gem_mmap__cpu(fd, obj[1].handle, 0, batch_sz, PROT_WRITE);
> > -     for (int i = 0; i < num_wa_regs; i++) {
> > -             *out++ = MI_STORE_REGISTER_MEM | ((gen >= 8 ? 4 : 2) - 2);
> > -             *out++ = wa_regs[i].addr;
> > -             reloc[i].target_handle = obj[0].handle;
> > -             reloc[i].offset = (out - base) * sizeof(*out);
> > -             reloc[i].delta = i * sizeof(uint32_t);
> > -             reloc[i].read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > -             reloc[i].write_domain = I915_GEM_DOMAIN_INSTRUCTION;
> > -             *out++ = reloc[i].delta;
> > -             if (gen >= 8)
> > -                     *out++ = 0;
> > -     }
> > -     *out++ = MI_BATCH_BUFFER_END;
> > -     munmap(base, batch_sz);
> > +     igt_spin_t *spin;
> > +     int fw, fail = 0;
> >  
> > -     memset(&execbuf, 0, sizeof(execbuf));
> > -     execbuf.buffers_ptr = to_user_pointer(obj);
> > -     execbuf.buffer_count = 2;
> > -     execbuf.rsvd1 = ctx;
> > -     gem_execbuf(fd, &execbuf);
> > +     spin = igt_spin_new(i915, .ctx = ctx, .flags = IGT_SPIN_POLL_RUN);
> > +     igt_spin_busywait_until_started(spin);
> >  
> > -     gem_set_domain(fd, obj[0].handle, I915_GEM_DOMAIN_CPU, 0);
> > -
> > -     igt_debug("Address\tval\t\tmask\t\tread\t\tresult\n");
> > -
> > -     out = gem_mmap__cpu(fd, obj[0].handle, 0, result_sz, PROT_READ);
> > +     fw = igt_open_forcewake_handle(i915);
> 
> assert that it went fine?

Do we always expect the debugfs to be present? Do we strictly need it?
igt_debug if it isn't present and correlate that to a fail.
 
> Perhaps both will now do the trick. But if it fails
> get the forcewake before spinner so you get more delay.

Nah, I vote that in that case forcewake is broken and needs a kernel fix,
as we don't have any such delay when using I915_READ().
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

WARNING: multiple messages have this Message-ID (diff)
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [PATCH i-g-t] i915/gem_workarounds: Verify regs directly
Date: Wed, 29 May 2019 11:24:49 +0100	[thread overview]
Message-ID: <155912548956.13891.16600713179238619259@skylake-alporthouse-com> (raw)
In-Reply-To: <87ef4ho9bh.fsf@gaia.fi.intel.com>

Quoting Mika Kuoppala (2019-05-29 11:15:46)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > It seems like the HW validator is getting better at preventing our
> > snooping of system registers from non-privileged batches! If we can't
> > use SRM, let's probe the register directly through mmio, making sure we
> > have the context spinning on the GPU first.
> >
> > v2: Hold forcewake just in case the spinning batch isn't enough to
> > justify our register access.
> >
> 
> If I recall correctly, either of them separately didn't
> work. And there was delay after grabbing the fw before
> the register contents appeared. Don't remember the gen tho.

That would be a kernel bug / HW bug. Either we fail in our ack sequence
(maybe let the read overtake the fw ack or something equally impossible),
or the HW fails in its.

If you can think of a way of spotting it, add it to
selftests/intel_uncore -- it's exactly the type of test we need in bring
up, verifying our mmio is accurate.

> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110544
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Matthew Auld <matthew.william.auld@gmail.com>
> > ---
> >  tests/i915/gem_workarounds.c | 88 +++++++-----------------------------
> >  1 file changed, 17 insertions(+), 71 deletions(-)
> >
> > diff --git a/tests/i915/gem_workarounds.c b/tests/i915/gem_workarounds.c
> > index 44e3dce8a..2767b04d7 100644
> > --- a/tests/i915/gem_workarounds.c
> > +++ b/tests/i915/gem_workarounds.c
> > @@ -80,70 +80,27 @@ static bool write_only(const uint32_t addr)
> >       return false;
> >  }
> >  
> > -#define MI_STORE_REGISTER_MEM (0x24 << 23)
> > -
> > -static int workaround_fail_count(int fd, uint32_t ctx)
> > +static int workaround_fail_count(int i915, uint32_t ctx)
> >  {
> > -     struct drm_i915_gem_exec_object2 obj[2];
> > -     struct drm_i915_gem_relocation_entry *reloc;
> > -     struct drm_i915_gem_execbuffer2 execbuf;
> > -     uint32_t result_sz, batch_sz;
> > -     uint32_t *base, *out;
> > -     int fail_count = 0;
> > -
> > -     reloc = calloc(num_wa_regs, sizeof(*reloc));
> > -     igt_assert(reloc);
> > -
> > -     result_sz = 4 * num_wa_regs;
> > -     result_sz = PAGE_ALIGN(result_sz);
> > -
> > -     batch_sz = 16 * num_wa_regs + 4;
> > -     batch_sz = PAGE_ALIGN(batch_sz);
> > -
> > -     memset(obj, 0, sizeof(obj));
> > -     obj[0].handle = gem_create(fd, result_sz);
> > -     gem_set_caching(fd, obj[0].handle, I915_CACHING_CACHED);
> > -     obj[1].handle = gem_create(fd, batch_sz);
> > -     obj[1].relocs_ptr = to_user_pointer(reloc);
> > -     obj[1].relocation_count = num_wa_regs;
> > -
> > -     out = base = gem_mmap__cpu(fd, obj[1].handle, 0, batch_sz, PROT_WRITE);
> > -     for (int i = 0; i < num_wa_regs; i++) {
> > -             *out++ = MI_STORE_REGISTER_MEM | ((gen >= 8 ? 4 : 2) - 2);
> > -             *out++ = wa_regs[i].addr;
> > -             reloc[i].target_handle = obj[0].handle;
> > -             reloc[i].offset = (out - base) * sizeof(*out);
> > -             reloc[i].delta = i * sizeof(uint32_t);
> > -             reloc[i].read_domains = I915_GEM_DOMAIN_INSTRUCTION;
> > -             reloc[i].write_domain = I915_GEM_DOMAIN_INSTRUCTION;
> > -             *out++ = reloc[i].delta;
> > -             if (gen >= 8)
> > -                     *out++ = 0;
> > -     }
> > -     *out++ = MI_BATCH_BUFFER_END;
> > -     munmap(base, batch_sz);
> > +     igt_spin_t *spin;
> > +     int fw, fail = 0;
> >  
> > -     memset(&execbuf, 0, sizeof(execbuf));
> > -     execbuf.buffers_ptr = to_user_pointer(obj);
> > -     execbuf.buffer_count = 2;
> > -     execbuf.rsvd1 = ctx;
> > -     gem_execbuf(fd, &execbuf);
> > +     spin = igt_spin_new(i915, .ctx = ctx, .flags = IGT_SPIN_POLL_RUN);
> > +     igt_spin_busywait_until_started(spin);
> >  
> > -     gem_set_domain(fd, obj[0].handle, I915_GEM_DOMAIN_CPU, 0);
> > -
> > -     igt_debug("Address\tval\t\tmask\t\tread\t\tresult\n");
> > -
> > -     out = gem_mmap__cpu(fd, obj[0].handle, 0, result_sz, PROT_READ);
> > +     fw = igt_open_forcewake_handle(i915);
> 
> assert that it went fine?

Do we always expect the debugfs to be present? Do we strictly need it?
igt_debug if it isn't present and correlate that to a fail.
 
> Perhaps both will now do the trick. But if it fails
> get the forcewake before spinner so you get more delay.

Nah, I vote that in that case forcewake is broken and needs a kernel fix,
as we don't have any such delay when using I915_READ().
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

  reply	other threads:[~2019-05-29 10:24 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-29  9:58 [PATCH i-g-t] i915/gem_workarounds: Verify regs directly Chris Wilson
2019-05-29  9:58 ` [igt-dev] " Chris Wilson
2019-05-29 10:00 ` Chris Wilson
2019-05-29 10:00   ` [Intel-gfx] " Chris Wilson
2019-05-29 10:15 ` [igt-dev] " Mika Kuoppala
2019-05-29 10:15   ` Mika Kuoppala
2019-05-29 10:24   ` Chris Wilson [this message]
2019-05-29 10:24     ` Chris Wilson
2019-05-29 10:27 ` [igt-dev] ✓ Fi.CI.BAT: success for i915/gem_workarounds: Verify regs directly (rev2) Patchwork
2019-05-29 14:56 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2019-05-25  7:05 [igt-dev] [PATCH i-g-t] i915/gem_workarounds: Verify regs directly Chris Wilson
2019-05-29  9:51 ` Matthew Auld
2019-05-29  9:51   ` Matthew Auld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=155912548956.13891.16600713179238619259@skylake-alporthouse-com \
    --to=chris@chris-wilson.co.uk \
    --cc=igt-dev@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=mika.kuoppala@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.