All of lore.kernel.org
 help / color / mirror / Atom feed
From: Imre Deak <imre.deak@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>,
	Valtteri Rantala <valtteri.rantala@intel.com>,
	intel-gfx@lists.freedesktop.org,
	Michael T Frederick <michael.t.frederick@intel.com>
Subject: Re: [PATCH v2 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config
Date: Thu, 28 Apr 2016 20:15:24 +0300	[thread overview]
Message-ID: <1461863724.9253.90.camel@intel.com> (raw)
In-Reply-To: <20160428144837.GF5784@phenom.ffwll.local>

On to, 2016-04-28 at 16:48 +0200, Daniel Vetter wrote:
> On Thu, Apr 28, 2016 at 11:38:55AM +0300, Imre Deak wrote:
> > On to, 2016-04-28 at 10:17 +0200, Daniel Vetter wrote:
> > > On Tue, Apr 26, 2016 at 07:01:06PM +0300, Imre Deak wrote:
> > > > On ti, 2016-04-26 at 15:42 +0100, Chris Wilson wrote:
> > > > > On Tue, Apr 26, 2016 at 05:26:43PM +0300, Eero Tamminen
> > > > > wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > On 26.04.2016 16:23, Chris Wilson wrote:
> > > > > > > On Tue, Apr 26, 2016 at 04:17:55PM +0300, Imre Deak
> > > > > > > wrote:
> > > > > > > > On ti, 2016-04-26 at 13:57 +0100, Chris Wilson wrote:
> > > > > > > > > On Tue, Apr 26, 2016 at 03:44:22PM +0300, Imre Deak
> > > > > > > > > wrote:
> > > > > > > > > > Setting a write-back cache policy in the MOCS entry
> > > > > > > > > > definition also
> > > > > > > > > > implies snooping, which has a considerable
> > > > > > > > > > overhead. This
> > > > > > > > > > is
> > > > > > > > > > unexpected for a few reasons:
> > > > > > > > > 
> > > > > > > > > If it is snooping, then I don't see why it is
> > > > > > > > > undesirable to
> > > > > > > > > have it
> > > > > > > > > available in a mocs setting. If it is bogus and the
> > > > > > > > > bit is
> > > > > > > > > undefined,
> > > > > > > > > then by all means remove it.
> > > > > > > > 
> > > > > > > > None of these entries are used alone for coherent
> > > > > > > > surfaces. For
> > > > > > > > that
> > > > > > > > the application would have to use entry index#1 or #2
> > > > > > > > _and_
> > > > > > > > call the
> > > > > > > > set caching IOCTL to set the corresponding buffer to be
> > > > > > > > cached.
> > > > > > > 
> > > > > > > No, the application doesn't. There are sufficent
> > > > > > > interfaces
> > > > > > > exposed that
> > > > > > > userspace can bypass the kernel if it so desired.
> > > > > > > 
> > > > > > > > The
> > > > > > > > problem is that without setting the buffer to be
> > > > > > > > cacheable the
> > > > > > > > expectation is that we won't be snooping and incur the
> > > > > > > > corresponding
> > > > > > > > overhead. This is what this patch addresses.
> > > > > > > 
> > > > > > > Not true.
> > > > > > > 
> > > > > > > > The bit is also bogus, if we wanted snooping via MOCS
> > > > > > > > we'd use
> > > > > > > > the
> > > > > > > > dedicated HW flag for that.
> > > > > > > 
> > > > > > > But you keep saying this bit *enables* snooping. So
> > > > > > > either it
> > > > > > > does or it
> > > > > > > doesn't.
> > > > > > > 
> > > > > > > > If we wanted to have a snooping MOCS entry we should
> > > > > > > > add that
> > > > > > > > separately (as a forth entry), but we'd need this
> > > > > > > > change as a
> > > > > > > > fix for
> > > > > > > > current users.
> > > > > > > 
> > > > > > > The current users who are getting what they request but
> > > > > > > don't
> > > > > > > know what
> > > > > > > they were requesting?
> > > > > > 
> > > > > > What this kernel ABI (index entry #2) has been agreed &
> > > > > > documented
> > > > > > to provide?
> > > > > 
> > > > > The ABI is what we agree makes sense between hardware /
> > > > > kernel /
> > > > > userspace, and then we stick to it.
> > > > > 
> > > > > > I thought this entry is supposed to replace the writeback
> > > > > > LLC/eLLC
> > > > > > cache MOCS setting Mesa is using on (e.g. BDW) to speed up
> > > > > > accesses
> > > > > > to a memory area which it knows always to be accessed so
> > > > > > that it
> > > > > > can
> > > > > > be cached.
> > > > > 
> > > > > Which it does... Only it speeds us writeback from the CPU,
> > > > > not the
> > > > > GPU. ;)
> > > > > 
> > > > > The confusion seems to be in mistaking !llc for llc. We have
> > > > > to come
> > > > > to
> > > > > some agreement on whether it makes sense having multiple
> > > > > entries for
> > > > > the
> > > > > same follows-PTE mocs or whether it makes more sense to
> > > > > expose the hw
> > > > > capabilties.
> > > > 
> > > > Note that we can't just say to Mesa to use index #0 on BXT
> > > > instead of
> > > > index #2, since index #0 turns off caching in GPU L3, so we'd
> > > > have to
> > > > also redefine that to be L3 cached. And I don't know what
> > > > turning on L3
> > > > caching for index #0 would break, I would rather avoid doing
> > > > that. Imo
> > > > defining the entries the following way would allow us to use
> > > > the same
> > > > indexes on GEN9 regardless of it being SKL or BXT:
> > > > #0: uncached (neither in L3 nor in (e)LLC)
> > > > #1: PTE passthrough
> > > 
> > > So our rendercpy in igt does set MOCS entry 1. Or how exactly do
> > > all the
> > > set_caching vs. rendercpy tests we currently have pass? Just not?
> > 
> > We don't have tests for coherent surfaces. The current rendercpy
> > test
> > just uses uncached buffers, so they are flushed before checking the
> > result. I could add a new subtest to rendercpy to test with a
> > cached/coherent surface.
> > 
> > > Also, you're guaranateeing that opencl/libva don't screw this up
> > > either?
> > 
> > If they don't set the given buffer to be cached via the set_caching
> > IOCTL (as a consequence making them coherent) they are already
> > screwed
> > on CHV. If they call the IOCTL they are fine on BXT too.
> 
> We do implicit set_caching when displaying something to something
> coherent. To make that work userspace should use the "use PTE" mode by
> default, except when they really know what they're doing.
> That's also the mode that's supposed to give you the most reasonable
> performance. But somehow that mode ended up in MOCS entry 1, so pretty much
> guaranteed userspace will get it wrong. Mesa just hit a perf snag, but
> might as well have been visual corruption. I think it'd be a lot safe to
> make "use PTE" entry 0.

Mesa uses entries 1 and 2. If something else like opencl or libva (or
even Mesa for that matter) uses index 0 for PTE pass-through that's a
bug on its own. I don't know if this is the case, but it's a separate
issue from what I'm trying to fix here.

This isn't about a case where a PTE pass-through entry needs to be
provided, but about the case where a cached but non-coherent one is
needed. Mesa assumes this to be entry 2 and I don't see why we couldn't
make sure that this assumption holds. Note that this entry on BXT could
be both a PTE pass-through one as in this patch or LLC-UC.

--Imre

> > > It's UABI, and we've botched it. Smells like we need to have a
> > > lot more
> > > solid story and make sure we get this right this time around.
> > > There's also
> > > the "how does dma-buf mmap support fit in" question.
> > > 
> > > Oh and nope, none of the relevant testcases are in BAT at all.
> > > 
> > > > #2: cached everywhere (L3 + (e)LLC if it exists), non-coherent
> > > > #3: cached everywhere, coherent
> > > > 
> > > > I'm not sure if there is even a user for coherent surfaces atm,
> > > > so then
> > > > we could delay adding #3 until it's really needed.
> > > 
> > > The problem with the above is that the various access paths on
> > > SoC
> > > (without LLC) and big core (with llc) are massively different.
> > > You need
> > > lots of different cases in upload/download paths for max speed
> > > anyway.
> > 
> > The current assumption everywhere in user space is the above three
> > entries with their definition above as far as I know. There are no
> > users that would use a MOCS entry to get a coherent surface, we
> > could
> > add that as a 4th entry once a user arises.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-04-28 17:15 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-26 12:44 [PATCH v2 1/2] drm/i915/gen9: Clean up MOCS table definitions Imre Deak
2016-04-26 12:44 ` [PATCH v2 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config Imre Deak
2016-04-26 12:57   ` Chris Wilson
2016-04-26 13:17     ` Imre Deak
2016-04-26 13:23       ` Chris Wilson
2016-04-26 13:43         ` Imre Deak
2016-04-26 13:58           ` Chris Wilson
2016-04-26 14:26         ` Eero Tamminen
2016-04-26 14:30           ` Daniel Vetter
2016-04-26 17:18             ` Eero Tamminen
2016-04-26 17:25               ` Frederick, Michael T
2016-04-27 13:25                 ` Eero Tamminen
2016-04-27 14:53                   ` Chris Wilson
2016-04-27 18:42                     ` Dave Gordon
2016-04-29  8:01                     ` Eero Tamminen
2016-04-26 17:57             ` Ville Syrjälä
2016-04-28  8:13               ` Daniel Vetter
2016-04-28 10:48                 ` Ville Syrjälä
2016-04-28 14:44                   ` Daniel Vetter
2016-04-28 17:21                     ` Ville Syrjälä
2016-04-26 14:42           ` Chris Wilson
2016-04-26 16:01             ` Imre Deak
2016-04-28  8:17               ` Daniel Vetter
2016-04-28  8:38                 ` Imre Deak
2016-04-28 14:48                   ` Daniel Vetter
2016-04-28 17:15                     ` Imre Deak [this message]
2016-05-02  8:28                       ` Daniel Vetter
2016-05-02 11:18                         ` Ville Syrjälä
2016-05-02 13:50                         ` Imre Deak
2016-04-28 17:25                     ` Ville Syrjälä
2016-04-26 13:12   ` Chris Wilson
2016-04-26 16:55 ` ✗ Fi.CI.BAT: failure for series starting with [v2,1/2] drm/i915/gen9: Clean up MOCS table definitions Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1461863724.9253.90.camel@intel.com \
    --to=imre.deak@intel.com \
    --cc=daniel@ffwll.ch \
    --cc=eero.t.tamminen@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=michael.t.frederick@intel.com \
    --cc=valtteri.rantala@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.