From: "Frederick, Michael T" <michael.t.frederick@intel.com>
To: "Tamminen, Eero T" <eero.t.tamminen@intel.com>,
	Daniel Vetter <daniel@ffwll.ch>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>
Subject: Re: [PATCH v2 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config
Date: Tue, 26 Apr 2016 17:25:04 +0000
Message-ID: <C60B45DB5719F7449BF9D47D22012D959C688BDB@ORSMSX113.amr.corp.intel.com>
In-Reply-To: <571FA2E9.10307@intel.com>

Sorry, I'm not tracking all of the MOCS discussions.  I just want to explain what coherency means in the SoC for BXT.

GTI sets the non-inclusive bit on the IDI interface based on how it treats the memory.  In the BXT case, where there is no uncore cache, "non-inclusive" just indicates whether to snoop or not.  BXT has a snoop filter (SF) so that the latency of snooping GT from a core is roughly similar to that of snooping another core.

For BXT:
If GTI sets non-inclusive=0 (i.e. coherent): the transaction is looked up in the SF and the SA snoops the cores.  The potential impact is that for high-BW coherent traffic the SF becomes the BW limiter of the system and caps BW at 33% * 34GBps.  For writes such as WCILFs, snoops to the cores must be resolved before the SA requests write data from GT.  For reads, the common case should see no impact, because snoop latency is generally much less than memory data latency.  In general, snoop latency for a core is relatively small, but a core could also be down (e.g. ratio change) or loaded with snooping.
If GTI sets non-inclusive=1 (i.e. non-coherent): the transaction takes the SF bypass and the SA does not snoop the cores.  This is best for high-BW traffic, since it removes the SF bottleneck and doesn't require core interaction.
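
To make the two cases easier to compare, here is a rough sketch of the routing described above. It is only a toy model, not driver or hardware code: the function name, the dictionary keys and the constant names are made up for illustration, and the only numbers in it are the 33% and 34GBps figures quoted above (what exactly the 34GBps refers to is not spelled out here, so the sketch just calls it the base bandwidth).

```python
# Toy model of the BXT behaviour described above. All names are
# hypothetical; only the 33% and 34 GBps figures come from the message.
BASE_BW_GBPS = 34.0   # the "34GBps" figure
SF_FRACTION = 0.33    # the "33%" figure


def route(non_inclusive):
    """Summarise what happens to a GT transaction for a given
    value of the non-inclusive bit on the IDI interface."""
    if non_inclusive not in (0, 1):
        raise ValueError("non_inclusive must be 0 or 1")

    coherent = non_inclusive == 0
    return {
        "coherent": coherent,
        # Coherent traffic is looked up in the snoop filter (SF);
        # non-coherent traffic takes the SF bypass.
        "snoop_filter_lookup": coherent,
        # The SA snoops the cores only for coherent traffic.
        "cores_snooped": coherent,
        # The SF caps coherent bandwidth; None means the SF imposes
        # no cap (some other limit will still apply in practice).
        "sf_bw_cap_gbps": SF_FRACTION * BASE_BW_GBPS if coherent else None,
    }


if __name__ == "__main__":
    for bit in (0, 1):
        print(bit, route(bit))
```

With those figures the SF cap for coherent traffic works out to roughly 11.2 GBps, which is why high-BW traffic is better off on the non-coherent path.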

Thanks, Mike


-----Original Message-----
From: Tamminen, Eero T 
Sent: Tuesday, April 26, 2016 10:19 AM
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>; Deak, Imre <imre.deak@intel.com>; intel-gfx@lists.freedesktop.org; Rantala, Valtteri <valtteri.rantala@intel.com>; Frederick, Michael T <michael.t.frederick@intel.com>; Ville Syrjälä <ville.syrjala@linux.intel.com>
Subject: Re: [Intel-gfx] [PATCH v2 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config

Hi,

On 26.04.2016 17:30, Daniel Vetter wrote:
> On Tue, Apr 26, 2016 at 05:26:43PM +0300, Eero Tamminen wrote:
[...]
>> What this kernel ABI (index entry #2) has been agreed & documented to 
>> provide?
>>
>> I thought this entry is supposed to replace the writeback LLC/eLLC 
>> cache MOCS setting Mesa is using on (e.g. BDW) to speed up accesses 
>> to a memory area which it knows always to be accessed so that it can be cached.
>>
>> If app runs on HW where LLC/eLLC is missing, giving the app extra 
>> slowdown instead of potential speedup sounds like failed HW 
>> abstraction. :-)
>
> Well mesa needs to know llc vs. !llc anyway to not totally suck,

What do you think it should do with that information?

I assume you mean that Mesa needs to know the *amount* of LLC and change its behavior based on that amount, not just whether it's present.

In that case Mesa does, and always has, totally "sucked".  Mesa on earlier GENs cached everything that can be cached, and I assume it tries to do the same on GEN9.


However, based on our MOCS testing on BDW, that actually gives the best overall perf results.  On average it doesn't gain much, but it was better than any straightforward (buffer size/type) heuristic for leaving something uncached in an effort to utilize LLC "better".

It seemed that LLC is too small to have meaningful generic heuristics for normal 3D workloads (or they need to be very complex, something needing months of testing & iteration, or be per application, not generic).

eLLC could be a different matter as it's large enough that one can put e.g. color/depth buffer there.

The skip-cache setting for LLC may also be useful, if it works (as in a sense it extends the cache size), and render compression can also change things.  The problem with RBC is that it makes assumptions about memory area usage even less reliable, as you don't know how well the content compresses.


> and defining entry #2 as "coherent, always" makes sense. I thought 
> entry 0 was the reasonable default aka pte passthrough and hence managed by kernel?
>
> If mesa asks for nonsense, the kernel is happy to oblige.


	- Eero


