All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Auld <matthew.william.auld@gmail.com>
To: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
	Matthew Auld <matthew.auld@intel.com>,
	ML dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits
Date: Tue, 13 Jul 2021 17:13:37 +0100	[thread overview]
Message-ID: <CAM0jSHOx=WVbzfQzn=kL-5qaG4B3dxPLOimkvUdv6HFJymZeZw@mail.gmail.com> (raw)
In-Reply-To: <YO23Y3PUS22FaXDC@intel.com>

On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > +     /**
> > +      * @cache_coherent:
> > +      *
> > +      * Track whether the pages are coherent with the GPU if reading or
> > +      * writing through the CPU cache.
> > +      *
> > +      * This largely depends on the @cache_level, for example if the object
> > +      * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> > +      * reads and writes through the CPU cache.
> > +      *
> > +      * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> > +      * the CPU cache are always coherent, regardless of the @cache_level. On
> > +      * snooping based platforms this is not the case, unless the full
> > +      * I915_CACHE_LLC or similar setting is used.
> > +      *
> > +      * As a result of this we need to track coherency separately for reads
> > +      * and writes, in order to avoid superfluous flushing on shared-LLC
> > +      * platforms, for reads.
> > +      *
> > +      * I915_BO_CACHE_COHERENT_FOR_READ:
> > +      *
> > +      * When reading through the CPU cache, the GPU is still coherent. Note
> > +      * that no data has actually been modified here, so it might seem
> > +      * strange that we care about this.
> > +      *
> > +      * As an example, if some object is mapped on the CPU with write-back
> > +      * caching, and we read some page, then the cache likely now contains
> > +      * the data from that read. At this point the cache and main memory
> > +      * match up, so all good. But next the GPU needs to write some data to
> > +      * that same page. Now if the @cache_level is I915_CACHE_NONE and the
> > +      * the platform doesn't have the shared-LLC, then the GPU will
> > +      * effectively skip invalidating the cache(or however that works
> > +      * internally) when writing the new value.  This is really bad since the
> > +      * GPU has just written some new data to main memory, but the CPU cache
> > +      * is still valid and now contains stale data. As a result the next time
> > +      * we do a cached read with the CPU, we are rewarded with stale data.
> > +      * Likewise if the cache is later flushed, we might be rewarded with
> > +      * overwriting main memory with stale data.
> > +      *
> > +      * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > +      *
> > +      * When writing through the CPU cache, the GPU is still coherent. Note
> > +      * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > +      *
> > +      * This is never set when I915_CACHE_NONE is used for @cache_level,
> > +      * where instead we have to manually flush the caches after writing
> > +      * through the CPU cache. For other cache levels this should be set and
> > +      * the object is therefore considered coherent for both reads and writes
> > +      * through the CPU cache.
>
> I don't remember why we have this read vs. write split and this new
> documentation doesn't seem to really explain it either.

Hmm, I attempted to explain that earlier:

* Note that on platforms with shared-LLC support(HAS_LLC) reads through
* the CPU cache are always coherent, regardless of the @cache_level. On
* snooping based platforms this is not the case, unless the full
* I915_CACHE_LLC or similar setting is used.
*
* As a result of this we need to track coherency separately for reads
* and writes, in order to avoid superfluous flushing on shared-LLC
* platforms, for reads.

So AFAIK it's just because shared-LLC can be coherent for reads, while
also not being coherent for writes(CACHE_NONE), so being able to track
each separately is kind of needed to avoid unnecessary flushing for
the read cases i.e simple boolean for coherent vs non-coherent is not
enough.

I can try to reword things to make that more clear.

>
> Is it for optimizing some display related case where we can omit the
> invalidates but still have to do the writeback to keep the display
> engine happy?
>
> --
> Ville Syrjälä
> Intel

WARNING: multiple messages have this Message-ID (diff)
From: Matthew Auld <matthew.william.auld@gmail.com>
To: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
	Matthew Auld <matthew.auld@intel.com>,
	ML dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits
Date: Tue, 13 Jul 2021 17:13:37 +0100	[thread overview]
Message-ID: <CAM0jSHOx=WVbzfQzn=kL-5qaG4B3dxPLOimkvUdv6HFJymZeZw@mail.gmail.com> (raw)
In-Reply-To: <YO23Y3PUS22FaXDC@intel.com>

On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > +     /**
> > +      * @cache_coherent:
> > +      *
> > +      * Track whether the pages are coherent with the GPU if reading or
> > +      * writing through the CPU cache.
> > +      *
> > +      * This largely depends on the @cache_level, for example if the object
> > +      * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> > +      * reads and writes through the CPU cache.
> > +      *
> > +      * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> > +      * the CPU cache are always coherent, regardless of the @cache_level. On
> > +      * snooping based platforms this is not the case, unless the full
> > +      * I915_CACHE_LLC or similar setting is used.
> > +      *
> > +      * As a result of this we need to track coherency separately for reads
> > +      * and writes, in order to avoid superfluous flushing on shared-LLC
> > +      * platforms, for reads.
> > +      *
> > +      * I915_BO_CACHE_COHERENT_FOR_READ:
> > +      *
> > +      * When reading through the CPU cache, the GPU is still coherent. Note
> > +      * that no data has actually been modified here, so it might seem
> > +      * strange that we care about this.
> > +      *
> > +      * As an example, if some object is mapped on the CPU with write-back
> > +      * caching, and we read some page, then the cache likely now contains
> > +      * the data from that read. At this point the cache and main memory
> > +      * match up, so all good. But next the GPU needs to write some data to
> > +      * that same page. Now if the @cache_level is I915_CACHE_NONE and the
> > +      * the platform doesn't have the shared-LLC, then the GPU will
> > +      * effectively skip invalidating the cache(or however that works
> > +      * internally) when writing the new value.  This is really bad since the
> > +      * GPU has just written some new data to main memory, but the CPU cache
> > +      * is still valid and now contains stale data. As a result the next time
> > +      * we do a cached read with the CPU, we are rewarded with stale data.
> > +      * Likewise if the cache is later flushed, we might be rewarded with
> > +      * overwriting main memory with stale data.
> > +      *
> > +      * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > +      *
> > +      * When writing through the CPU cache, the GPU is still coherent. Note
> > +      * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > +      *
> > +      * This is never set when I915_CACHE_NONE is used for @cache_level,
> > +      * where instead we have to manually flush the caches after writing
> > +      * through the CPU cache. For other cache levels this should be set and
> > +      * the object is therefore considered coherent for both reads and writes
> > +      * through the CPU cache.
>
> I don't remember why we have this read vs. write split and this new
> documentation doesn't seem to really explain it either.

Hmm, I attempted to explain that earlier:

* Note that on platforms with shared-LLC support(HAS_LLC) reads through
* the CPU cache are always coherent, regardless of the @cache_level. On
* snooping based platforms this is not the case, unless the full
* I915_CACHE_LLC or similar setting is used.
*
* As a result of this we need to track coherency separately for reads
* and writes, in order to avoid superfluous flushing on shared-LLC
* platforms, for reads.

So AFAIK it's just because shared-LLC can be coherent for reads, while
also not being coherent for writes(CACHE_NONE), so being able to track
each separately is kind of needed to avoid unnecessary flushing for
the read cases i.e simple boolean for coherent vs non-coherent is not
enough.

I can try to reword things to make that more clear.

>
> Is it for optimizing some display related case where we can omit the
> invalidates but still have to do the writeback to keep the display
> engine happy?
>
> --
> Ville Syrjälä
> Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2021-07-13 16:14 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-13 10:45 [PATCH 1/5] drm/i915: document caching related bits Matthew Auld
2021-07-13 10:45 ` [Intel-gfx] " Matthew Auld
2021-07-13 10:45 ` [PATCH 2/5] drm/i915/uapi: convert drm_i915_gem_madvise to kernel-doc Matthew Auld
2021-07-13 10:45   ` [Intel-gfx] " Matthew Auld
2021-07-13 10:45 ` [PATCH 3/5] drm/i915: convert drm_i915_gem_object " Matthew Auld
2021-07-13 10:45   ` [Intel-gfx] " Matthew Auld
2021-07-13 10:45 ` [PATCH 4/5] drm/i915: pull in some more kernel-doc Matthew Auld
2021-07-13 10:45   ` [Intel-gfx] " Matthew Auld
2021-07-13 10:45 ` [PATCH 5/5] drm/i915/ehl: unconditionally flush the pages on acquire Matthew Auld
2021-07-13 10:45   ` [Intel-gfx] " Matthew Auld
2021-07-14 11:19   ` Daniel Vetter
2021-07-14 11:19     ` [Intel-gfx] " Daniel Vetter
2021-07-13 11:12 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/5] drm/i915: document caching related bits Patchwork
2021-07-13 11:39 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-07-13 11:41 ` [Intel-gfx] [PATCH 1/5] " Mika Kuoppala
2021-07-13 11:41   ` Mika Kuoppala
2021-07-13 12:59 ` [Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [1/5] " Patchwork
2021-07-13 15:55 ` [Intel-gfx] [PATCH 1/5] " Ville Syrjälä
2021-07-13 15:55   ` Ville Syrjälä
2021-07-13 16:13   ` Matthew Auld [this message]
2021-07-13 16:13     ` Matthew Auld
2021-07-13 16:50     ` Daniel Vetter
2021-07-13 16:50       ` Daniel Vetter
2021-07-13 17:47     ` Ville Syrjälä
2021-07-13 17:47       ` Ville Syrjälä
2021-07-13 18:24       ` Matthew Auld
2021-07-13 18:24         ` Matthew Auld
2021-07-13 18:46         ` Ville Syrjälä
2021-07-13 18:46           ` Ville Syrjälä
2021-07-14 11:16           ` Daniel Vetter
2021-07-14 11:16             ` Daniel Vetter
2021-07-14 11:42             ` Ville Syrjälä
2021-07-14 11:42               ` Ville Syrjälä
2021-07-14 12:08               ` Ville Syrjälä
2021-07-14 12:08                 ` Ville Syrjälä
2021-07-14 12:57               ` Daniel Vetter
2021-07-14 12:57                 ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM0jSHOx=WVbzfQzn=kL-5qaG4B3dxPLOimkvUdv6HFJymZeZw@mail.gmail.com' \
    --to=matthew.william.auld@gmail.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    --cc=ville.syrjala@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.