From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=NHDR=MF=lists.freedesktop.org=dri-devel-bounces@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 97C0BC11F66
	for <dri-devel@archiver.kernel.org>; Tue, 13 Jul 2021 11:42:34 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 6444B60240
	for <dri-devel@archiver.kernel.org>; Tue, 13 Jul 2021 11:42:34 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6444B60240
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 43EDE6E08A;
	Tue, 13 Jul 2021 11:42:32 +0000 (UTC)
Received: from mga18.intel.com (mga18.intel.com [134.134.136.126])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 2B1696E086;
 Tue, 13 Jul 2021 11:42:30 +0000 (UTC)
X-IronPort-AV: E=McAfee;i="6200,9189,10043"; a="197418678"
X-IronPort-AV: E=Sophos;i="5.84,236,1620716400"; d="scan'208";a="197418678"
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 13 Jul 2021 04:42:23 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.84,236,1620716400"; d="scan'208";a="492829748"
Received: from gaia.fi.intel.com ([10.237.72.192])
 by FMSMGA003.fm.intel.com with ESMTP; 13 Jul 2021 04:42:20 -0700
Received: by gaia.fi.intel.com (Postfix, from userid 1000)
 id DF3AA5C1F0F; Tue, 13 Jul 2021 14:41:30 +0300 (EEST)
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Matthew Auld <matthew.auld@intel.com>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits
In-Reply-To: <20210713104554.2381406-1-matthew.auld@intel.com>
References: <20210713104554.2381406-1-matthew.auld@intel.com>
Date: Tue, 13 Jul 2021 14:41:30 +0300
Message-ID: <87wnputp91.fsf@gaia.fi.intel.com>
MIME-Version: 1.0
Content-Type: text/plain
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
 <dri-devel.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/dri-devel>,
 <mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/dri-devel>,
 <mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, dri-devel@lists.freedesktop.org
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Matthew Auld <matthew.auld@intel.com> writes:

> Try to document the object caching related bits, like cache_coherent and
> cache_dirty.
>
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 135 +++++++++++++++++-
>  drivers/gpu/drm/i915/i915_drv.h               |   9 --
>  2 files changed, 131 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index ef3de2ae9723..02c3529b774c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -92,6 +92,57 @@ struct drm_i915_gem_object_ops {
>  	const char *name; /* friendly name for debug, e.g. lockdep classes */
>  };
>  
> +/**
> + * enum i915_cache_level - The supported GTT caching values for system memory
> + * pages.
> + *
> + * These translate to some special GTT PTE bits when binding pages into some
> + * address space. It also determines whether an object, or rather its pages are
> + * coherent with the GPU, when also reading or writing through the CPU cache
> + * with those pages.
> + *
> + * Userspace can also control this through struct drm_i915_gem_caching.
> + */
> +enum i915_cache_level {
> +	/**
> +	 * @I915_CACHE_NONE:
> +	 *
> +	 * Not coherent with the CPU cache. If the cache is dirty and we need
> +	 * the underlying pages to be coherent with some later GPU access then
> +	 * we need to manually flush the pages.
> +	 *
> +	 * Note that on shared-LLC platforms reads through the CPU cache are
> +	 * still coherent even with this setting. See also
> +	 * I915_BO_CACHE_COHERENT_FOR_READ for more details.
> +	 */
> +	I915_CACHE_NONE = 0,
> +	/**
> +	 * @I915_CACHE_LLC:
> +	 *
> +	 * Coherent with the CPU cache. If the cache is dirty, then the GPU will
> +	 * ensure that access remains coherent, when both reading and writing
> +	 * through the CPU cache.
> +	 *
> +	 * Applies to both platforms with shared-LLC(HAS_LLC), and snooping
> +	 * based platforms(HAS_SNOOP).
> +	 */
> +	I915_CACHE_LLC,
> +	/**
> +	 * @I915_CACHE_L3_LLC:
> +	 *
> +	 * gen7+, L3 sits between the domain specifc caches, eg sampler/render

typo: specifc

> +	 * caches, and the large Last-Level-Cache. LLC is coherent with the CPU,
> +	 * but L3 is only visible to the GPU.
> +	 */

I dont get the difference between this and I915_CACHE_LLC.
Could the diff between LLC and L3_LLC be described here with example?

Thanks,
-Mika

> +	I915_CACHE_L3_LLC,
> +	/**
> +	 * @I915_CACHE_WT:
> +	 *
> +	 * hsw:gt3e Write-through for scanout buffers.
> +	 */
> +	I915_CACHE_WT,
> +};
> +
>  enum i915_map_type {
>  	I915_MAP_WB = 0,
>  	I915_MAP_WC,
> @@ -228,14 +279,90 @@ struct drm_i915_gem_object {
>  	unsigned int mem_flags;
>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
>  #define I915_BO_FLAG_IOMEM       BIT(1) /* Object backed by IO memory */
> -	/*
> -	 * Is the object to be mapped as read-only to the GPU
> -	 * Only honoured if hardware has relevant pte bit
> +	/**
> +	 * @cache_level: The desired GTT caching level.
> +	 *
> +	 * See enum i915_cache_level for possible values, along with what
> +	 * each does.
>  	 */
>  	unsigned int cache_level:3;
> -	unsigned int cache_coherent:2;
> +	/**
> +	 * @cache_coherent:
> +	 *
> +	 * Track whether the pages are coherent with the GPU if reading or
> +	 * writing through the CPU cache.
> +	 *
> +	 * This largely depends on the @cache_level, for example if the object
> +	 * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> +	 * reads and writes through the CPU cache.
> +	 *
> +	 * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> +	 * the CPU cache are always coherent, regardless of the @cache_level. On
> +	 * snooping based platforms this is not the case, unless the full
> +	 * I915_CACHE_LLC or similar setting is used.
> +	 *
> +	 * As a result of this we need to track coherency separately for reads
> +	 * and writes, in order to avoid superfluous flushing on shared-LLC
> +	 * platforms, for reads.
> +	 *
> +	 * I915_BO_CACHE_COHERENT_FOR_READ:
> +	 *
> +	 * When reading through the CPU cache, the GPU is still coherent. Note
> +	 * that no data has actually been modified here, so it might seem
> +	 * strange that we care about this.
> +	 *
> +	 * As an example, if some object is mapped on the CPU with write-back
> +	 * caching, and we read some page, then the cache likely now contains
> +	 * the data from that read. At this point the cache and main memory
> +	 * match up, so all good. But next the GPU needs to write some data to
> +	 * that same page. Now if the @cache_level is I915_CACHE_NONE and the
> +	 * the platform doesn't have the shared-LLC, then the GPU will
> +	 * effectively skip invalidating the cache(or however that works
> +	 * internally) when writing the new value.  This is really bad since the
> +	 * GPU has just written some new data to main memory, but the CPU cache
> +	 * is still valid and now contains stale data. As a result the next time
> +	 * we do a cached read with the CPU, we are rewarded with stale data.
> +	 * Likewise if the cache is later flushed, we might be rewarded with
> +	 * overwriting main memory with stale data.
> +	 *
> +	 * I915_BO_CACHE_COHERENT_FOR_WRITE:
> +	 *
> +	 * When writing through the CPU cache, the GPU is still coherent. Note
> +	 * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> +	 *
> +	 * This is never set when I915_CACHE_NONE is used for @cache_level,
> +	 * where instead we have to manually flush the caches after writing
> +	 * through the CPU cache. For other cache levels this should be set and
> +	 * the object is therefore considered coherent for both reads and writes
> +	 * through the CPU cache.
> +	 */
>  #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0)
>  #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1)
> +	unsigned int cache_coherent:2;
> +	/**
> +	 * @cache_dirty:
> +	 *
> +	 * Track if the cache might be dirty for the @pages i.e it has yet to be
> +	 * written back to main memory. As a result reading directly from main
> +	 * memory might yield stale data.
> +	 *
> +	 * This also ties into whether the kernel is tracking the object as
> +	 * coherent with the GPU, as per @cache_coherent, as it determines if
> +	 * flushing might be needed at various points.
> +	 *
> +	 * Another part of @cache_dirty is managing flushing when first
> +	 * acquiring the pages for system memory, at this point the pages are
> +	 * considered foreign, so the default assumption is that the cache is
> +	 * dirty, for example the page zeroing done my the kernel might leave
> +	 * writes though the CPU cache, or swapping-in, while the actual data in
> +	 * main memory is potentially stale.  Note that this is a potential
> +	 * security issue when dealing with userspace objects and zeroing. Now,
> +	 * whether we actually need apply the big sledgehammer of flushing all
> +	 * the pages on acquire depends on if @cache_coherent is marked as
> +	 * I915_BO_CACHE_COHERENT_FOR_WRITE, i.e that the GPU will be coherent
> +	 * for both reads and writes though the CPU cache. So pretty much this
> +	 * should only be needed for I915_CACHE_NONE objects.
> +	 */
>  	unsigned int cache_dirty:1;
>  
>  	/**
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c4747f4407ef..37bb1a3cadd4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -394,15 +394,6 @@ struct drm_i915_display_funcs {
>  	void (*read_luts)(struct intel_crtc_state *crtc_state);
>  };
>  
> -enum i915_cache_level {
> -	I915_CACHE_NONE = 0,
> -	I915_CACHE_LLC, /* also used for snoopable memory on non-LLC */
> -	I915_CACHE_L3_LLC, /* gen7+, L3 sits between the domain specifc
> -			      caches, eg sampler/render caches, and the
> -			      large Last-Level-Cache. LLC is coherent with
> -			      the CPU, but L3 is only visible to the GPU. */
> -	I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */
> -};
>  
>  #define I915_COLOR_UNEVICTABLE (-1) /* a non-vma sharing the address space */
>  
> -- 
> 2.26.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=tFHw=MF=lists.freedesktop.org=intel-gfx-bounces@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 30FB2C07E95
	for <intel-gfx@archiver.kernel.org>; Tue, 13 Jul 2021 11:42:34 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id E41BA611C0
	for <intel-gfx@archiver.kernel.org>; Tue, 13 Jul 2021 11:42:31 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E41BA611C0
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 8D0276E086;
	Tue, 13 Jul 2021 11:42:31 +0000 (UTC)
Received: from mga18.intel.com (mga18.intel.com [134.134.136.126])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 2B1696E086;
 Tue, 13 Jul 2021 11:42:30 +0000 (UTC)
X-IronPort-AV: E=McAfee;i="6200,9189,10043"; a="197418678"
X-IronPort-AV: E=Sophos;i="5.84,236,1620716400"; d="scan'208";a="197418678"
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 13 Jul 2021 04:42:23 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.84,236,1620716400"; d="scan'208";a="492829748"
Received: from gaia.fi.intel.com ([10.237.72.192])
 by FMSMGA003.fm.intel.com with ESMTP; 13 Jul 2021 04:42:20 -0700
Received: by gaia.fi.intel.com (Postfix, from userid 1000)
 id DF3AA5C1F0F; Tue, 13 Jul 2021 14:41:30 +0300 (EEST)
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Matthew Auld <matthew.auld@intel.com>, intel-gfx@lists.freedesktop.org
In-Reply-To: <20210713104554.2381406-1-matthew.auld@intel.com>
References: <20210713104554.2381406-1-matthew.auld@intel.com>
Date: Tue, 13 Jul 2021 14:41:30 +0300
Message-ID: <87wnputp91.fsf@gaia.fi.intel.com>
MIME-Version: 1.0
Subject: Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Intel graphics driver community testing & development
 <intel-gfx.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, dri-devel@lists.freedesktop.org
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Matthew Auld <matthew.auld@intel.com> writes:

> Try to document the object caching related bits, like cache_coherent and
> cache_dirty.
>
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 135 +++++++++++++++++-
>  drivers/gpu/drm/i915/i915_drv.h               |   9 --
>  2 files changed, 131 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index ef3de2ae9723..02c3529b774c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -92,6 +92,57 @@ struct drm_i915_gem_object_ops {
>  	const char *name; /* friendly name for debug, e.g. lockdep classes */
>  };
>  
> +/**
> + * enum i915_cache_level - The supported GTT caching values for system memory
> + * pages.
> + *
> + * These translate to some special GTT PTE bits when binding pages into some
> + * address space. It also determines whether an object, or rather its pages are
> + * coherent with the GPU, when also reading or writing through the CPU cache
> + * with those pages.
> + *
> + * Userspace can also control this through struct drm_i915_gem_caching.
> + */
> +enum i915_cache_level {
> +	/**
> +	 * @I915_CACHE_NONE:
> +	 *
> +	 * Not coherent with the CPU cache. If the cache is dirty and we need
> +	 * the underlying pages to be coherent with some later GPU access then
> +	 * we need to manually flush the pages.
> +	 *
> +	 * Note that on shared-LLC platforms reads through the CPU cache are
> +	 * still coherent even with this setting. See also
> +	 * I915_BO_CACHE_COHERENT_FOR_READ for more details.
> +	 */
> +	I915_CACHE_NONE = 0,
> +	/**
> +	 * @I915_CACHE_LLC:
> +	 *
> +	 * Coherent with the CPU cache. If the cache is dirty, then the GPU will
> +	 * ensure that access remains coherent, when both reading and writing
> +	 * through the CPU cache.
> +	 *
> +	 * Applies to both platforms with shared-LLC(HAS_LLC), and snooping
> +	 * based platforms(HAS_SNOOP).
> +	 */
> +	I915_CACHE_LLC,
> +	/**
> +	 * @I915_CACHE_L3_LLC:
> +	 *
> +	 * gen7+, L3 sits between the domain specifc caches, eg sampler/render

typo: specifc

> +	 * caches, and the large Last-Level-Cache. LLC is coherent with the CPU,
> +	 * but L3 is only visible to the GPU.
> +	 */

I dont get the difference between this and I915_CACHE_LLC.
Could the diff between LLC and L3_LLC be described here with example?

Thanks,
-Mika

> +	I915_CACHE_L3_LLC,
> +	/**
> +	 * @I915_CACHE_WT:
> +	 *
> +	 * hsw:gt3e Write-through for scanout buffers.
> +	 */
> +	I915_CACHE_WT,
> +};
> +
>  enum i915_map_type {
>  	I915_MAP_WB = 0,
>  	I915_MAP_WC,
> @@ -228,14 +279,90 @@ struct drm_i915_gem_object {
>  	unsigned int mem_flags;
>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
>  #define I915_BO_FLAG_IOMEM       BIT(1) /* Object backed by IO memory */
> -	/*
> -	 * Is the object to be mapped as read-only to the GPU
> -	 * Only honoured if hardware has relevant pte bit
> +	/**
> +	 * @cache_level: The desired GTT caching level.
> +	 *
> +	 * See enum i915_cache_level for possible values, along with what
> +	 * each does.
>  	 */
>  	unsigned int cache_level:3;
> -	unsigned int cache_coherent:2;
> +	/**
> +	 * @cache_coherent:
> +	 *
> +	 * Track whether the pages are coherent with the GPU if reading or
> +	 * writing through the CPU cache.
> +	 *
> +	 * This largely depends on the @cache_level, for example if the object
> +	 * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> +	 * reads and writes through the CPU cache.
> +	 *
> +	 * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> +	 * the CPU cache are always coherent, regardless of the @cache_level. On
> +	 * snooping based platforms this is not the case, unless the full
> +	 * I915_CACHE_LLC or similar setting is used.
> +	 *
> +	 * As a result of this we need to track coherency separately for reads
> +	 * and writes, in order to avoid superfluous flushing on shared-LLC
> +	 * platforms, for reads.
> +	 *
> +	 * I915_BO_CACHE_COHERENT_FOR_READ:
> +	 *
> +	 * When reading through the CPU cache, the GPU is still coherent. Note
> +	 * that no data has actually been modified here, so it might seem
> +	 * strange that we care about this.
> +	 *
> +	 * As an example, if some object is mapped on the CPU with write-back
> +	 * caching, and we read some page, then the cache likely now contains
> +	 * the data from that read. At this point the cache and main memory
> +	 * match up, so all good. But next the GPU needs to write some data to
> +	 * that same page. Now if the @cache_level is I915_CACHE_NONE and the
> +	 * the platform doesn't have the shared-LLC, then the GPU will
> +	 * effectively skip invalidating the cache(or however that works
> +	 * internally) when writing the new value.  This is really bad since the
> +	 * GPU has just written some new data to main memory, but the CPU cache
> +	 * is still valid and now contains stale data. As a result the next time
> +	 * we do a cached read with the CPU, we are rewarded with stale data.
> +	 * Likewise if the cache is later flushed, we might be rewarded with
> +	 * overwriting main memory with stale data.
> +	 *
> +	 * I915_BO_CACHE_COHERENT_FOR_WRITE:
> +	 *
> +	 * When writing through the CPU cache, the GPU is still coherent. Note
> +	 * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> +	 *
> +	 * This is never set when I915_CACHE_NONE is used for @cache_level,
> +	 * where instead we have to manually flush the caches after writing
> +	 * through the CPU cache. For other cache levels this should be set and
> +	 * the object is therefore considered coherent for both reads and writes
> +	 * through the CPU cache.
> +	 */
>  #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0)
>  #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1)
> +	unsigned int cache_coherent:2;
> +	/**
> +	 * @cache_dirty:
> +	 *
> +	 * Track if the cache might be dirty for the @pages i.e it has yet to be
> +	 * written back to main memory. As a result reading directly from main
> +	 * memory might yield stale data.
> +	 *
> +	 * This also ties into whether the kernel is tracking the object as
> +	 * coherent with the GPU, as per @cache_coherent, as it determines if
> +	 * flushing might be needed at various points.
> +	 *
> +	 * Another part of @cache_dirty is managing flushing when first
> +	 * acquiring the pages for system memory, at this point the pages are
> +	 * considered foreign, so the default assumption is that the cache is
> +	 * dirty, for example the page zeroing done my the kernel might leave
> +	 * writes though the CPU cache, or swapping-in, while the actual data in
> +	 * main memory is potentially stale.  Note that this is a potential
> +	 * security issue when dealing with userspace objects and zeroing. Now,
> +	 * whether we actually need apply the big sledgehammer of flushing all
> +	 * the pages on acquire depends on if @cache_coherent is marked as
> +	 * I915_BO_CACHE_COHERENT_FOR_WRITE, i.e that the GPU will be coherent
> +	 * for both reads and writes though the CPU cache. So pretty much this
> +	 * should only be needed for I915_CACHE_NONE objects.
> +	 */
>  	unsigned int cache_dirty:1;
>  
>  	/**
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c4747f4407ef..37bb1a3cadd4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -394,15 +394,6 @@ struct drm_i915_display_funcs {
>  	void (*read_luts)(struct intel_crtc_state *crtc_state);
>  };
>  
> -enum i915_cache_level {
> -	I915_CACHE_NONE = 0,
> -	I915_CACHE_LLC, /* also used for snoopable memory on non-LLC */
> -	I915_CACHE_L3_LLC, /* gen7+, L3 sits between the domain specifc
> -			      caches, eg sampler/render caches, and the
> -			      large Last-Level-Cache. LLC is coherent with
> -			      the CPU, but L3 is only visible to the GPU. */
> -	I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */
> -};
>  
>  #define I915_COLOR_UNEVICTABLE (-1) /* a non-vma sharing the address space */
>  
> -- 
> 2.26.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx