From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Ramalingam C <ramalingam.c@intel.com>,
	intel-gfx <intel-gfx@lists.freedesktop.org>,
	dri-devel <dri-devel@lists.freedesktop.org>
Cc: Matthew Auld <matthew.auld@intel.com>
Subject: Re: [PATCH v7 2/9] drm/i915/gt: Use XY_FAST_COLOR_BLT to clear obj on graphics ver 12+
Date: Tue, 29 Mar 2022 08:25:07 +0200
Message-ID: <6d019f4c414ecac65f4e662e730e80f4d0886b1d.camel@linux.intel.com>
In-Reply-To: <20220328190736.19697-3-ramalingam.c@intel.com>

On Tue, 2022-03-29 at 00:37 +0530, Ramalingam C wrote:
> Use faster XY_FAST_COLOR_BLT cmd on graphics version of 12 and more,
> for clearing (Zero out) the pages of the newly allocated object.
> 
> XY_FAST_COLOR_BLT is faster than the older XY_COLOR_BLT.
> 
> v2:
>   Typo fix at title [Thomas]
> v3:
>   XY_FAST_COLOR_BLT is used only for FLAT_CCS capable gen12+

Hm. It's also a huge benefit for DG1, but we can do that as a follow-up
patch.

> 
> Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  5 +++
>  drivers/gpu/drm/i915/gt/intel_migrate.c      | 43 +++++++++++++++++---
>  2 files changed, 43 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index d112ffd56418..925e55b6a94f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -205,6 +205,11 @@
>  
>  #define COLOR_BLT_CMD                  (2 << 29 | 0x40 << 22 | (5 - 2))
>  #define XY_COLOR_BLT_CMD               (2 << 29 | 0x50 << 22)
> +#define XY_FAST_COLOR_BLT_CMD          (2 << 29 | 0x44 << 22)
> +#define   XY_FAST_COLOR_BLT_DEPTH_32   (2 << 19)
> +#define   XY_FAST_COLOR_BLT_DW         16
> +#define   XY_FAST_COLOR_BLT_MOCS_MASK  GENMASK(27, 21)
> +#define   XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT 31
>  #define SRC_COPY_BLT_CMD               (2 << 29 | 0x43 << 22)
>  #define GEN9_XY_FAST_COPY_BLT_CMD      (2 << 29 | 0x42 << 22)
>  #define XY_SRC_COPY_BLT_CMD            (2 << 29 | 0x53 << 22)
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 9e6c98a17441..17dd372a47d1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -614,18 +614,51 @@ intel_context_migrate_copy(struct intel_context *ce,
>         return err;
>  }
>  
> -static int emit_clear(struct i915_request *rq, u32 offset, int size, u32 value)
> +static int emit_clear(struct i915_request *rq, u32 offset, int size,
> +                     u32 value, bool is_lmem)
>  {
> -       const int ver = GRAPHICS_VER(rq->engine->i915);
> +       struct drm_i915_private *i915 = rq->engine->i915;
> +       int mocs = rq->engine->gt->mocs.uc_index << 1;
> +       const int ver = GRAPHICS_VER(i915);
> +       int ring_sz;
>         u32 *cs;
>  
>         GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
>  
> -       cs = intel_ring_begin(rq, ver >= 8 ? 8 : 6);
> +       if (HAS_FLAT_CCS(i915) && ver >= 12)
> +               ring_sz = XY_FAST_COLOR_BLT_DW;
> +       else if (ver >= 8)
> +               ring_sz = 8;
> +       else
> +               ring_sz = 6;
> +
> +       cs = intel_ring_begin(rq, ring_sz);
>         if (IS_ERR(cs))
>                 return PTR_ERR(cs);
>  
> -       if (ver >= 8) {
> +       if (HAS_FLAT_CCS(i915) && ver >= 12) {
> +               *cs++ = XY_FAST_COLOR_BLT_CMD | XY_FAST_COLOR_BLT_DEPTH_32 |
> +                       (XY_FAST_COLOR_BLT_DW - 2);
> +               *cs++ = FIELD_PREP(XY_FAST_COLOR_BLT_MOCS_MASK, mocs) |
> +                       (PAGE_SIZE - 1);
> +               *cs++ = 0;
> +               *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> +               *cs++ = offset;
> +               *cs++ = rq->engine->instance;
> +               *cs++ = !is_lmem << XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT;
> +               /* BG7 */
> +               *cs++ = value;
> +               *cs++ = 0;
> +               *cs++ = 0;
> +               *cs++ = 0;
> +               /* BG11 */
> +               *cs++ = 0;
> +               *cs++ = 0;
> +               /* BG13 */
> +               *cs++ = 0;
> +               *cs++ = 0;
> +               *cs++ = 0;
> +       } else if (ver >= 8) {
>                 *cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
>                 *cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
>                 *cs++ = 0;
> @@ -708,7 +741,7 @@ intel_context_migrate_clear(struct intel_context *ce,
>                 if (err)
>                         goto out_rq;
>  
> -               err = emit_clear(rq, offset, len, value);
> +               err = emit_clear(rq, offset, len, value, is_lmem);
>  
>                 /* Arbitration is re-enabled between requests. */
>  out_rq:
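
For anyone checking the dword packing by hand, here is a minimal userspace
sketch of the values the new branch of emit_clear() writes. It is not kernel
code: the helper names (fast_clear_dw0() and friends, clear_ring_dwords()),
the SKETCH_* constants, the hand-expanded MOCS mask, and the 4 KiB page size
are all assumptions made for illustration. Only the XY_FAST_COLOR_BLT_*
values are copied from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted patch (made unsigned for userspace). */
#define XY_FAST_COLOR_BLT_CMD            (2u << 29 | 0x44u << 22)
#define XY_FAST_COLOR_BLT_DEPTH_32       (2u << 19)
#define XY_FAST_COLOR_BLT_DW             16u
#define XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT 31

/* Hand-expanded stand-in for GENMASK(27, 21): 7 bits starting at bit 21. */
#define SKETCH_MOCS_SHIFT 21
#define SKETCH_MOCS_MASK  (0x7fu << SKETCH_MOCS_SHIFT)

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/* First dword: opcode, 32-bit depth, and length field (total dwords - 2). */
static uint32_t fast_clear_dw0(void)
{
	return XY_FAST_COLOR_BLT_CMD | XY_FAST_COLOR_BLT_DEPTH_32 |
	       (XY_FAST_COLOR_BLT_DW - 2);
}

/* Second dword: MOCS value placed in bits 27:21, ORed with PAGE_SIZE - 1. */
static uint32_t fast_clear_dw1(uint32_t mocs)
{
	return ((mocs << SKETCH_MOCS_SHIFT) & SKETCH_MOCS_MASK) |
	       (SKETCH_PAGE_SIZE - 1);
}

/* Fourth dword: page count in the upper half, PAGE_SIZE / 4 in the lower. */
static uint32_t fast_clear_dw3(uint32_t size)
{
	uint32_t pages = size >> SKETCH_PAGE_SHIFT;

	/* Mirrors the GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX) check. */
	assert(pages <= INT16_MAX);
	return pages << 16 | SKETCH_PAGE_SIZE / 4;
}

/* Seventh dword: bit 31 is set when the target is NOT local memory. */
static uint32_t fast_clear_dw6(bool is_lmem)
{
	return (uint32_t)!is_lmem << XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT;
}

/* Ring space requested, following the three-way choice in the patch. */
static unsigned int clear_ring_dwords(bool has_flat_ccs, int ver)
{
	if (has_flat_ccs && ver >= 12)
		return XY_FAST_COLOR_BLT_DW;
	return ver >= 8 ? 8 : 6;
}
```

With these helpers the header dword works out to 0x5110000e, and the sixteen
stores in the new branch match the XY_FAST_COLOR_BLT_DW passed to
intel_ring_begin().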


