From: fei.yang@intel.com
To: intel-gfx@lists.freedesktop.org
Cc: Nirmoy Das <nirmoy.das@intel.com>, Fei Yang <fei.yang@intel.com>,
	dri-devel@lists.freedesktop.org,
	Andi Shyti <andi.shyti@linux.intel.com>
Subject: [PATCH 4/8] drm/i915/mtl: workaround coherency issue for Media
Date: Wed, 19 Apr 2023 16:00:54 -0700
Message-ID: <20230419230058.2659455-5-fei.yang@intel.com>
In-Reply-To: <20230419230058.2659455-1-fei.yang@intel.com>

From: Fei Yang <fei.yang@intel.com>

This patch implements Wa_22016122933.

In MTL, memory writes initiated by the Media tile update the whole
cache line even for partial writes. This creates a coherency
problem for cacheable memory if both CPU and GPU are writing data
to different locations within a single cache line. CTB communication
is impacted by this issue because the head and tail pointers are
adjacent words within a cache line (see struct guc_ct_buffer_desc),
where one is written by GuC and the other by the host.
This patch circumvents the issue by making CPU/GPU shared memory
uncacheable (WC on CPU side, and PAT index 2 for GPU). Also, for the
CTB, which is updated by both CPU and GuC, an mfence instruction
is added to make sure the CPU writes are visible to the GPU right away
(flush the write combining buffer).

While fixing the CTB issue, we noticed some random GSC firmware
loading failures because the shared buffers are cacheable (WB) on CPU
side but uncached on GPU side. To fix these issues we need to map
such shared buffers as WC on CPU side. Since such allocations are
not all done through the GuC allocator, to avoid too many code changes,
i915_coherent_map_type() is now hard coded to return WC for MTL.

BSpec: 45101

Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 ++++-
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +++++++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c    |  7 +++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  6 ++++++
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index ecd86130b74f..89fc8ea6bcfc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -469,7 +469,10 @@ enum i915_map_type i915_coherent_map_type(struct drm_i915_private *i915,
 					  struct drm_i915_gem_object *obj,
 					  bool always_coherent)
 {
-	if (i915_gem_object_is_lmem(obj))
+	/*
+	 * Wa_22016122933: always return I915_MAP_WC for MTL
+	 */
+	if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915))
 		return I915_MAP_WC;
 	if (HAS_LLC(i915) || always_coherent)
 		return I915_MAP_WB;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 1d9fdfb11268..236673c02f9a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -110,6 +110,13 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
 	if (obj->base.size < gsc->fw.size)
 		return -ENOSPC;
 
+	/*
+	 * Wa_22016122933: For MTL the shared memory needs to be mapped
+	 * as WC on CPU side and UC (PAT index 2) on GPU side
+	 */
+	if (IS_METEORLAKE(i915))
+		i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
 	dst = i915_gem_object_pin_map_unlocked(obj,
 					       i915_coherent_map_type(i915, obj, true));
 	if (IS_ERR(dst))
@@ -125,6 +132,12 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
 	memset(dst, 0, obj->base.size);
 	memcpy(dst, src, gsc->fw.size);
 
+	/*
+	 * Wa_22016122933: Making sure the data in dst is
+	 * visible to GSC right away
+	 */
+	intel_guc_write_barrier(&gt->uc.guc);
+
 	i915_gem_object_unpin_map(gsc->fw.obj);
 	i915_gem_object_unpin_map(obj);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index e89f16ecf1ae..c9f20385f6a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -744,6 +744,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
+	/*
+	 * Wa_22016122933: For MTL the shared memory needs to be mapped
+	 * as WC on CPU side and UC (PAT index 2) on GPU side
+	 */
+	if (IS_METEORLAKE(gt->i915))
+		i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
 	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
 	if (IS_ERR(vma))
 		goto err;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 1803a633ed64..99a0a89091e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -902,6 +902,12 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	/* now update descriptor */
 	WRITE_ONCE(desc->head, head);
 
+	/*
+	 * Wa_22016122933: Making sure the head update is
+	 * visible to GuC right away
+	 */
+	intel_guc_write_barrier(ct_to_guc(ct));
+
 	return available - len;
 
 corrupted:
-- 
2.25.1
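
For readers who want to see the hazard from the commit message in isolation, here is a minimal user-space sketch in plain C. It is not driver code and does not model real hardware. The names ctb_line, write_word, write_line, HEAD_IDX and TAIL_IDX are hypothetical, and the stale snapshot held by the firmware-side writer is an assumption made only to show how a whole-line write can discard a neighbouring word that another agent updated.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_WORDS 16	/* a 64-byte cache line viewed as 32-bit words */
#define HEAD_IDX 0	/* hypothetical slot written by the host */
#define TAIL_IDX 1	/* hypothetical adjacent slot written by the firmware */

struct ctb_line {
	uint32_t w[LINE_WORDS];
};

/* Partial write: only the addressed word in memory changes. */
static void write_word(struct ctb_line *mem, int idx, uint32_t val)
{
	assert(idx >= 0 && idx < LINE_WORDS);
	mem->w[idx] = val;
}

/*
 * Whole-line write: the writer updates one word in its own copy of the
 * line, then stores all 16 words back, including words it never touched.
 */
static void write_line(struct ctb_line *mem, struct ctb_line *copy,
		       int idx, uint32_t val)
{
	assert(idx >= 0 && idx < LINE_WORDS);
	copy->w[idx] = val;
	*mem = *copy;
}

/* Firmware writes from a stale snapshot: the host's head update is lost. */
static struct ctb_line demo_stale_line_write(void)
{
	struct ctb_line mem = { { 0 } };
	struct ctb_line fw_copy = mem;	/* snapshot taken before the host writes */

	write_word(&mem, HEAD_IDX, 5);			/* host advances head */
	write_line(&mem, &fw_copy, TAIL_IDX, 9);	/* firmware advances tail */
	return mem;
}

/* Both sides write only their own word: both updates survive. */
static struct ctb_line demo_word_writes(void)
{
	struct ctb_line mem = { { 0 } };

	write_word(&mem, HEAD_IDX, 5);
	write_word(&mem, TAIL_IDX, 9);
	return mem;
}
```

In this model demo_stale_line_write() ends with head back at 0 and tail at 9, while demo_word_writes() keeps head at 5 and tail at 9. The patch itself avoids the problem differently, by keeping the shared memory out of the caches (WC on the CPU side, PAT index 2 on the GPU side) and adding a write barrier after CPU updates.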