* [PATCH 00/12] Catchup with a few dropped patches @ 2021-05-26 14:14 ` Tvrtko Ursulin 0 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: dri-devel, Tvrtko Ursulin From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> A small chunk of dropped and mostly already reviewed patches (a couple need their reviews updated due to rebasing I had to do) with the goal of getting to actual fixes in the next round. Chris Wilson (12): drm/i915: Take rcu_read_lock for querying fence's driver/timeline names drm/i915: Remove notion of GEM from i915_gem_shrinker_taints_mutex drm/i915: Lift marking a lock as used to utils drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper drm/i915/selftests: Set cache status for huge_gem_object drm/i915/selftests: Use a coherent map to setup scratch batch buffers drm/i915/selftests: Replace the unbounded set-domain with an explicit wait drm/i915/selftests: Remove redundant set-to-gtt-domain drm/i915/selftests: Replace unbound set-domain waits with explicit timeouts drm/i915/selftests: Replace an unbounded set-domain wait with a timeout drm/i915/selftests: Remove redundant set-to-gtt-domain before batch submission drm/i915/gem: Manage all set-domain waits explicitly drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 9 +- drivers/gpu/drm/i915/gem/i915_gem_clflush.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 163 +++++------------- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 12 +- .../gpu/drm/i915/gem/i915_gem_object_types.h | 6 + drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 14 -- drivers/gpu/drm/i915/gem/i915_gem_shrinker.h | 2 - .../gpu/drm/i915/gem/selftests/huge_pages.c | 22 +-- .../i915/gem/selftests/i915_gem_client_blt.c | 26 ++- .../i915/gem/selftests/i915_gem_coherency.c | 31 +++- .../drm/i915/gem/selftests/i915_gem_context.c | 18 +- .../drm/i915/gem/selftests/i915_gem_mman.c | 16 -- .../drm/i915/gem/selftests/i915_gem_phys.c | 8 +- .../drm/i915/gem/selftests/igt_gem_utils.c | 3 + drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_reset.c | 2 +- .../gpu/drm/i915/gt/selftest_workarounds.c | 107 +++++------- drivers/gpu/drm/i915/i915_gem.c | 4 +- drivers/gpu/drm/i915/i915_sw_fence.c | 2 + drivers/gpu/drm/i915/i915_utils.c | 28 +++ drivers/gpu/drm/i915/i915_utils.h | 41 +++++ drivers/gpu/drm/i915/selftests/i915_vma.c | 6 - .../drm/i915/selftests/intel_memory_region.c | 7 +- 26 files changed, 240 insertions(+), 312 deletions(-) -- 2.30.2 ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 01/12] drm/i915: Take rcu_read_lock for querying fence's driver/timeline names 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Mika Kuoppala, Tvrtko Ursulin, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> The name very often may be freed independently of the fence, with the only protection being RCU. To be safe as we read the names, hold RCU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/i915_sw_fence.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index 2744558f3050..dfabf291e5cd 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -430,11 +430,13 @@ static void timer_i915_sw_fence_wake(struct timer_list *t) if (!fence) return; + rcu_read_lock(); pr_notice("Asynchronous wait on fence %s:%s:%llx timed out (hint:%ps)\n", cb->dma->ops->get_driver_name(cb->dma), cb->dma->ops->get_timeline_name(cb->dma), cb->dma->seqno, i915_sw_fence_debug_hint(fence)); + rcu_read_unlock(); i915_sw_fence_set_error_once(fence, -ETIMEDOUT); i915_sw_fence_complete(fence); -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [Intel-gfx] [PATCH 01/12] drm/i915: Take rcu_read_lock for querying fence's driver/timeline names 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-27 10:46 ` Daniel Vetter -1 siblings, 0 replies; 38+ messages in thread From: Daniel Vetter @ 2021-05-27 10:46 UTC (permalink / raw) To: Tvrtko Ursulin; +Cc: Intel-gfx, dri-devel, Chris Wilson On Wed, May 26, 2021 at 03:14:45PM +0100, Tvrtko Ursulin wrote: > From: Chris Wilson <chris@chris-wilson.co.uk> > > The name very often may be freed independently of the fence, with the > only protection being RCU. To be safe as we read the names, hold RCU. Yeah no. If it's not clear why, figure it out first. -Daniel > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > --- > drivers/gpu/drm/i915/i915_sw_fence.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c > index 2744558f3050..dfabf291e5cd 100644 > --- a/drivers/gpu/drm/i915/i915_sw_fence.c > +++ b/drivers/gpu/drm/i915/i915_sw_fence.c > @@ -430,11 +430,13 @@ static void timer_i915_sw_fence_wake(struct timer_list *t) > if (!fence) > return; > > + rcu_read_lock(); > pr_notice("Asynchronous wait on fence %s:%s:%llx timed out (hint:%ps)\n", > cb->dma->ops->get_driver_name(cb->dma), > cb->dma->ops->get_timeline_name(cb->dma), > cb->dma->seqno, > i915_sw_fence_debug_hint(fence)); > + rcu_read_unlock(); > > i915_sw_fence_set_error_once(fence, -ETIMEDOUT); > i915_sw_fence_complete(fence); > -- > 2.30.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 02/12] drm/i915: Remove notion of GEM from i915_gem_shrinker_taints_mutex 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Thomas Hellström, Tvrtko Ursulin, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Since we dropped the use of dev->struct_mutex from inside the shrinker, we no longer include that as part of our fs_reclaim tainting. We can drop the i915 argument and rebrand it as a generic fs_reclaim tainter. v2 (Tvrtko): * Rebase. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> # v1 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 14 -------------- drivers/gpu/drm/i915/gem/i915_gem_shrinker.h | 2 -- drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_reset.c | 2 +- drivers/gpu/drm/i915/i915_utils.c | 13 +++++++++++++ drivers/gpu/drm/i915/i915_utils.h | 2 ++ 6 files changed, 17 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index f4fb68e8955a..d68679a89d93 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -438,20 +438,6 @@ void i915_gem_driver_unregister__shrinker(struct drm_i915_private *i915) unregister_shrinker(&i915->mm.shrinker); } -void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915, - struct mutex *mutex) -{ - if (!IS_ENABLED(CONFIG_LOCKDEP)) - return; - - fs_reclaim_acquire(GFP_KERNEL); - - mutex_acquire(&mutex->dep_map, 0, 0, _RET_IP_); - mutex_release(&mutex->dep_map, _RET_IP_); - - fs_reclaim_release(GFP_KERNEL); -} - #define obj_to_i915(obj__) to_i915((obj__)->base.dev) void 
i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h index 8512470f6fd6..17ad82ea961f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h @@ -27,7 +27,5 @@ unsigned long i915_gem_shrink(struct i915_gem_ww_ctx *ww, unsigned long i915_gem_shrink_all(struct drm_i915_private *i915); void i915_gem_driver_register__shrinker(struct drm_i915_private *i915); void i915_gem_driver_unregister__shrinker(struct drm_i915_private *i915); -void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915, - struct mutex *mutex); #endif /* __I915_GEM_SHRINKER_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 9b98f9d9faa3..70d207057ce5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -156,7 +156,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) lockdep_set_subclass(&vm->mutex, subclass); if (!intel_vm_no_concurrent_access_wa(vm->i915)) { - i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex); + fs_reclaim_taints_mutex(&vm->mutex); } else { /* * CHV + BXT VTD workaround use stop_machine(), diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index d5094be6d90f..213dcfef68b1 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -1405,7 +1405,7 @@ void intel_gt_init_reset(struct intel_gt *gt) * within the shrinker, we forbid ourselves from performing any * fs-reclaim or taking related locks during reset. */ - i915_gem_shrinker_taints_mutex(gt->i915, >->reset.mutex); + fs_reclaim_taints_mutex(>->reset.mutex); /* no GPU until we are ready! 
*/ __set_bit(I915_WEDGED, >->reset.flags); diff --git a/drivers/gpu/drm/i915/i915_utils.c b/drivers/gpu/drm/i915/i915_utils.c index f9e780dee9de..90c7f0c4838c 100644 --- a/drivers/gpu/drm/i915/i915_utils.c +++ b/drivers/gpu/drm/i915/i915_utils.c @@ -114,3 +114,16 @@ void set_timer_ms(struct timer_list *t, unsigned long timeout) /* Keep t->expires = 0 reserved to indicate a canceled timer. */ mod_timer(t, jiffies + timeout ?: 1); } + +void fs_reclaim_taints_mutex(struct mutex *mutex) +{ + if (!IS_ENABLED(CONFIG_LOCKDEP)) + return; + + fs_reclaim_acquire(GFP_KERNEL); + + mutex_acquire(&mutex->dep_map, 0, 0, _RET_IP_); + mutex_release(&mutex->dep_map, _RET_IP_); + + fs_reclaim_release(GFP_KERNEL); +} diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index f02f52ab5070..4133d5193839 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -266,6 +266,8 @@ static inline int list_is_last_rcu(const struct list_head *list, return READ_ONCE(list->next) == head; } +void fs_reclaim_taints_mutex(struct mutex *mutex); + static inline unsigned long msecs_to_jiffies_timeout(const unsigned int m) { unsigned long j = msecs_to_jiffies(m); -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
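The point of the taint is to teach lockdep the ordering "fs-reclaim, then this mutex" once at init time, so that any later attempt to enter reclaim (for example via a GFP_KERNEL allocation) while holding the mutex is reported as a potential deadlock even on runs where reclaim never actually recurses into the driver. A rough userspace analogue with a hand-rolled two-lock order checker is sketched below; all names are illustrative and this is nothing like the real lockdep implementation, only the priming idea.

```c
#include <assert.h>
#include <stdbool.h>

enum { LOCK_RECLAIM, LOCK_MUTEX, NLOCKS };

/* edge[a][b] records "a was held while taking b". */
static bool edge[NLOCKS][NLOCKS];
static bool held[NLOCKS];

/* Returns false when taking l would close a cycle with a held lock,
 * i.e. an ordering inversion has been detected. */
static bool take(int l)
{
	for (int h = 0; h < NLOCKS; h++) {
		if (!held[h])
			continue;
		if (edge[l][h])
			return false;	/* l -> h already recorded: cycle */
		edge[h][l] = true;	/* learn the ordering h -> l */
	}
	held[l] = true;
	return true;
}

static void drop(int l) { held[l] = false; }

/* The priming trick from fs_reclaim_taints_mutex(): take and drop the
 * mutex inside a pretend reclaim context once at init, so the checker
 * learns the reclaim -> mutex edge without reclaim ever happening. */
static void taint_with_reclaim(int l)
{
	take(LOCK_RECLAIM);
	take(l);
	drop(l);
	drop(LOCK_RECLAIM);
}
```

With the edge primed, the very first "mutex held, then reclaim" sequence is flagged, which is exactly the determinism the fs_reclaim_acquire()/mutex_acquire() pair buys under CONFIG_LOCKDEP.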
* [PATCH 03/12] drm/i915: Lift marking a lock as used to utils 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Thomas Hellström, Tvrtko Ursulin, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> After calling lock_set_subclass() the lock _must_ be used, or else lockdep's internal nr_used_locks becomes unbalanced. Extract the little utility function to i915_utils.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +------------ drivers/gpu/drm/i915/i915_utils.c | 15 +++++++++++++++ drivers/gpu/drm/i915/i915_utils.h | 7 +++++++ 3 files changed, 23 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 3f9a811eb02b..15566819539f 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -794,18 +794,7 @@ intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass) spin_lock_init(&engine->active.lock); lockdep_set_subclass(&engine->active.lock, subclass); - - /* - * Due to an interesting quirk in lockdep's internal debug tracking, - * after setting a subclass we must ensure the lock is used. Otherwise, - * nr_unused_locks is incremented once too often. 
- */ -#ifdef CONFIG_DEBUG_LOCK_ALLOC - local_irq_disable(); - lock_map_acquire(&engine->active.lock.dep_map); - lock_map_release(&engine->active.lock.dep_map); - local_irq_enable(); -#endif + mark_lock_used_irq(&engine->active.lock); } static struct intel_context * diff --git a/drivers/gpu/drm/i915/i915_utils.c b/drivers/gpu/drm/i915/i915_utils.c index 90c7f0c4838c..894de60833ec 100644 --- a/drivers/gpu/drm/i915/i915_utils.c +++ b/drivers/gpu/drm/i915/i915_utils.c @@ -127,3 +127,18 @@ void fs_reclaim_taints_mutex(struct mutex *mutex) fs_reclaim_release(GFP_KERNEL); } + +#ifdef CONFIG_DEBUG_LOCK_ALLOC +void __mark_lock_used_irq(struct lockdep_map *lock) +{ + /* + * Due to an interesting quirk in lockdep's internal debug tracking, + * after setting a subclass we must ensure the lock is used. Otherwise, + * nr_unused_locks is incremented once too often. + */ + local_irq_disable(); + lock_map_acquire(lock); + lock_map_release(lock); + local_irq_enable(); +} +#endif diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index 4133d5193839..c3d234133da7 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -455,6 +455,13 @@ static inline bool timer_expired(const struct timer_list *t) return timer_active(t) && !timer_pending(t); } +#ifdef CONFIG_DEBUG_LOCK_ALLOC +void __mark_lock_used_irq(struct lockdep_map *lock); +#define mark_lock_used_irq(lock) __mark_lock_used_irq(&(lock)->dep_map) +#else +#define mark_lock_used_irq(lock) +#endif + /* * This is a lookalike for IS_ENABLED() that takes a kconfig value, * e.g. CONFIG_DRM_I915_SPIN_REQUEST, and evaluates whether it is non-zero -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 04/12] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Wrap cmpxchg64 with a try_cmpxchg()-esque helper. Hiding the old-value dance in the helper allows for cleaner code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/i915_utils.h | 32 +++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index c3d234133da7..a42d9ddd0415 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -475,4 +475,36 @@ void __mark_lock_used_irq(struct lockdep_map *lock); */ #define IS_ACTIVE(config) ((config) != 0) +#ifndef try_cmpxchg64 +#if IS_ENABLED(CONFIG_64BIT) +#define try_cmpxchg64(_ptr, _pold, _new) try_cmpxchg(_ptr, _pold, _new) +#else +#define try_cmpxchg64(_ptr, _pold, _new) \ +({ \ + __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \ + __typeof__(*(_ptr)) __old = *_old; \ + __typeof__(*(_ptr)) __cur = cmpxchg64(_ptr, __old, _new); \ + bool success = __cur == __old; \ + if (unlikely(!success)) \ + *_old = __cur; \ + likely(success); \ +}) +#endif +#endif + +#ifndef xchg64 +#if IS_ENABLED(CONFIG_64BIT) +#define xchg64(_ptr, _new) xchg(_ptr, _new) +#else +#define xchg64(_ptr, _new) \ +({ \ + __typeof__(_ptr) __ptr = (_ptr); \ + __typeof__(*(_ptr)) __old = *__ptr; \ + while (!try_cmpxchg64(__ptr, &__old, (_new))) \ + ; \ + __old; \ +}) +#endif +#endif + #endif /* !__I915_UTILS_H */ -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
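On 64-bit builds try_cmpxchg64() collapses to try_cmpxchg(); the interesting part is the 32-bit fallback, where a failed compare-exchange writes the value actually observed back through the old-value pointer, so the caller can retry without issuing a separate re-read. The same calling convention can be sketched in userspace with the GCC/Clang `__atomic` builtins; the function names below are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* try_cmpxchg-style CAS: on failure, *old is refreshed with the value
 * found in memory and false is returned, so retry loops need no extra
 * load.  __atomic_compare_exchange_n provides exactly this behaviour. */
static bool try_cmpxchg64_user(uint64_t *ptr, uint64_t *old, uint64_t newval)
{
	return __atomic_compare_exchange_n(ptr, old, newval, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* xchg64 emulated with a CAS loop, mirroring the 32-bit fallback in the
 * patch: keep retrying until the swap lands, then return the old value. */
static uint64_t xchg64_user(uint64_t *ptr, uint64_t newval)
{
	uint64_t old = *ptr;

	while (!try_cmpxchg64_user(ptr, &old, newval))
		;	/* old was refreshed by the failed CAS; just retry */
	return old;
}
```

The kernel macros additionally have to go through cmpxchg64() to get a 64-bit-atomic operation on 32-bit hardware, but the old-value dance they hide is the one shown here.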
* [PATCH 05/12] drm/i915/selftests: Set cache status for huge_gem_object 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Set the cache coherency and status using the set-coherency helper. Otherwise, we forget to mark the new pages as cache dirty. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index dadd485bc52f..33dd4e2a1010 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -171,10 +171,8 @@ huge_pages_object(struct drm_i915_private *i915, I915_BO_ALLOC_STRUCT_PAGE); i915_gem_object_set_volatile(obj); - - obj->write_domain = I915_GEM_DOMAIN_CPU; - obj->read_domains = I915_GEM_DOMAIN_CPU; - obj->cache_level = I915_CACHE_NONE; + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); + __start_cpu_write(obj); obj->mm.page_mask = page_mask; @@ -324,10 +322,8 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single) i915_gem_object_init(obj, &fake_ops, &lock_class, 0); i915_gem_object_set_volatile(obj); - - obj->write_domain = I915_GEM_DOMAIN_CPU; - obj->read_domains = I915_GEM_DOMAIN_CPU; - obj->cache_level = I915_CACHE_NONE; + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); + __start_cpu_write(obj); return obj; } @@ -1004,7 +1000,7 @@ __cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val) u32 *ptr = kmap_atomic(i915_gem_object_get_page(obj, n)); if 
(needs_flush & CLFLUSH_BEFORE) - drm_clflush_virt_range(ptr, PAGE_SIZE); + drm_clflush_virt_range(&ptr[dword], sizeof(val)); if (ptr[dword] != val) { pr_err("n=%lu ptr[%u]=%u, val=%u\n", -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 06/12] drm/i915/selftests: Use a coherent map to setup scratch batch buffers 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Instead of manipulating the object's cache domain, just use the device coherent map to write the batch buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- .../drm/i915/gem/selftests/i915_gem_context.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index ce70d0a3afb2..3d8d5f242e34 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -1622,7 +1622,7 @@ static int read_from_scratch(struct i915_gem_context *ctx, if (err) goto out_vm; - cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto out; @@ -1658,7 +1658,7 @@ static int read_from_scratch(struct i915_gem_context *ctx, if (err) goto out_vm; - cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto out; @@ -1707,15 +1707,17 @@ static int read_from_scratch(struct i915_gem_context *ctx, i915_vma_unpin(vma); + i915_request_get(rq); i915_request_add(rq); - i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_cpu_domain(obj, false); - i915_gem_object_unlock(obj); - if (err) + if (i915_request_wait(rq, 0, HZ / 5) < 0) { + i915_request_put(rq); + err = -ETIME; goto 
out_vm; + } + i915_request_put(rq); - cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto out_vm; -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
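The replacement pattern in this patch is: take an extra reference on the request before submitting it (submission consumes the caller's reference), wait on it with a finite timeout, and drop the extra reference on every exit path. Below is a toy single-threaded C model of that reference-counting discipline; all names (`request_get`, `request_wait`, the budget-based wait) are illustrative stand-ins, not the i915 API, and the "timeout" is modeled as a simple iteration budget.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a submitted request: a refcount and a completion flag. */
struct request {
	int refcount;
	bool done;
};

static void request_get(struct request *rq) { rq->refcount++; }
static void request_put(struct request *rq) { rq->refcount--; }

/* Returns 0 on completion, -1 if the wait budget expires first. */
static int request_wait(struct request *rq, int budget)
{
	while (budget-- > 0) {
		if (rq->done)
			return 0;
	}
	return rq->done ? 0 : -1;
}

/* The shape used in the patch: get, submit, bounded wait, put. */
static int submit_and_wait(struct request *rq)
{
	int err = 0;

	request_get(rq);	/* keep rq alive past submission */
	/* i915_request_add() would hand off the submit reference here */
	if (request_wait(rq, 1000) < 0)
		err = -1;	/* -ETIME in the real code */
	request_put(rq);	/* balanced on both success and timeout */
	return err;
}
```

The key property, which the selftest relies on, is that the wait can fail without leaking: the reference is dropped on the timeout path as well as the success path, and the caller gets a bounded -ETIME instead of blocking forever in set-to-cpu-domain.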
* [PATCH 07/12] drm/i915/selftests: Replace the unbounded set-domain with an explicit wait 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> After running client_blt, we flush the object by changing its domain. This causes us to wait forever instead of a bounded wait suitable for the selftest timeout. So do an explicit wait with a suitable timeout -- which in turn means we have to limit the size of the object/blit to run within reason. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- .../i915/gem/selftests/i915_gem_client_blt.c | 26 ++++++++++++++----- 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c index d36873885cc1..baec7bd1fa53 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c @@ -23,12 +23,19 @@ static int __igt_client_fill(struct intel_engine_cs *engine) I915_RND_STATE(prng); IGT_TIMEOUT(end); u32 *vaddr; + u64 limit; int err = 0; + /* Try to keep the blits within the timeout */ + limit = min_t(u64, ce->vm->total >> 4, + jiffies_to_msecs(i915_selftest.timeout_jiffies) * SZ_2M); + if (!limit) + limit = SZ_4K; + intel_engine_pm_get(engine); do { const u32 max_block_size = S16_MAX * PAGE_SIZE; - u32 sz = min_t(u64, ce->vm->total >> 4, prandom_u32_state(&prng)); + u32 sz = min_t(u64, limit, prandom_u32_state(&prng)); u32 phys_sz = sz % (max_block_size + 1); u32 val = prandom_u32_state(&prng); u32 i; @@ -73,13 +80,20 @@ static int __igt_client_fill(struct intel_engine_cs *engine)
if (err) goto err_unpin; - i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_cpu_domain(obj, false); - i915_gem_object_unlock(obj); - if (err) + err = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE, + 2 * i915_selftest.timeout_jiffies); + if (err) { + pr_err("%s fill %zxB timed out\n", + engine->name, obj->base.size); goto err_unpin; + } - for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); ++i) { + for (i = 0; + i < huge_gem_object_phys_size(obj) / sizeof(u32); + i += 17) { + if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ)) + clflush(&vaddr[i]); if (vaddr[i] != val) { pr_err("vaddr[%u]=%x, expected=%x\n", i, vaddr[i], val); -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
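The size clamp this patch adds can be read as: bound the blit by a sixteenth of the address space, further bound it by an assumed blit throughput times the selftest timeout, and floor it at one page so the test always makes progress. Here is a small standalone C model of that arithmetic; the ~2MiB-per-millisecond figure mirrors the `timeout_ms * SZ_2M` term in the diff and is an assumed throughput, not a measured one.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K (4096ull)
#define SZ_2M (2ull << 20)

/*
 * Model of the limit computation in __igt_client_fill(): keep the blit
 * small enough to fit in a fraction of the GTT and to finish within
 * the selftest timeout, assuming roughly SZ_2M blitted per millisecond.
 */
static uint64_t blit_size_limit(uint64_t vm_total, uint64_t timeout_ms)
{
	uint64_t limit = vm_total >> 4;

	if (timeout_ms * SZ_2M < limit)
		limit = timeout_ms * SZ_2M;
	if (!limit)
		limit = SZ_4K;	/* always make some forward progress */

	return limit;
}
```

The random size drawn from `prandom_u32_state()` is then capped by this limit, so a short timeout on a slow machine shrinks the blits rather than letting the wait in the loop overrun the selftest budget.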
* [PATCH 08/12] drm/i915/selftests: Remove redundant set-to-gtt-domain 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Since the vma's backing store is flushed upon first creation, remove the manual calls to set-to-gtt-domain. v2 (Tvrtko): * Rebase. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- .../gpu/drm/i915/gem/selftests/i915_gem_mman.c | 16 ---------------- drivers/gpu/drm/i915/selftests/i915_vma.c | 6 ------ 2 files changed, 22 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 05a3b29f545e..886e2446e081 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -104,14 +104,6 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj, GEM_BUG_ON(i915_gem_object_get_tiling(obj) != tile->tiling); GEM_BUG_ON(i915_gem_object_get_stride(obj) != tile->stride); - i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_gtt_domain(obj, true); - i915_gem_object_unlock(obj); - if (err) { - pr_err("Failed to flush to GTT write domain; err=%d\n", err); - return err; - } - page = i915_prandom_u32_max_state(npages, prng); view = compute_partial_view(obj, page, MIN_CHUNK_PAGES); @@ -189,14 +181,6 @@ static int check_partial_mappings(struct drm_i915_gem_object *obj, GEM_BUG_ON(i915_gem_object_get_tiling(obj) != tile->tiling); GEM_BUG_ON(i915_gem_object_get_stride(obj) != tile->stride); - i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_gtt_domain(obj, true); - i915_gem_object_unlock(obj); - if (err) { - 
pr_err("Failed to flush to GTT write domain; err=%d\n", err); - return err; - } - for_each_prime_number_from(page, 1, npages) { struct i915_ggtt_view view = compute_partial_view(obj, page, MIN_CHUNK_PAGES); diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c index dd0607254a95..24a806801883 100644 --- a/drivers/gpu/drm/i915/selftests/i915_vma.c +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c @@ -987,12 +987,6 @@ static int igt_vma_remapped_gtt(void *arg) u32 __iomem *map; unsigned int x, y; - i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_gtt_domain(obj, true); - i915_gem_object_unlock(obj); - if (err) - goto out; - if (!plane_info[0].dst_stride) plane_info[0].dst_stride = *t == I915_GGTT_VIEW_ROTATED ? p->height : p->width; -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 09/12] drm/i915/selftests: Replace unbound set-domain waits with explicit timeouts 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Let's prefer to use explicit request tracking and bounded timeouts in our selftests. v2 (Tvrtko): * Rebase. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- .../gpu/drm/i915/gt/selftest_workarounds.c | 107 +++++++----------- 1 file changed, 40 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 64937ec3f2dc..72553a56b225 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -93,56 +93,27 @@ reference_lists_fini(struct intel_gt *gt, struct wa_lists *lists) intel_wa_list_free(&lists->gt_wa_list); } -static struct drm_i915_gem_object * -read_nonprivs(struct intel_context *ce) +static struct i915_request * +read_nonprivs(struct intel_context *ce, struct i915_vma *result) { struct intel_engine_cs *engine = ce->engine; const u32 base = engine->mmio_base; - struct drm_i915_gem_object *result; struct i915_request *rq; - struct i915_vma *vma; u32 srm, *cs; int err; int i; - result = i915_gem_object_create_internal(engine->i915, PAGE_SIZE); - if (IS_ERR(result)) - return result; - - i915_gem_object_set_cache_coherency(result, I915_CACHE_LLC); - - cs = i915_gem_object_pin_map_unlocked(result, I915_MAP_WB); - if (IS_ERR(cs)) { - err = PTR_ERR(cs); - goto err_obj; - } - memset(cs, 0xc5, PAGE_SIZE); - i915_gem_object_flush_map(result); - i915_gem_object_unpin_map(result); - - vma = i915_vma_instance(result, 
&engine->gt->ggtt->vm, NULL); - if (IS_ERR(vma)) { - err = PTR_ERR(vma); - goto err_obj; - } - - err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL); - if (err) - goto err_obj; - rq = intel_context_create_request(ce); - if (IS_ERR(rq)) { - err = PTR_ERR(rq); - goto err_pin; - } + if (IS_ERR(rq)) + return rq; - i915_vma_lock(vma); - err = i915_request_await_object(rq, vma->obj, true); + i915_vma_lock(result); + err = i915_request_await_object(rq, result->obj, true); if (err == 0) - err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); - i915_vma_unlock(vma); + err = i915_vma_move_to_active(result, rq, EXEC_OBJECT_WRITE); + i915_vma_unlock(result); if (err) - goto err_req; + goto err_rq; srm = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT; if (INTEL_GEN(engine->i915) >= 8) @@ -151,28 +122,24 @@ read_nonprivs(struct intel_context *ce) cs = intel_ring_begin(rq, 4 * RING_MAX_NONPRIV_SLOTS); if (IS_ERR(cs)) { err = PTR_ERR(cs); - goto err_req; + goto err_rq; } for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++) { *cs++ = srm; *cs++ = i915_mmio_reg_offset(RING_FORCE_TO_NONPRIV(base, i)); - *cs++ = i915_ggtt_offset(vma) + sizeof(u32) * i; + *cs++ = i915_ggtt_offset(result) + sizeof(u32) * i; *cs++ = 0; } intel_ring_advance(rq, cs); + i915_request_get(rq); i915_request_add(rq); - i915_vma_unpin(vma); - return result; + return rq; -err_req: +err_rq: i915_request_add(rq); -err_pin: - i915_vma_unpin(vma); -err_obj: - i915_gem_object_put(result); return ERR_PTR(err); } @@ -203,32 +170,36 @@ print_results(const struct intel_engine_cs *engine, const u32 *results) static int check_whitelist(struct intel_context *ce) { struct intel_engine_cs *engine = ce->engine; - struct drm_i915_gem_object *results; - struct intel_wedge_me wedge; + struct i915_vma *result; + struct i915_request *rq; + int err = 0; u32 *vaddr; - int err; int i; - results = read_nonprivs(ce); - if (IS_ERR(results)) - return PTR_ERR(results); - - err = 0; - i915_gem_object_lock(results, NULL); - intel_wedge_on_timeout(&wedge, 
engine->gt, HZ / 5) /* safety net! */ - err = i915_gem_object_set_to_cpu_domain(results, false); - - if (intel_gt_is_wedged(engine->gt)) - err = -EIO; - if (err) - goto out_put; + result = __vm_create_scratch_for_read(&engine->gt->ggtt->vm, PAGE_SIZE); + if (IS_ERR(result)) + return PTR_ERR(result); - vaddr = i915_gem_object_pin_map(results, I915_MAP_WB); + vaddr = i915_gem_object_pin_map(result->obj, I915_MAP_WB); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto out_put; } + memset(vaddr, 0xc5, PAGE_SIZE); + i915_gem_object_flush_map(result->obj); + + rq = read_nonprivs(ce, result); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto out_map; + } + + if (i915_request_wait(rq, 0, HZ / 5) < 0) { + err = -EIO; + goto out_rq; + } + for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++) { u32 expected = get_whitelist_reg(engine, i); u32 actual = vaddr[i]; @@ -243,10 +214,12 @@ static int check_whitelist(struct intel_context *ce) } } - i915_gem_object_unpin_map(results); +out_rq: + i915_request_put(rq); +out_map: + i915_gem_object_unpin_map(result->obj); out_put: - i915_gem_object_unlock(results); - i915_gem_object_put(results); + i915_vma_put(result); return err; } -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 10/12] drm/i915/selftests: Replace an unbounded set-domain wait with a timeout 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> After the memory-region test completes, it flushes the test by calling set-to-cpu-domain. Use igt_flush_test as it includes a timeout, recovery and reports an error for miscreant tests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/selftests/intel_memory_region.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c index f85fd8cbfbf5..7a3f71e83140 100644 --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c @@ -826,11 +826,10 @@ static int igt_lmem_write_cpu(void *arg) if (err) goto out_unpin; - i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_wc_domain(obj, true); - i915_gem_object_unlock(obj); - if (err) + if (igt_flush_test(engine->i915)) { + err = -EIO; goto out_unpin; + } count = ARRAY_SIZE(bytes); order = i915_random_order(count * count, &prng); -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 11/12] drm/i915/selftests: Remove redundant set-to-gtt-domain before batch submission 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> In constructing the rpcs_query batch we know that it is device coherent and ready for execution, so the set-to-gtt-domain here is redundant. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index 3d8d5f242e34..eed5be597eee 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -954,8 +954,6 @@ emit_rpcs_query(struct drm_i915_gem_object *obj, err = i915_gem_object_lock(obj, &ww); if (!err) err = i915_gem_object_lock(rpcs, &ww); - if (!err) - err = i915_gem_object_set_to_gtt_domain(obj, false); if (!err) err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER); if (err) -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 12/12] drm/i915/gem: Manage all set-domain waits explicitly 2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin @ 2021-05-26 14:14 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:14 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Only perform the domain transition under the object lock, and push the required waits to outside the lock. v2 (Tvrtko): * Rebase. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 9 +- drivers/gpu/drm/i915/gem/i915_gem_clflush.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 163 +++++------------- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 12 +- .../gpu/drm/i915/gem/i915_gem_object_types.h | 6 + .../gpu/drm/i915/gem/selftests/huge_pages.c | 8 - .../i915/gem/selftests/i915_gem_coherency.c | 31 +++- .../drm/i915/gem/selftests/i915_gem_phys.c | 8 +- .../drm/i915/gem/selftests/igt_gem_utils.c | 3 + drivers/gpu/drm/i915/i915_gem.c | 4 +- 12 files changed, 89 insertions(+), 165 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c index daf9284ef1f5..e4c24558eaa8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -51,8 +51,6 @@ static struct clflush *clflush_work_create(struct drm_i915_gem_object *obj) { struct clflush *clflush; - GEM_BUG_ON(!obj->cache_dirty); - clflush = kmalloc(sizeof(*clflush), GFP_KERNEL); if (!clflush) return NULL; @@ -101,13 +99,10 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, trace_i915_gem_object_clflush(obj); - clflush = NULL; - if (!(flags & I915_CLFLUSH_SYNC)) - 
clflush = clflush_work_create(obj); + clflush = clflush_work_create(obj); if (clflush) { i915_sw_fence_await_reservation(&clflush->base.chain, - obj->base.resv, NULL, true, - i915_fence_timeout(to_i915(obj->base.dev)), + obj->base.resv, NULL, true, 0, I915_FENCE_GFP); dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma); dma_fence_work_commit(&clflush->base); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h index e6c382973129..4cd5787d1507 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h @@ -9,12 +9,10 @@ #include <linux/types.h> -struct drm_i915_private; struct drm_i915_gem_object; bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, unsigned int flags); #define I915_CLFLUSH_FORCE BIT(0) -#define I915_CLFLUSH_SYNC BIT(1) #endif /* __I915_GEM_CLFLUSH_H__ */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index ccede73c6465..0926e0895ee6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -132,7 +132,7 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire if (!err) err = i915_gem_object_pin_pages(obj); if (!err) { - err = i915_gem_object_set_to_cpu_domain(obj, write); + i915_gem_object_set_to_cpu_domain(obj, write); i915_gem_object_unpin_pages(obj); } if (err == -EDEADLK) { @@ -156,7 +156,7 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct if (!err) err = i915_gem_object_pin_pages(obj); if (!err) { - err = i915_gem_object_set_to_gtt_domain(obj, false); + i915_gem_object_set_to_gtt_domain(obj, false); i915_gem_object_unpin_pages(obj); } if (err == -EDEADLK) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 073822100da7..39fda97c49a7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -49,7 +49,7 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains) break; case I915_GEM_DOMAIN_CPU: - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); + i915_gem_clflush_object(obj, 0); break; case I915_GEM_DOMAIN_RENDER: @@ -97,34 +97,13 @@ void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj) * This function returns when the move is complete, including waiting on * flushes to occur. */ -int +void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) { - int ret; - assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - if (obj->write_domain == I915_GEM_DOMAIN_WC) - return 0; - - /* Flush and acquire obj->pages so that we are coherent through - * direct access in memory with previous cached writes through - * shmemfs and that our cache domain tracking remains valid. - * For example, if the obj->filp was moved to swap without us - * being notified and releasing the pages, we would mistakenly - * continue to assume that the obj remained out of the CPU cached - * domain. - */ - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; + return; flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); @@ -145,9 +124,6 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) obj->write_domain = I915_GEM_DOMAIN_WC; obj->mm.dirty = true; } - - i915_gem_object_unpin_pages(obj); - return 0; } /** @@ -158,34 +134,13 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) * This function returns when the move is complete, including waiting on * flushes to occur. */ -int +void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) { - int ret; - assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - (write ? 
I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - if (obj->write_domain == I915_GEM_DOMAIN_GTT) - return 0; - - /* Flush and acquire obj->pages so that we are coherent through - * direct access in memory with previous cached writes through - * shmemfs and that our cache domain tracking remains valid. - * For example, if the obj->filp was moved to swap without us - * being notified and releasing the pages, we would mistakenly - * continue to assume that the obj remained out of the CPU cached - * domain. - */ - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; + return; flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); @@ -214,9 +169,6 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) i915_vma_set_ggtt_write(vma); spin_unlock(&obj->vma.lock); } - - i915_gem_object_unpin_pages(obj); - return 0; } /** @@ -431,25 +383,23 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * This function returns when the move is complete, including waiting on * flushes to occur. */ -int +void i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) { - int ret; - assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); /* Flush the CPU cache if it's still invalid. */ if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) { - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); + /* + * While we track when we write though the CPU cache + * (with obj->cache_dirty), this is only a guide as we do + * not know when the CPU may have speculatively populated + * the cache. We have to invalidate such speculative cachelines + * prior to reading writes by the GPU. 
+ */ + i915_gem_clflush_object(obj, 0); obj->read_domains |= I915_GEM_DOMAIN_CPU; } @@ -463,8 +413,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) */ if (write) __start_cpu_write(obj); - - return 0; } /** @@ -502,32 +450,14 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, if (!obj) return -ENOENT; - /* - * Try to flush the object off the GPU without holding the lock. - * We will repeat the flush holding the lock in the normal manner - * to catch cases where we are gazumped. - */ - err = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_PRIORITY | - (write_domain ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (err) - goto out; - if (i915_gem_object_is_userptr(obj)) { /* * Try to grab userptr pages, iris uses set_domain to check * userptr validity */ err = i915_gem_object_userptr_validate(obj); - if (!err) - err = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_PRIORITY | - (write_domain ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - goto out; + if (err) + goto out; } /* @@ -572,11 +502,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, goto out_unpin; if (read_domains & I915_GEM_DOMAIN_WC) - err = i915_gem_object_set_to_wc_domain(obj, write_domain); + i915_gem_object_set_to_wc_domain(obj, write_domain); else if (read_domains & I915_GEM_DOMAIN_GTT) - err = i915_gem_object_set_to_gtt_domain(obj, write_domain); + i915_gem_object_set_to_gtt_domain(obj, write_domain); else - err = i915_gem_object_set_to_cpu_domain(obj, write_domain); + i915_gem_object_set_to_cpu_domain(obj, write_domain); out_unpin: i915_gem_object_unpin_pages(obj); @@ -584,6 +514,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, out_unlock: i915_gem_object_unlock(obj); + err = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_PRIORITY | + (write_domain ? 
I915_WAIT_ALL : 0), + MAX_SCHEDULE_TIMEOUT); if (!err && write_domain) i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); @@ -608,26 +543,21 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - ret = i915_gem_object_pin_pages(obj); if (ret) return ret; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, false); - if (ret) - goto err_unpin; - else - goto out; - } + !static_cpu_has(X86_FEATURE_CLFLUSH)) + i915_gem_object_set_to_cpu_domain(obj, false); + else + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT); + if (ret) + goto err_unpin; /* If we're not in the cpu read domain, set ourself into the gtt * read domain and manually flush cachelines (if required). 
This @@ -638,7 +568,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, !(obj->read_domains & I915_GEM_DOMAIN_CPU)) *needs_clflush = CLFLUSH_BEFORE; -out: /* return with the pages pinned */ return 0; @@ -658,27 +587,22 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_ALL, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - ret = i915_gem_object_pin_pages(obj); if (ret) return ret; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, true); - if (ret) - goto err_unpin; - else - goto out; - } + !static_cpu_has(X86_FEATURE_CLFLUSH)) + i915_gem_object_set_to_cpu_domain(obj, true); + else + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT); + if (ret) + goto err_unpin; /* If we're not in the cpu write domain, set ourself into the * gtt write domain and manually flush cachelines (as required). 
@@ -696,7 +620,6 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, *needs_clflush |= CLFLUSH_BEFORE; } -out: i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); obj->mm.dirty = true; /* return with the pages pinned */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 297143511f99..40fda9e81a78 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1212,9 +1212,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, if (use_cpu_reloc(cache, obj)) return NULL; - err = i915_gem_object_set_to_gtt_domain(obj, true); - if (err) - return ERR_PTR(err); + i915_gem_object_set_to_gtt_domain(obj, true); vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0, PIN_MAPPABLE | diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 2ebd79537aea..8bbc835e70ce 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -515,12 +515,12 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj); void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj); -int __must_check -i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check -i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check -i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); +void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, + bool write); +void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, + bool write); +void i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, + bool write); struct i915_vma * __must_check i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, struct 
i915_gem_ww_ctx *ww, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 0727d0c76aa0..b8f0413bc3b0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -188,6 +188,12 @@ struct drm_i915_gem_object { unsigned int cache_coherent:2; #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0) #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1) + /* + * Note cache_dirty is only a guide; we know when we have written + * through the CPU cache, but we do not know when the CPU may have + * speculatively populated the cache. Before a read via the cache + * of GPU written memory, we have to cautiously invalidate the cache. + */ unsigned int cache_dirty:1; /** diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index 33dd4e2a1010..d85ca79ac433 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -972,14 +972,6 @@ static int gpu_write(struct intel_context *ce, u32 dw, u32 val) { - int err; - - i915_gem_object_lock(vma->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(vma->obj, true); - i915_gem_object_unlock(vma->obj); - if (err) - return err; - return igt_gpu_fill_dw(ce, vma, dw * sizeof(u32), vma->size >> PAGE_SHIFT, val); } diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index e937b6629019..77ba6d1ef4e4 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -90,8 +90,13 @@ static int gtt_set(struct context *ctx, unsigned long offset, u32 v) int err = 0; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); + i915_gem_object_set_to_gtt_domain(ctx->obj, true); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + 
I915_WAIT_ALL | + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -123,8 +128,12 @@ static int gtt_get(struct context *ctx, unsigned long offset, u32 *v) int err = 0; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(ctx->obj, false); + i915_gem_object_set_to_gtt_domain(ctx->obj, false); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -155,8 +164,13 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v) int err; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_wc_domain(ctx->obj, true); + i915_gem_object_set_to_wc_domain(ctx->obj, true); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + I915_WAIT_ALL | + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -178,8 +192,12 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v) int err; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_wc_domain(ctx->obj, false); + i915_gem_object_set_to_wc_domain(ctx->obj, false); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -205,9 +223,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) return PTR_ERR(vma); i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); - if (err) - goto out_unlock; + i915_gem_object_set_to_gtt_domain(ctx->obj, false); rq = intel_engine_create_kernel_request(ctx->engine); if (IS_ERR(rq)) { @@ -247,7 +263,6 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) i915_request_add(rq); out_unpin: i915_vma_unpin(vma); -out_unlock: i915_gem_object_unlock(ctx->obj); return err; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c index 3a6ce87f8b52..4d7580762acc 100644 --- 
a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c @@ -53,14 +53,10 @@ static int mock_phys_object(void *arg) /* Make the object dirty so that put_pages must do copy back the data */ i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_gtt_domain(obj, true); + i915_gem_object_set_to_gtt_domain(obj, true); i915_gem_object_unlock(obj); - if (err) { - pr_err("i915_gem_object_set_to_gtt_domain failed with err=%d\n", - err); - goto out_obj; - } + err = 0; out_obj: i915_gem_object_put(obj); out: diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c index 0b092c62bb34..ba8c06778b6c 100644 --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c @@ -7,6 +7,7 @@ #include "igt_gem_utils.h" #include "gem/i915_gem_context.h" +#include "gem/i915_gem_clflush.h" #include "gem/i915_gem_pm.h" #include "gt/intel_context.h" #include "gt/intel_gpu_commands.h" @@ -138,6 +139,8 @@ int igt_gpu_fill_dw(struct intel_context *ce, goto skip_request; i915_vma_lock(vma); + if (vma->obj->cache_dirty & ~vma->obj->cache_coherent) + i915_gem_clflush_object(vma->obj, 0); err = i915_request_await_object(rq, vma->obj, true); if (err == 0) err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index cffd7f4f87dc..dbb983970f34 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -301,9 +301,7 @@ static struct i915_vma *i915_gem_gtt_prepare(struct drm_i915_gem_object *obj, if (ret) goto err_ww; - ret = i915_gem_object_set_to_gtt_domain(obj, write); - if (ret) - goto err_ww; + i915_gem_object_set_to_gtt_domain(obj, write); if (!i915_gem_object_is_tiled(obj)) vma = i915_gem_object_ggtt_pin_ww(obj, &ww, NULL, 0, 0, -- 2.30.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
This @@ -638,7 +568,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, !(obj->read_domains & I915_GEM_DOMAIN_CPU)) *needs_clflush = CLFLUSH_BEFORE; -out: /* return with the pages pinned */ return 0; @@ -658,27 +587,22 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_ALL, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - ret = i915_gem_object_pin_pages(obj); if (ret) return ret; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, true); - if (ret) - goto err_unpin; - else - goto out; - } + !static_cpu_has(X86_FEATURE_CLFLUSH)) + i915_gem_object_set_to_cpu_domain(obj, true); + else + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT); + if (ret) + goto err_unpin; /* If we're not in the cpu write domain, set ourself into the * gtt write domain and manually flush cachelines (as required). 
@@ -696,7 +620,6 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, *needs_clflush |= CLFLUSH_BEFORE; } -out: i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); obj->mm.dirty = true; /* return with the pages pinned */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 297143511f99..40fda9e81a78 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1212,9 +1212,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, if (use_cpu_reloc(cache, obj)) return NULL; - err = i915_gem_object_set_to_gtt_domain(obj, true); - if (err) - return ERR_PTR(err); + i915_gem_object_set_to_gtt_domain(obj, true); vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0, PIN_MAPPABLE | diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 2ebd79537aea..8bbc835e70ce 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -515,12 +515,12 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj); void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj); -int __must_check -i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check -i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check -i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); +void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, + bool write); +void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, + bool write); +void i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, + bool write); struct i915_vma * __must_check i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, struct 
i915_gem_ww_ctx *ww, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 0727d0c76aa0..b8f0413bc3b0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -188,6 +188,12 @@ struct drm_i915_gem_object { unsigned int cache_coherent:2; #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0) #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1) + /* + * Note cache_dirty is only a guide; we know when we have written + * through the CPU cache, but we do not know when the CPU may have + * speculatively populated the cache. Before a read via the cache + * of GPU written memory, we have to cautiously invalidate the cache. + */ unsigned int cache_dirty:1; /** diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index 33dd4e2a1010..d85ca79ac433 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -972,14 +972,6 @@ static int gpu_write(struct intel_context *ce, u32 dw, u32 val) { - int err; - - i915_gem_object_lock(vma->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(vma->obj, true); - i915_gem_object_unlock(vma->obj); - if (err) - return err; - return igt_gpu_fill_dw(ce, vma, dw * sizeof(u32), vma->size >> PAGE_SHIFT, val); } diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index e937b6629019..77ba6d1ef4e4 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -90,8 +90,13 @@ static int gtt_set(struct context *ctx, unsigned long offset, u32 v) int err = 0; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); + i915_gem_object_set_to_gtt_domain(ctx->obj, true); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + 
I915_WAIT_ALL | + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -123,8 +128,12 @@ static int gtt_get(struct context *ctx, unsigned long offset, u32 *v) int err = 0; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(ctx->obj, false); + i915_gem_object_set_to_gtt_domain(ctx->obj, false); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -155,8 +164,13 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v) int err; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_wc_domain(ctx->obj, true); + i915_gem_object_set_to_wc_domain(ctx->obj, true); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + I915_WAIT_ALL | + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -178,8 +192,12 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v) int err; i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_wc_domain(ctx->obj, false); + i915_gem_object_set_to_wc_domain(ctx->obj, false); i915_gem_object_unlock(ctx->obj); + + err = i915_gem_object_wait(ctx->obj, + I915_WAIT_INTERRUPTIBLE, + HZ / 2); if (err) return err; @@ -205,9 +223,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) return PTR_ERR(vma); i915_gem_object_lock(ctx->obj, NULL); - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); - if (err) - goto out_unlock; + i915_gem_object_set_to_gtt_domain(ctx->obj, false); rq = intel_engine_create_kernel_request(ctx->engine); if (IS_ERR(rq)) { @@ -247,7 +263,6 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) i915_request_add(rq); out_unpin: i915_vma_unpin(vma); -out_unlock: i915_gem_object_unlock(ctx->obj); return err; diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c index 3a6ce87f8b52..4d7580762acc 100644 --- 
a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c @@ -53,14 +53,10 @@ static int mock_phys_object(void *arg) /* Make the object dirty so that put_pages must do copy back the data */ i915_gem_object_lock(obj, NULL); - err = i915_gem_object_set_to_gtt_domain(obj, true); + i915_gem_object_set_to_gtt_domain(obj, true); i915_gem_object_unlock(obj); - if (err) { - pr_err("i915_gem_object_set_to_gtt_domain failed with err=%d\n", - err); - goto out_obj; - } + err = 0; out_obj: i915_gem_object_put(obj); out: diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c index 0b092c62bb34..ba8c06778b6c 100644 --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c @@ -7,6 +7,7 @@ #include "igt_gem_utils.h" #include "gem/i915_gem_context.h" +#include "gem/i915_gem_clflush.h" #include "gem/i915_gem_pm.h" #include "gt/intel_context.h" #include "gt/intel_gpu_commands.h" @@ -138,6 +139,8 @@ int igt_gpu_fill_dw(struct intel_context *ce, goto skip_request; i915_vma_lock(vma); + if (vma->obj->cache_dirty & ~vma->obj->cache_coherent) + i915_gem_clflush_object(vma->obj, 0); err = i915_request_await_object(rq, vma->obj, true); if (err == 0) err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index cffd7f4f87dc..dbb983970f34 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -301,9 +301,7 @@ static struct i915_vma *i915_gem_gtt_prepare(struct drm_i915_gem_object *obj, if (ret) goto err_ww; - ret = i915_gem_object_set_to_gtt_domain(obj, write); - if (ret) - goto err_ww; + i915_gem_object_set_to_gtt_domain(obj, write); if (!i915_gem_object_is_tiled(obj)) vma = i915_gem_object_ggtt_pin_ww(obj, &ww, NULL, 0, 0, -- 2.30.2 _______________________________________________ Intel-gfx mailing list 
Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 12/12] drm/i915/gem: Manage all set-domain waits explicitly
  2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin
@ 2021-05-26 14:30   ` Matthew Auld
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Auld @ 2021-05-26 14:30 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Tvrtko Ursulin, dri-devel, Chris Wilson

On 26/05/2021 15:14, Tvrtko Ursulin wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> Only perform the domain transition under the object lock, and push the
> required waits to outside the lock.
>
> v2 (Tvrtko):
>  * Rebase.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

[quoted patch trimmed to the hunk under discussion]

> @@ -205,9 +223,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v)
>  		return PTR_ERR(vma);
>
>  	i915_gem_object_lock(ctx->obj, NULL);
> -	err = i915_gem_object_set_to_gtt_domain(ctx->obj, true);
> -	if (err)
> -		goto out_unlock;
> +	i915_gem_object_set_to_gtt_domain(ctx->obj, false);

IIRC Daniel pointed out that this looks odd, since this now becomes
write=false for some reason. I think keep this as write=true, since it
does look like that is what gpu_set wants.

^ permalink raw reply	[flat|nested] 38+ messages in thread
> - */ > - ret = i915_gem_object_pin_pages(obj); > - if (ret) > - return ret; > + return; > > flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); > > @@ -145,9 +124,6 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) > obj->write_domain = I915_GEM_DOMAIN_WC; > obj->mm.dirty = true; > } > - > - i915_gem_object_unpin_pages(obj); > - return 0; > } > > /** > @@ -158,34 +134,13 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) > * This function returns when the move is complete, including waiting on > * flushes to occur. > */ > -int > +void > i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > { > - int ret; > - > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > - I915_WAIT_INTERRUPTIBLE | > - (write ? I915_WAIT_ALL : 0), > - MAX_SCHEDULE_TIMEOUT); > - if (ret) > - return ret; > - > if (obj->write_domain == I915_GEM_DOMAIN_GTT) > - return 0; > - > - /* Flush and acquire obj->pages so that we are coherent through > - * direct access in memory with previous cached writes through > - * shmemfs and that our cache domain tracking remains valid. > - * For example, if the obj->filp was moved to swap without us > - * being notified and releasing the pages, we would mistakenly > - * continue to assume that the obj remained out of the CPU cached > - * domain. > - */ > - ret = i915_gem_object_pin_pages(obj); > - if (ret) > - return ret; > + return; > > flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); > > @@ -214,9 +169,6 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > i915_vma_set_ggtt_write(vma); > spin_unlock(&obj->vma.lock); > } > - > - i915_gem_object_unpin_pages(obj); > - return 0; > } > > /** > @@ -431,25 +383,23 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > * This function returns when the move is complete, including waiting on > * flushes to occur. 
> */ > -int > +void > i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) > { > - int ret; > - > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > - I915_WAIT_INTERRUPTIBLE | > - (write ? I915_WAIT_ALL : 0), > - MAX_SCHEDULE_TIMEOUT); > - if (ret) > - return ret; > - > flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > /* Flush the CPU cache if it's still invalid. */ > if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) { > - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); > + /* > + * While we track when we write though the CPU cache > + * (with obj->cache_dirty), this is only a guide as we do > + * not know when the CPU may have speculatively populated > + * the cache. We have to invalidate such speculative cachelines > + * prior to reading writes by the GPU. > + */ > + i915_gem_clflush_object(obj, 0); > obj->read_domains |= I915_GEM_DOMAIN_CPU; > } > > @@ -463,8 +413,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) > */ > if (write) > __start_cpu_write(obj); > - > - return 0; > } > > /** > @@ -502,32 +450,14 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, > if (!obj) > return -ENOENT; > > - /* > - * Try to flush the object off the GPU without holding the lock. > - * We will repeat the flush holding the lock in the normal manner > - * to catch cases where we are gazumped. > - */ > - err = i915_gem_object_wait(obj, > - I915_WAIT_INTERRUPTIBLE | > - I915_WAIT_PRIORITY | > - (write_domain ? I915_WAIT_ALL : 0), > - MAX_SCHEDULE_TIMEOUT); > - if (err) > - goto out; > - > if (i915_gem_object_is_userptr(obj)) { > /* > * Try to grab userptr pages, iris uses set_domain to check > * userptr validity > */ > err = i915_gem_object_userptr_validate(obj); > - if (!err) > - err = i915_gem_object_wait(obj, > - I915_WAIT_INTERRUPTIBLE | > - I915_WAIT_PRIORITY | > - (write_domain ? 
I915_WAIT_ALL : 0), > - MAX_SCHEDULE_TIMEOUT); > - goto out; > + if (err) > + goto out; > } > > /* > @@ -572,11 +502,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, > goto out_unpin; > > if (read_domains & I915_GEM_DOMAIN_WC) > - err = i915_gem_object_set_to_wc_domain(obj, write_domain); > + i915_gem_object_set_to_wc_domain(obj, write_domain); > else if (read_domains & I915_GEM_DOMAIN_GTT) > - err = i915_gem_object_set_to_gtt_domain(obj, write_domain); > + i915_gem_object_set_to_gtt_domain(obj, write_domain); > else > - err = i915_gem_object_set_to_cpu_domain(obj, write_domain); > + i915_gem_object_set_to_cpu_domain(obj, write_domain); > > out_unpin: > i915_gem_object_unpin_pages(obj); > @@ -584,6 +514,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, > out_unlock: > i915_gem_object_unlock(obj); > > + err = i915_gem_object_wait(obj, > + I915_WAIT_INTERRUPTIBLE | > + I915_WAIT_PRIORITY | > + (write_domain ? I915_WAIT_ALL : 0), > + MAX_SCHEDULE_TIMEOUT); > if (!err && write_domain) > i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); > > @@ -608,26 +543,21 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > - I915_WAIT_INTERRUPTIBLE, > - MAX_SCHEDULE_TIMEOUT); > - if (ret) > - return ret; > - > ret = i915_gem_object_pin_pages(obj); > if (ret) > return ret; > > if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || > - !static_cpu_has(X86_FEATURE_CLFLUSH)) { > - ret = i915_gem_object_set_to_cpu_domain(obj, false); > - if (ret) > - goto err_unpin; > - else > - goto out; > - } > + !static_cpu_has(X86_FEATURE_CLFLUSH)) > + i915_gem_object_set_to_cpu_domain(obj, false); > + else > + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > + ret = i915_gem_object_wait(obj, > + I915_WAIT_INTERRUPTIBLE, > + MAX_SCHEDULE_TIMEOUT); > + if (ret) > + goto err_unpin; > > /* If we're not in the cpu read 
domain, set ourself into the gtt > * read domain and manually flush cachelines (if required). This > @@ -638,7 +568,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, > !(obj->read_domains & I915_GEM_DOMAIN_CPU)) > *needs_clflush = CLFLUSH_BEFORE; > > -out: > /* return with the pages pinned */ > return 0; > > @@ -658,27 +587,22 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > - I915_WAIT_INTERRUPTIBLE | > - I915_WAIT_ALL, > - MAX_SCHEDULE_TIMEOUT); > - if (ret) > - return ret; > - > ret = i915_gem_object_pin_pages(obj); > if (ret) > return ret; > > if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || > - !static_cpu_has(X86_FEATURE_CLFLUSH)) { > - ret = i915_gem_object_set_to_cpu_domain(obj, true); > - if (ret) > - goto err_unpin; > - else > - goto out; > - } > + !static_cpu_has(X86_FEATURE_CLFLUSH)) > + i915_gem_object_set_to_cpu_domain(obj, true); > + else > + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > + ret = i915_gem_object_wait(obj, > + I915_WAIT_INTERRUPTIBLE | > + I915_WAIT_ALL, > + MAX_SCHEDULE_TIMEOUT); > + if (ret) > + goto err_unpin; > > /* If we're not in the cpu write domain, set ourself into the > * gtt write domain and manually flush cachelines (as required). 
> @@ -696,7 +620,6 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, > *needs_clflush |= CLFLUSH_BEFORE; > } > > -out: > i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); > obj->mm.dirty = true; > /* return with the pages pinned */ > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > index 297143511f99..40fda9e81a78 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > @@ -1212,9 +1212,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, > if (use_cpu_reloc(cache, obj)) > return NULL; > > - err = i915_gem_object_set_to_gtt_domain(obj, true); > - if (err) > - return ERR_PTR(err); > + i915_gem_object_set_to_gtt_domain(obj, true); > > vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0, > PIN_MAPPABLE | > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h > index 2ebd79537aea..8bbc835e70ce 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > @@ -515,12 +515,12 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, > void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj); > void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj); > > -int __must_check > -i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); > -int __must_check > -i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); > -int __must_check > -i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); > +void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, > + bool write); > +void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, > + bool write); > +void i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, > + bool write); > struct i915_vma * __must_check > 
i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > struct i915_gem_ww_ctx *ww, > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > index 0727d0c76aa0..b8f0413bc3b0 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > @@ -188,6 +188,12 @@ struct drm_i915_gem_object { > unsigned int cache_coherent:2; > #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0) > #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1) > + /* > + * Note cache_dirty is only a guide; we know when we have written > + * through the CPU cache, but we do not know when the CPU may have > + * speculatively populated the cache. Before a read via the cache > + * of GPU written memory, we have to cautiously invalidate the cache. > + */ > unsigned int cache_dirty:1; > > /** > diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > index 33dd4e2a1010..d85ca79ac433 100644 > --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > @@ -972,14 +972,6 @@ static int gpu_write(struct intel_context *ce, > u32 dw, > u32 val) > { > - int err; > - > - i915_gem_object_lock(vma->obj, NULL); > - err = i915_gem_object_set_to_gtt_domain(vma->obj, true); > - i915_gem_object_unlock(vma->obj); > - if (err) > - return err; > - > return igt_gpu_fill_dw(ce, vma, dw * sizeof(u32), > vma->size >> PAGE_SHIFT, val); > } > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > index e937b6629019..77ba6d1ef4e4 100644 > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > @@ -90,8 +90,13 @@ static int gtt_set(struct context *ctx, unsigned long offset, u32 v) > int err = 0; > > i915_gem_object_lock(ctx->obj, NULL); > - err = 
i915_gem_object_set_to_gtt_domain(ctx->obj, true); > + i915_gem_object_set_to_gtt_domain(ctx->obj, true); > i915_gem_object_unlock(ctx->obj); > + > + err = i915_gem_object_wait(ctx->obj, > + I915_WAIT_ALL | > + I915_WAIT_INTERRUPTIBLE, > + HZ / 2); > if (err) > return err; > > @@ -123,8 +128,12 @@ static int gtt_get(struct context *ctx, unsigned long offset, u32 *v) > int err = 0; > > i915_gem_object_lock(ctx->obj, NULL); > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, false); > + i915_gem_object_set_to_gtt_domain(ctx->obj, false); > i915_gem_object_unlock(ctx->obj); > + > + err = i915_gem_object_wait(ctx->obj, > + I915_WAIT_INTERRUPTIBLE, > + HZ / 2); > if (err) > return err; > > @@ -155,8 +164,13 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v) > int err; > > i915_gem_object_lock(ctx->obj, NULL); > - err = i915_gem_object_set_to_wc_domain(ctx->obj, true); > + i915_gem_object_set_to_wc_domain(ctx->obj, true); > i915_gem_object_unlock(ctx->obj); > + > + err = i915_gem_object_wait(ctx->obj, > + I915_WAIT_ALL | > + I915_WAIT_INTERRUPTIBLE, > + HZ / 2); > if (err) > return err; > > @@ -178,8 +192,12 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v) > int err; > > i915_gem_object_lock(ctx->obj, NULL); > - err = i915_gem_object_set_to_wc_domain(ctx->obj, false); > + i915_gem_object_set_to_wc_domain(ctx->obj, false); > i915_gem_object_unlock(ctx->obj); > + > + err = i915_gem_object_wait(ctx->obj, > + I915_WAIT_INTERRUPTIBLE, > + HZ / 2); > if (err) > return err; > > @@ -205,9 +223,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) > return PTR_ERR(vma); > > i915_gem_object_lock(ctx->obj, NULL); > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); > - if (err) > - goto out_unlock; > + i915_gem_object_set_to_gtt_domain(ctx->obj, false); IIRC Daniel pointed out that this looks odd, since this now becomes write=false for some reason. 
I think keep this as write=true, since it does look like that is what gpu_set wants.
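The core change under discussion is the ordering the patch introduces: perform the domain transition as cheap bookkeeping under the object lock, then do the potentially long wait with the lock dropped. A minimal userspace sketch of that pattern, using POSIX mutexes as a stand-in; none of these `mock_*` names are the real i915 API, and the wait is a placeholder for what `i915_gem_object_wait()` does in the driver:

```c
#include <pthread.h>

/* Hypothetical mock of the pattern this patch applies: the domain
 * transition happens under the object lock, while the potentially
 * long wait for outstanding GPU work is pushed outside the lock,
 * keeping the critical section short. */
struct mock_object {
	pthread_mutex_t lock;      /* stands in for the GEM object lock */
	unsigned int write_domain; /* stands in for obj->write_domain   */
};

enum { MOCK_DOMAIN_CPU = 1, MOCK_DOMAIN_GTT = 2 };

/* The transition itself is cheap: just bookkeeping under the lock. */
static void mock_set_domain(struct mock_object *obj, unsigned int domain)
{
	pthread_mutex_lock(&obj->lock);
	obj->write_domain = domain;
	pthread_mutex_unlock(&obj->lock);
}

/* The wait is the expensive part; after this patch it runs with the
 * lock dropped (a no-op placeholder here). */
static int mock_wait_idle(struct mock_object *obj)
{
	(void)obj;
	return 0; /* pretend all fences have signalled */
}

/* Caller-side ordering mirrors the reworked set-domain ioctl:
 * lock -> transition -> unlock -> wait. */
static int mock_set_domain_and_wait(struct mock_object *obj,
				    unsigned int domain)
{
	mock_set_domain(obj, domain); /* short critical section */
	return mock_wait_idle(obj);   /* wait outside the lock   */
}
```

The benefit sketched here is the same one the patch is after: other threads contending on the object lock are no longer stalled behind a wait that can take an arbitrarily long time.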
* [PATCH v2 12/12] drm/i915/gem: Manage all set-domain waits explicitly 2021-05-26 14:30 ` [Intel-gfx] " Matthew Auld @ 2021-05-26 14:37 ` Tvrtko Ursulin -1 siblings, 0 replies; 38+ messages in thread From: Tvrtko Ursulin @ 2021-05-26 14:37 UTC (permalink / raw) To: Intel-gfx; +Cc: Tvrtko Ursulin, Matthew Auld, dri-devel, Chris Wilson From: Chris Wilson <chris@chris-wilson.co.uk> Only perform the domain transition under the object lock, and push the required waits to outside the lock. v2 (Tvrtko): * Rebase. v3 (Tvrtko): * Restore write to gtt domain in coherency selftest. (Matt) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 9 +- drivers/gpu/drm/i915/gem/i915_gem_clflush.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 163 +++++------------- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 12 +- .../gpu/drm/i915/gem/i915_gem_object_types.h | 6 + .../gpu/drm/i915/gem/selftests/huge_pages.c | 8 - .../i915/gem/selftests/i915_gem_coherency.c | 31 +++- .../drm/i915/gem/selftests/i915_gem_phys.c | 8 +- .../drm/i915/gem/selftests/igt_gem_utils.c | 3 + drivers/gpu/drm/i915/i915_gem.c | 4 +- 12 files changed, 89 insertions(+), 165 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c index daf9284ef1f5..e4c24558eaa8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -51,8 +51,6 @@ static struct clflush *clflush_work_create(struct drm_i915_gem_object *obj) { struct clflush *clflush; - GEM_BUG_ON(!obj->cache_dirty); - clflush = kmalloc(sizeof(*clflush), GFP_KERNEL); if (!clflush) return NULL; @@ -101,13 +99,10 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, 
trace_i915_gem_object_clflush(obj); - clflush = NULL; - if (!(flags & I915_CLFLUSH_SYNC)) - clflush = clflush_work_create(obj); + clflush = clflush_work_create(obj); if (clflush) { i915_sw_fence_await_reservation(&clflush->base.chain, - obj->base.resv, NULL, true, - i915_fence_timeout(to_i915(obj->base.dev)), + obj->base.resv, NULL, true, 0, I915_FENCE_GFP); dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma); dma_fence_work_commit(&clflush->base); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h index e6c382973129..4cd5787d1507 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h @@ -9,12 +9,10 @@ #include <linux/types.h> -struct drm_i915_private; struct drm_i915_gem_object; bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, unsigned int flags); #define I915_CLFLUSH_FORCE BIT(0) -#define I915_CLFLUSH_SYNC BIT(1) #endif /* __I915_GEM_CLFLUSH_H__ */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index ccede73c6465..0926e0895ee6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -132,7 +132,7 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire if (!err) err = i915_gem_object_pin_pages(obj); if (!err) { - err = i915_gem_object_set_to_cpu_domain(obj, write); + i915_gem_object_set_to_cpu_domain(obj, write); i915_gem_object_unpin_pages(obj); } if (err == -EDEADLK) { @@ -156,7 +156,7 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct if (!err) err = i915_gem_object_pin_pages(obj); if (!err) { - err = i915_gem_object_set_to_gtt_domain(obj, false); + i915_gem_object_set_to_gtt_domain(obj, false); i915_gem_object_unpin_pages(obj); } if (err == -EDEADLK) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 
073822100da7..39fda97c49a7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -49,7 +49,7 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains) break; case I915_GEM_DOMAIN_CPU: - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); + i915_gem_clflush_object(obj, 0); break; case I915_GEM_DOMAIN_RENDER: @@ -97,34 +97,13 @@ void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj) * This function returns when the move is complete, including waiting on * flushes to occur. */ -int +void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) { - int ret; - assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - if (obj->write_domain == I915_GEM_DOMAIN_WC) - return 0; - - /* Flush and acquire obj->pages so that we are coherent through - * direct access in memory with previous cached writes through - * shmemfs and that our cache domain tracking remains valid. - * For example, if the obj->filp was moved to swap without us - * being notified and releasing the pages, we would mistakenly - * continue to assume that the obj remained out of the CPU cached - * domain. - */ - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; + return; flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); @@ -145,9 +124,6 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) obj->write_domain = I915_GEM_DOMAIN_WC; obj->mm.dirty = true; } - - i915_gem_object_unpin_pages(obj); - return 0; } /** @@ -158,34 +134,13 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) * This function returns when the move is complete, including waiting on * flushes to occur. 
*/ -int +void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) { - int ret; - assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - if (obj->write_domain == I915_GEM_DOMAIN_GTT) - return 0; - - /* Flush and acquire obj->pages so that we are coherent through - * direct access in memory with previous cached writes through - * shmemfs and that our cache domain tracking remains valid. - * For example, if the obj->filp was moved to swap without us - * being notified and releasing the pages, we would mistakenly - * continue to assume that the obj remained out of the CPU cached - * domain. - */ - ret = i915_gem_object_pin_pages(obj); - if (ret) - return ret; + return; flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); @@ -214,9 +169,6 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) i915_vma_set_ggtt_write(vma); spin_unlock(&obj->vma.lock); } - - i915_gem_object_unpin_pages(obj); - return 0; } /** @@ -431,25 +383,23 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, * This function returns when the move is complete, including waiting on * flushes to occur. */ -int +void i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) { - int ret; - assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - (write ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); /* Flush the CPU cache if it's still invalid. */ if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) { - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); + /* + * While we track when we write though the CPU cache + * (with obj->cache_dirty), this is only a guide as we do + * not know when the CPU may have speculatively populated + * the cache. 
We have to invalidate such speculative cachelines + * prior to reading writes by the GPU. + */ + i915_gem_clflush_object(obj, 0); obj->read_domains |= I915_GEM_DOMAIN_CPU; } @@ -463,8 +413,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) */ if (write) __start_cpu_write(obj); - - return 0; } /** @@ -502,32 +450,14 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, if (!obj) return -ENOENT; - /* - * Try to flush the object off the GPU without holding the lock. - * We will repeat the flush holding the lock in the normal manner - * to catch cases where we are gazumped. - */ - err = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_PRIORITY | - (write_domain ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - if (err) - goto out; - if (i915_gem_object_is_userptr(obj)) { /* * Try to grab userptr pages, iris uses set_domain to check * userptr validity */ err = i915_gem_object_userptr_validate(obj); - if (!err) - err = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_PRIORITY | - (write_domain ? I915_WAIT_ALL : 0), - MAX_SCHEDULE_TIMEOUT); - goto out; + if (err) + goto out; } /* @@ -572,11 +502,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, goto out_unpin; if (read_domains & I915_GEM_DOMAIN_WC) - err = i915_gem_object_set_to_wc_domain(obj, write_domain); + i915_gem_object_set_to_wc_domain(obj, write_domain); else if (read_domains & I915_GEM_DOMAIN_GTT) - err = i915_gem_object_set_to_gtt_domain(obj, write_domain); + i915_gem_object_set_to_gtt_domain(obj, write_domain); else - err = i915_gem_object_set_to_cpu_domain(obj, write_domain); + i915_gem_object_set_to_cpu_domain(obj, write_domain); out_unpin: i915_gem_object_unpin_pages(obj); @@ -584,6 +514,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, out_unlock: i915_gem_object_unlock(obj); + err = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_PRIORITY | + (write_domain ? 
I915_WAIT_ALL : 0), + MAX_SCHEDULE_TIMEOUT); if (!err && write_domain) i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); @@ -608,26 +543,21 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - ret = i915_gem_object_pin_pages(obj); if (ret) return ret; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, false); - if (ret) - goto err_unpin; - else - goto out; - } + !static_cpu_has(X86_FEATURE_CLFLUSH)) + i915_gem_object_set_to_cpu_domain(obj, false); + else + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT); + if (ret) + goto err_unpin; /* If we're not in the cpu read domain, set ourself into the gtt * read domain and manually flush cachelines (if required). 
This @@ -638,7 +568,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, !(obj->read_domains & I915_GEM_DOMAIN_CPU)) *needs_clflush = CLFLUSH_BEFORE; -out: /* return with the pages pinned */ return 0; @@ -658,27 +587,22 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, assert_object_held(obj); - ret = i915_gem_object_wait(obj, - I915_WAIT_INTERRUPTIBLE | - I915_WAIT_ALL, - MAX_SCHEDULE_TIMEOUT); - if (ret) - return ret; - ret = i915_gem_object_pin_pages(obj); if (ret) return ret; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { - ret = i915_gem_object_set_to_cpu_domain(obj, true); - if (ret) - goto err_unpin; - else - goto out; - } + !static_cpu_has(X86_FEATURE_CLFLUSH)) + i915_gem_object_set_to_cpu_domain(obj, true); + else + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); + ret = i915_gem_object_wait(obj, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT); + if (ret) + goto err_unpin; /* If we're not in the cpu write domain, set ourself into the * gtt write domain and manually flush cachelines (as required). 
* [Intel-gfx] [PATCH v2 12/12] drm/i915/gem: Manage all set-domain waits explicitly
@ 2021-05-26 14:37 ` Tvrtko Ursulin
  0 siblings, 0 replies; 38+ messages in thread
From: Tvrtko Ursulin @ 2021-05-26 14:37 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Matthew Auld, dri-devel, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

Only perform the domain transition under the object lock, and push the
required waits to outside the lock.

v2 (Tvrtko):
 * Rebase.

v3 (Tvrtko):
 * Restore write to gtt domain in coherency selftest. (Matt)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   9 +-
 drivers/gpu/drm/i915/gem/i915_gem_clflush.h   |   2 -
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 163 +++++-------------
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  12 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |   8 -
 .../i915/gem/selftests/i915_gem_coherency.c   |  31 +++-
 .../drm/i915/gem/selftests/i915_gem_phys.c    |   8 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c    |   3 +
 drivers/gpu/drm/i915/i915_gem.c               |   4 +-
 12 files changed, 89 insertions(+), 165 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index daf9284ef1f5..e4c24558eaa8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -51,8 +51,6 @@ static struct clflush *clflush_work_create(struct drm_i915_gem_object *obj)
 {
         struct clflush *clflush;

-        GEM_BUG_ON(!obj->cache_dirty);
-
         clflush = kmalloc(sizeof(*clflush), GFP_KERNEL);
         if (!clflush)
                 return NULL;
@@ -101,13 +99,10 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,

         trace_i915_gem_object_clflush(obj);

-        clflush = NULL;
-        if (!(flags & I915_CLFLUSH_SYNC))
-                clflush = clflush_work_create(obj);
+        clflush = clflush_work_create(obj);
         if (clflush) {
                 i915_sw_fence_await_reservation(&clflush->base.chain,
-                                                obj->base.resv, NULL, true,
-                                                i915_fence_timeout(to_i915(obj->base.dev)),
+                                                obj->base.resv, NULL, true, 0,
                                                 I915_FENCE_GFP);
                 dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
                 dma_fence_work_commit(&clflush->base);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
index e6c382973129..4cd5787d1507 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h
@@ -9,12 +9,10 @@

 #include <linux/types.h>

-struct drm_i915_private;
 struct drm_i915_gem_object;

 bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
                              unsigned int flags);
 #define I915_CLFLUSH_FORCE BIT(0)
-#define I915_CLFLUSH_SYNC BIT(1)

 #endif /* __I915_GEM_CLFLUSH_H__ */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index ccede73c6465..0926e0895ee6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -132,7 +132,7 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire
         if (!err)
                 err = i915_gem_object_pin_pages(obj);
         if (!err) {
-                err = i915_gem_object_set_to_cpu_domain(obj, write);
+                i915_gem_object_set_to_cpu_domain(obj, write);
                 i915_gem_object_unpin_pages(obj);
         }
         if (err == -EDEADLK) {
@@ -156,7 +156,7 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct
         if (!err)
                 err = i915_gem_object_pin_pages(obj);
         if (!err) {
-                err = i915_gem_object_set_to_gtt_domain(obj, false);
+                i915_gem_object_set_to_gtt_domain(obj, false);
                 i915_gem_object_unpin_pages(obj);
         }
         if (err == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 073822100da7..39fda97c49a7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -49,7 +49,7 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
                 break;

         case I915_GEM_DOMAIN_CPU:
-                i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
+                i915_gem_clflush_object(obj, 0);
                 break;

         case I915_GEM_DOMAIN_RENDER:
@@ -97,34 +97,13 @@ void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj)
  * This function returns when the move is complete, including waiting on
  * flushes to occur.
  */
-int
+void
 i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 {
-        int ret;
-
         assert_object_held(obj);

-        ret = i915_gem_object_wait(obj,
-                                   I915_WAIT_INTERRUPTIBLE |
-                                   (write ? I915_WAIT_ALL : 0),
-                                   MAX_SCHEDULE_TIMEOUT);
-        if (ret)
-                return ret;
-
         if (obj->write_domain == I915_GEM_DOMAIN_WC)
-                return 0;
-
-        /* Flush and acquire obj->pages so that we are coherent through
-         * direct access in memory with previous cached writes through
-         * shmemfs and that our cache domain tracking remains valid.
-         * For example, if the obj->filp was moved to swap without us
-         * being notified and releasing the pages, we would mistakenly
-         * continue to assume that the obj remained out of the CPU cached
-         * domain.
-         */
-        ret = i915_gem_object_pin_pages(obj);
-        if (ret)
-                return ret;
+                return;

         flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);

@@ -145,9 +124,6 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
                 obj->write_domain = I915_GEM_DOMAIN_WC;
                 obj->mm.dirty = true;
         }
-
-        i915_gem_object_unpin_pages(obj);
-        return 0;
 }

 /**
@@ -158,34 +134,13 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
  * This function returns when the move is complete, including waiting on
  * flushes to occur.
  */
-int
+void
 i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 {
-        int ret;
-
         assert_object_held(obj);

-        ret = i915_gem_object_wait(obj,
-                                   I915_WAIT_INTERRUPTIBLE |
-                                   (write ? I915_WAIT_ALL : 0),
-                                   MAX_SCHEDULE_TIMEOUT);
-        if (ret)
-                return ret;
-
         if (obj->write_domain == I915_GEM_DOMAIN_GTT)
-                return 0;
-
-        /* Flush and acquire obj->pages so that we are coherent through
-         * direct access in memory with previous cached writes through
-         * shmemfs and that our cache domain tracking remains valid.
-         * For example, if the obj->filp was moved to swap without us
-         * being notified and releasing the pages, we would mistakenly
-         * continue to assume that the obj remained out of the CPU cached
-         * domain.
-         */
-        ret = i915_gem_object_pin_pages(obj);
-        if (ret)
-                return ret;
+                return;

         flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);

@@ -214,9 +169,6 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
                 i915_vma_set_ggtt_write(vma);
                 spin_unlock(&obj->vma.lock);
         }
-
-        i915_gem_object_unpin_pages(obj);
-        return 0;
 }

 /**
@@ -431,25 +383,23 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
  * This function returns when the move is complete, including waiting on
  * flushes to occur.
  */
-int
+void
 i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 {
-        int ret;
-
         assert_object_held(obj);

-        ret = i915_gem_object_wait(obj,
-                                   I915_WAIT_INTERRUPTIBLE |
-                                   (write ? I915_WAIT_ALL : 0),
-                                   MAX_SCHEDULE_TIMEOUT);
-        if (ret)
-                return ret;
-
         flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);

         /* Flush the CPU cache if it's still invalid. */
         if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) {
-                i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
+                /*
+                 * While we track when we write though the CPU cache
+                 * (with obj->cache_dirty), this is only a guide as we do
+                 * not know when the CPU may have speculatively populated
+                 * the cache. We have to invalidate such speculative cachelines
+                 * prior to reading writes by the GPU.
+                 */
+                i915_gem_clflush_object(obj, 0);
                 obj->read_domains |= I915_GEM_DOMAIN_CPU;
         }

@@ -463,8 +413,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
          */
         if (write)
                 __start_cpu_write(obj);
-
-        return 0;
 }

 /**
@@ -502,32 +450,14 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
         if (!obj)
                 return -ENOENT;

-        /*
-         * Try to flush the object off the GPU without holding the lock.
-         * We will repeat the flush holding the lock in the normal manner
-         * to catch cases where we are gazumped.
-         */
-        err = i915_gem_object_wait(obj,
-                                   I915_WAIT_INTERRUPTIBLE |
-                                   I915_WAIT_PRIORITY |
-                                   (write_domain ? I915_WAIT_ALL : 0),
-                                   MAX_SCHEDULE_TIMEOUT);
-        if (err)
-                goto out;
-
         if (i915_gem_object_is_userptr(obj)) {
                 /*
                  * Try to grab userptr pages, iris uses set_domain to check
                  * userptr validity
                  */
                 err = i915_gem_object_userptr_validate(obj);
-                if (!err)
-                        err = i915_gem_object_wait(obj,
-                                                   I915_WAIT_INTERRUPTIBLE |
-                                                   I915_WAIT_PRIORITY |
-                                                   (write_domain ? I915_WAIT_ALL : 0),
-                                                   MAX_SCHEDULE_TIMEOUT);
-                goto out;
+                if (err)
+                        goto out;
         }

         /*
@@ -572,11 +502,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
                 goto out_unpin;

         if (read_domains & I915_GEM_DOMAIN_WC)
-                err = i915_gem_object_set_to_wc_domain(obj, write_domain);
+                i915_gem_object_set_to_wc_domain(obj, write_domain);
         else if (read_domains & I915_GEM_DOMAIN_GTT)
-                err = i915_gem_object_set_to_gtt_domain(obj, write_domain);
+                i915_gem_object_set_to_gtt_domain(obj, write_domain);
         else
-                err = i915_gem_object_set_to_cpu_domain(obj, write_domain);
+                i915_gem_object_set_to_cpu_domain(obj, write_domain);

 out_unpin:
         i915_gem_object_unpin_pages(obj);
@@ -584,6 +514,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 out_unlock:
         i915_gem_object_unlock(obj);

+        err = i915_gem_object_wait(obj,
+                                   I915_WAIT_INTERRUPTIBLE |
+                                   I915_WAIT_PRIORITY |
+                                   (write_domain ? I915_WAIT_ALL : 0),
+                                   MAX_SCHEDULE_TIMEOUT);
         if (!err && write_domain)
                 i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU);

@@ -608,26 +543,21 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj,

         assert_object_held(obj);

-        ret = i915_gem_object_wait(obj,
-                                   I915_WAIT_INTERRUPTIBLE,
-                                   MAX_SCHEDULE_TIMEOUT);
-        if (ret)
-                return ret;
-
         ret = i915_gem_object_pin_pages(obj);
         if (ret)
                 return ret;

         if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ ||
-            !static_cpu_has(X86_FEATURE_CLFLUSH)) {
-                ret = i915_gem_object_set_to_cpu_domain(obj, false);
-                if (ret)
-                        goto err_unpin;
-                else
-                        goto out;
-        }
+            !static_cpu_has(X86_FEATURE_CLFLUSH))
+                i915_gem_object_set_to_cpu_domain(obj, false);
+        else
+                flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);

-        flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+        ret = i915_gem_object_wait(obj,
+                                   I915_WAIT_INTERRUPTIBLE,
+                                   MAX_SCHEDULE_TIMEOUT);
+        if (ret)
+                goto err_unpin;

         /* If we're not in the cpu read domain, set ourself into the gtt
          * read domain and manually flush cachelines (if required). This
@@ -638,7 +568,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj,
             !(obj->read_domains & I915_GEM_DOMAIN_CPU))
                 *needs_clflush = CLFLUSH_BEFORE;

-out:
         /* return with the pages pinned */
         return 0;

@@ -658,27 +587,22 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj,

         assert_object_held(obj);

-        ret = i915_gem_object_wait(obj,
-                                   I915_WAIT_INTERRUPTIBLE |
-                                   I915_WAIT_ALL,
-                                   MAX_SCHEDULE_TIMEOUT);
-        if (ret)
-                return ret;
-
         ret = i915_gem_object_pin_pages(obj);
         if (ret)
                 return ret;

         if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE ||
-            !static_cpu_has(X86_FEATURE_CLFLUSH)) {
-                ret = i915_gem_object_set_to_cpu_domain(obj, true);
-                if (ret)
-                        goto err_unpin;
-                else
-                        goto out;
-        }
+            !static_cpu_has(X86_FEATURE_CLFLUSH))
+                i915_gem_object_set_to_cpu_domain(obj, true);
+        else
+                flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);

-        flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+        ret = i915_gem_object_wait(obj,
+                                   I915_WAIT_INTERRUPTIBLE |
+                                   I915_WAIT_ALL,
+                                   MAX_SCHEDULE_TIMEOUT);
+        if (ret)
+                goto err_unpin;

         /* If we're not in the cpu write domain, set ourself into the
          * gtt write domain and manually flush cachelines (as required).
@@ -696,7 +620,6 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj,
                 *needs_clflush |= CLFLUSH_BEFORE;
         }

-out:
         i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU);
         obj->mm.dirty = true;

         /* return with the pages pinned */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 297143511f99..40fda9e81a78 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1212,9 +1212,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
         if (use_cpu_reloc(cache, obj))
                 return NULL;

-        err = i915_gem_object_set_to_gtt_domain(obj, true);
-        if (err)
-                return ERR_PTR(err);
+        i915_gem_object_set_to_gtt_domain(obj, true);

         vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0,
                                           PIN_MAPPABLE |
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 2ebd79537aea..8bbc835e70ce 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -515,12 +515,12 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj);

-int __must_check
-i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
-int __must_check
-i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write);
-int __must_check
-i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
+void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj,
+                                      bool write);
+void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj,
+                                       bool write);
+void i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj,
+                                       bool write);

 struct i915_vma * __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
                                      struct i915_gem_ww_ctx *ww,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 0727d0c76aa0..b8f0413bc3b0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -188,6 +188,12 @@ struct drm_i915_gem_object {
         unsigned int cache_coherent:2;
 #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0)
 #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1)
+        /*
+         * Note cache_dirty is only a guide; we know when we have written
+         * through the CPU cache, but we do not know when the CPU may have
+         * speculatively populated the cache. Before a read via the cache
+         * of GPU written memory, we have to cautiously invalidate the cache.
+         */
         unsigned int cache_dirty:1;

         /**
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 33dd4e2a1010..d85ca79ac433 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -972,14 +972,6 @@ static int gpu_write(struct intel_context *ce,
                      u32 dw, u32 val)
 {
-        int err;
-
-        i915_gem_object_lock(vma->obj, NULL);
-        err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
-        i915_gem_object_unlock(vma->obj);
-        if (err)
-                return err;
-
         return igt_gpu_fill_dw(ce, vma, dw * sizeof(u32),
                                vma->size >> PAGE_SHIFT, val);
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index e937b6629019..6a5a7a7fbae2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -90,8 +90,13 @@ static int gtt_set(struct context *ctx, unsigned long offset, u32 v)
         int err = 0;

         i915_gem_object_lock(ctx->obj, NULL);
-        err = i915_gem_object_set_to_gtt_domain(ctx->obj, true);
+        i915_gem_object_set_to_gtt_domain(ctx->obj, true);
         i915_gem_object_unlock(ctx->obj);
+
+        err = i915_gem_object_wait(ctx->obj,
+                                   I915_WAIT_ALL |
+                                   I915_WAIT_INTERRUPTIBLE,
+                                   HZ / 2);
         if (err)
                 return err;

@@ -123,8 +128,12 @@ static int gtt_get(struct context *ctx, unsigned long offset, u32 *v)
         int err = 0;

         i915_gem_object_lock(ctx->obj, NULL);
-        err = i915_gem_object_set_to_gtt_domain(ctx->obj, false);
+        i915_gem_object_set_to_gtt_domain(ctx->obj, false);
         i915_gem_object_unlock(ctx->obj);
+
+        err = i915_gem_object_wait(ctx->obj,
+                                   I915_WAIT_INTERRUPTIBLE,
+                                   HZ / 2);
         if (err)
                 return err;

@@ -155,8 +164,13 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v)
         int err;

         i915_gem_object_lock(ctx->obj, NULL);
-        err = i915_gem_object_set_to_wc_domain(ctx->obj, true);
+        i915_gem_object_set_to_wc_domain(ctx->obj, true);
         i915_gem_object_unlock(ctx->obj);
+
+        err = i915_gem_object_wait(ctx->obj,
+                                   I915_WAIT_ALL |
+                                   I915_WAIT_INTERRUPTIBLE,
+                                   HZ / 2);
         if (err)
                 return err;

@@ -178,8 +192,12 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v)
         int err;

         i915_gem_object_lock(ctx->obj, NULL);
-        err = i915_gem_object_set_to_wc_domain(ctx->obj, false);
+        i915_gem_object_set_to_wc_domain(ctx->obj, false);
         i915_gem_object_unlock(ctx->obj);
+
+        err = i915_gem_object_wait(ctx->obj,
+                                   I915_WAIT_INTERRUPTIBLE,
+                                   HZ / 2);
         if (err)
                 return err;

@@ -205,9 +223,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v)
                 return PTR_ERR(vma);

         i915_gem_object_lock(ctx->obj, NULL);
-        err = i915_gem_object_set_to_gtt_domain(ctx->obj, true);
-        if (err)
-                goto out_unlock;
+        i915_gem_object_set_to_gtt_domain(ctx->obj, true);

         rq = intel_engine_create_kernel_request(ctx->engine);
         if (IS_ERR(rq)) {
@@ -247,7 +263,6 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v)
         i915_request_add(rq);
 out_unpin:
         i915_vma_unpin(vma);
-out_unlock:
         i915_gem_object_unlock(ctx->obj);

         return err;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
index 3a6ce87f8b52..4d7580762acc 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
@@ -53,14 +53,10 @@ static int mock_phys_object(void *arg)

         /* Make the object dirty so that put_pages must do copy back the data */
         i915_gem_object_lock(obj, NULL);
-        err = i915_gem_object_set_to_gtt_domain(obj, true);
+        i915_gem_object_set_to_gtt_domain(obj, true);
         i915_gem_object_unlock(obj);
-        if (err) {
-                pr_err("i915_gem_object_set_to_gtt_domain failed with err=%d\n",
-                       err);
-                goto out_obj;
-        }
+        err = 0;

 out_obj:
         i915_gem_object_put(obj);
 out:
diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
index 0b092c62bb34..ba8c06778b6c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
+++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
@@ -7,6 +7,7 @@
 #include "igt_gem_utils.h"

 #include "gem/i915_gem_context.h"
+#include "gem/i915_gem_clflush.h"
 #include "gem/i915_gem_pm.h"
 #include "gt/intel_context.h"
 #include "gt/intel_gpu_commands.h"
@@ -138,6 +139,8 @@ int igt_gpu_fill_dw(struct intel_context *ce,
                 goto skip_request;

         i915_vma_lock(vma);
+        if (vma->obj->cache_dirty & ~vma->obj->cache_coherent)
+                i915_gem_clflush_object(vma->obj, 0);
         err = i915_request_await_object(rq, vma->obj, true);
         if (err == 0)
                 err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cffd7f4f87dc..dbb983970f34 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -301,9 +301,7 @@ static struct i915_vma *i915_gem_gtt_prepare(struct drm_i915_gem_object *obj,
         if (ret)
                 goto err_ww;

-        ret = i915_gem_object_set_to_gtt_domain(obj, write);
-        if (ret)
-                goto err_ww;
+        i915_gem_object_set_to_gtt_domain(obj, write);

         if (!i915_gem_object_is_tiled(obj))
                 vma = i915_gem_object_ggtt_pin_ww(obj, &ww, NULL, 0, 0,
--
2.30.2
* Re: [PATCH 12/12] drm/i915/gem: Manage all set-domain waits explicitly
  2021-05-26 14:30 ` [Intel-gfx] " Matthew Auld
@ 2021-05-27 10:44 ` Daniel Vetter
  -1 siblings, 0 replies; 38+ messages in thread
From: Daniel Vetter @ 2021-05-27 10:44 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Tvrtko Ursulin, Intel-gfx, Chris Wilson, dri-devel, Tvrtko Ursulin

On Wed, May 26, 2021 at 03:30:57PM +0100, Matthew Auld wrote:
> On 26/05/2021 15:14, Tvrtko Ursulin wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Only perform the domain transition under the object lock, and push the
> > required waits to outside the lock.

Do we have actual performance data justifying this? Anything else
justifying this?

If we resurrect patches, I do expect actual review to happen here, and
that's not even close to the case for this patch here at least. I didn't
bother looking at the others.
-Daniel

> >
> > v2 (Tvrtko):
> >  * Rebase.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
This > > @@ -638,7 +568,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, > > !(obj->read_domains & I915_GEM_DOMAIN_CPU)) > > *needs_clflush = CLFLUSH_BEFORE; > > -out: > > /* return with the pages pinned */ > > return 0; > > @@ -658,27 +587,22 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE | > > - I915_WAIT_ALL, > > - MAX_SCHEDULE_TIMEOUT); > > - if (ret) > > - return ret; > > - > > ret = i915_gem_object_pin_pages(obj); > > if (ret) > > return ret; > > if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || > > - !static_cpu_has(X86_FEATURE_CLFLUSH)) { > > - ret = i915_gem_object_set_to_cpu_domain(obj, true); > > - if (ret) > > - goto err_unpin; > > - else > > - goto out; > > - } > > + !static_cpu_has(X86_FEATURE_CLFLUSH)) > > + i915_gem_object_set_to_cpu_domain(obj, true); > > + else > > + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > + ret = i915_gem_object_wait(obj, > > + I915_WAIT_INTERRUPTIBLE | > > + I915_WAIT_ALL, > > + MAX_SCHEDULE_TIMEOUT); > > + if (ret) > > + goto err_unpin; > > /* If we're not in the cpu write domain, set ourself into the > > * gtt write domain and manually flush cachelines (as required). 
> > @@ -696,7 +620,6 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, > > *needs_clflush |= CLFLUSH_BEFORE; > > } > > -out: > > i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); > > obj->mm.dirty = true; > > /* return with the pages pinned */ > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > > index 297143511f99..40fda9e81a78 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > > @@ -1212,9 +1212,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, > > if (use_cpu_reloc(cache, obj)) > > return NULL; > > - err = i915_gem_object_set_to_gtt_domain(obj, true); > > - if (err) > > - return ERR_PTR(err); > > + i915_gem_object_set_to_gtt_domain(obj, true); > > vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0, > > PIN_MAPPABLE | > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > index 2ebd79537aea..8bbc835e70ce 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > @@ -515,12 +515,12 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, > > void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj); > > void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj); > > -int __must_check > > -i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); > > -int __must_check > > -i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); > > -int __must_check > > -i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); > > +void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, > > + bool write); > > +void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, > > + bool write); > > +void i915_gem_object_set_to_cpu_domain(struct 
drm_i915_gem_object *obj, > > + bool write); > > struct i915_vma * __must_check > > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > struct i915_gem_ww_ctx *ww, > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > > index 0727d0c76aa0..b8f0413bc3b0 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > > @@ -188,6 +188,12 @@ struct drm_i915_gem_object { > > unsigned int cache_coherent:2; > > #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0) > > #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1) > > + /* > > + * Note cache_dirty is only a guide; we know when we have written > > + * through the CPU cache, but we do not know when the CPU may have > > + * speculatively populated the cache. Before a read via the cache > > + * of GPU written memory, we have to cautiously invalidate the cache. > > + */ > > unsigned int cache_dirty:1; > > /** > > diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > > index 33dd4e2a1010..d85ca79ac433 100644 > > --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > > +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > > @@ -972,14 +972,6 @@ static int gpu_write(struct intel_context *ce, > > u32 dw, > > u32 val) > > { > > - int err; > > - > > - i915_gem_object_lock(vma->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(vma->obj, true); > > - i915_gem_object_unlock(vma->obj); > > - if (err) > > - return err; > > - > > return igt_gpu_fill_dw(ce, vma, dw * sizeof(u32), > > vma->size >> PAGE_SHIFT, val); > > } > > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > > index e937b6629019..77ba6d1ef4e4 100644 > > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > > @@ 
-90,8 +90,13 @@ static int gtt_set(struct context *ctx, unsigned long offset, u32 v) > > int err = 0; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); > > + i915_gem_object_set_to_gtt_domain(ctx->obj, true); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_ALL | > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -123,8 +128,12 @@ static int gtt_get(struct context *ctx, unsigned long offset, u32 *v) > > int err = 0; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, false); > > + i915_gem_object_set_to_gtt_domain(ctx->obj, false); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -155,8 +164,13 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v) > > int err; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_wc_domain(ctx->obj, true); > > + i915_gem_object_set_to_wc_domain(ctx->obj, true); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_ALL | > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -178,8 +192,12 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v) > > int err; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_wc_domain(ctx->obj, false); > > + i915_gem_object_set_to_wc_domain(ctx->obj, false); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -205,9 +223,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) > > return PTR_ERR(vma); > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); > > - if 
(err) > > - goto out_unlock; > > + i915_gem_object_set_to_gtt_domain(ctx->obj, false); > > IIRC Daniel pointed out that this looks odd, since this now becomes > write=false for some reason. I think keep this as write=true, since it does > look like that is what gpu_set wants. -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Intel-gfx] [PATCH 12/12] drm/i915/gem: Manage all set-domain waits explicitly @ 2021-05-27 10:44 ` Daniel Vetter 0 siblings, 0 replies; 38+ messages in thread From: Daniel Vetter @ 2021-05-27 10:44 UTC (permalink / raw) To: Matthew Auld; +Cc: Intel-gfx, Chris Wilson, dri-devel On Wed, May 26, 2021 at 03:30:57PM +0100, Matthew Auld wrote: > On 26/05/2021 15:14, Tvrtko Ursulin wrote: > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > Only perform the domain transition under the object lock, and push the > > required waits to outside the lock. Do we have actual performance data justifying this? Anything else justifying this? If we resurrect patches, I do expect actual review happens here, and that's not even close the case for this patch here at least. I didn't bother looking at the others. -Daniel > > > > v2 (Tvrtko): > > * Rebase. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1 > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > --- > > drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 9 +- > > drivers/gpu/drm/i915/gem/i915_gem_clflush.h | 2 - > > drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 4 +- > > drivers/gpu/drm/i915/gem/i915_gem_domain.c | 163 +++++------------- > > .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 4 +- > > drivers/gpu/drm/i915/gem/i915_gem_object.h | 12 +- > > .../gpu/drm/i915/gem/i915_gem_object_types.h | 6 + > > .../gpu/drm/i915/gem/selftests/huge_pages.c | 8 - > > .../i915/gem/selftests/i915_gem_coherency.c | 31 +++- > > .../drm/i915/gem/selftests/i915_gem_phys.c | 8 +- > > .../drm/i915/gem/selftests/igt_gem_utils.c | 3 + > > drivers/gpu/drm/i915/i915_gem.c | 4 +- > > 12 files changed, 89 insertions(+), 165 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c > > index daf9284ef1f5..e4c24558eaa8 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c > > +++ 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c > > @@ -51,8 +51,6 @@ static struct clflush *clflush_work_create(struct drm_i915_gem_object *obj) > > { > > struct clflush *clflush; > > - GEM_BUG_ON(!obj->cache_dirty); > > - > > clflush = kmalloc(sizeof(*clflush), GFP_KERNEL); > > if (!clflush) > > return NULL; > > @@ -101,13 +99,10 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, > > trace_i915_gem_object_clflush(obj); > > - clflush = NULL; > > - if (!(flags & I915_CLFLUSH_SYNC)) > > - clflush = clflush_work_create(obj); > > + clflush = clflush_work_create(obj); > > if (clflush) { > > i915_sw_fence_await_reservation(&clflush->base.chain, > > - obj->base.resv, NULL, true, > > - i915_fence_timeout(to_i915(obj->base.dev)), > > + obj->base.resv, NULL, true, 0, > > I915_FENCE_GFP); > > dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma); > > dma_fence_work_commit(&clflush->base); > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h > > index e6c382973129..4cd5787d1507 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.h > > @@ -9,12 +9,10 @@ > > #include <linux/types.h> > > -struct drm_i915_private; > > struct drm_i915_gem_object; > > bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, > > unsigned int flags); > > #define I915_CLFLUSH_FORCE BIT(0) > > -#define I915_CLFLUSH_SYNC BIT(1) > > #endif /* __I915_GEM_CLFLUSH_H__ */ > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > > index ccede73c6465..0926e0895ee6 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > > @@ -132,7 +132,7 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire > > if (!err) > > err = i915_gem_object_pin_pages(obj); > > if (!err) { > > - err = i915_gem_object_set_to_cpu_domain(obj, write); > > + 
i915_gem_object_set_to_cpu_domain(obj, write); > > i915_gem_object_unpin_pages(obj); > > } > > if (err == -EDEADLK) { > > @@ -156,7 +156,7 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct > > if (!err) > > err = i915_gem_object_pin_pages(obj); > > if (!err) { > > - err = i915_gem_object_set_to_gtt_domain(obj, false); > > + i915_gem_object_set_to_gtt_domain(obj, false); > > i915_gem_object_unpin_pages(obj); > > } > > if (err == -EDEADLK) { > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c > > index 073822100da7..39fda97c49a7 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c > > @@ -49,7 +49,7 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains) > > break; > > case I915_GEM_DOMAIN_CPU: > > - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); > > + i915_gem_clflush_object(obj, 0); > > break; > > case I915_GEM_DOMAIN_RENDER: > > @@ -97,34 +97,13 @@ void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj) > > * This function returns when the move is complete, including waiting on > > * flushes to occur. > > */ > > -int > > +void > > i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) > > { > > - int ret; > > - > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE | > > - (write ? I915_WAIT_ALL : 0), > > - MAX_SCHEDULE_TIMEOUT); > > - if (ret) > > - return ret; > > - > > if (obj->write_domain == I915_GEM_DOMAIN_WC) > > - return 0; > > - > > - /* Flush and acquire obj->pages so that we are coherent through > > - * direct access in memory with previous cached writes through > > - * shmemfs and that our cache domain tracking remains valid. 
> > - * For example, if the obj->filp was moved to swap without us > > - * being notified and releasing the pages, we would mistakenly > > - * continue to assume that the obj remained out of the CPU cached > > - * domain. > > - */ > > - ret = i915_gem_object_pin_pages(obj); > > - if (ret) > > - return ret; > > + return; > > flush_write_domain(obj, ~I915_GEM_DOMAIN_WC); > > @@ -145,9 +124,6 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) > > obj->write_domain = I915_GEM_DOMAIN_WC; > > obj->mm.dirty = true; > > } > > - > > - i915_gem_object_unpin_pages(obj); > > - return 0; > > } > > /** > > @@ -158,34 +134,13 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write) > > * This function returns when the move is complete, including waiting on > > * flushes to occur. > > */ > > -int > > +void > > i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > { > > - int ret; > > - > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE | > > - (write ? I915_WAIT_ALL : 0), > > - MAX_SCHEDULE_TIMEOUT); > > - if (ret) > > - return ret; > > - > > if (obj->write_domain == I915_GEM_DOMAIN_GTT) > > - return 0; > > - > > - /* Flush and acquire obj->pages so that we are coherent through > > - * direct access in memory with previous cached writes through > > - * shmemfs and that our cache domain tracking remains valid. > > - * For example, if the obj->filp was moved to swap without us > > - * being notified and releasing the pages, we would mistakenly > > - * continue to assume that the obj remained out of the CPU cached > > - * domain. 
> > - */ > > - ret = i915_gem_object_pin_pages(obj); > > - if (ret) > > - return ret; > > + return; > > flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT); > > @@ -214,9 +169,6 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > i915_vma_set_ggtt_write(vma); > > spin_unlock(&obj->vma.lock); > > } > > - > > - i915_gem_object_unpin_pages(obj); > > - return 0; > > } > > /** > > @@ -431,25 +383,23 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > * This function returns when the move is complete, including waiting on > > * flushes to occur. > > */ > > -int > > +void > > i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) > > { > > - int ret; > > - > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE | > > - (write ? I915_WAIT_ALL : 0), > > - MAX_SCHEDULE_TIMEOUT); > > - if (ret) > > - return ret; > > - > > flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > /* Flush the CPU cache if it's still invalid. */ > > if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) { > > - i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); > > + /* > > + * While we track when we write though the CPU cache > > + * (with obj->cache_dirty), this is only a guide as we do > > + * not know when the CPU may have speculatively populated > > + * the cache. We have to invalidate such speculative cachelines > > + * prior to reading writes by the GPU. > > + */ > > + i915_gem_clflush_object(obj, 0); > > obj->read_domains |= I915_GEM_DOMAIN_CPU; > > } > > @@ -463,8 +413,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write) > > */ > > if (write) > > __start_cpu_write(obj); > > - > > - return 0; > > } > > /** > > @@ -502,32 +450,14 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, > > if (!obj) > > return -ENOENT; > > - /* > > - * Try to flush the object off the GPU without holding the lock. 
> > - * We will repeat the flush holding the lock in the normal manner > > - * to catch cases where we are gazumped. > > - */ > > - err = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE | > > - I915_WAIT_PRIORITY | > > - (write_domain ? I915_WAIT_ALL : 0), > > - MAX_SCHEDULE_TIMEOUT); > > - if (err) > > - goto out; > > - > > if (i915_gem_object_is_userptr(obj)) { > > /* > > * Try to grab userptr pages, iris uses set_domain to check > > * userptr validity > > */ > > err = i915_gem_object_userptr_validate(obj); > > - if (!err) > > - err = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE | > > - I915_WAIT_PRIORITY | > > - (write_domain ? I915_WAIT_ALL : 0), > > - MAX_SCHEDULE_TIMEOUT); > > - goto out; > > + if (err) > > + goto out; > > } > > /* > > @@ -572,11 +502,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, > > goto out_unpin; > > if (read_domains & I915_GEM_DOMAIN_WC) > > - err = i915_gem_object_set_to_wc_domain(obj, write_domain); > > + i915_gem_object_set_to_wc_domain(obj, write_domain); > > else if (read_domains & I915_GEM_DOMAIN_GTT) > > - err = i915_gem_object_set_to_gtt_domain(obj, write_domain); > > + i915_gem_object_set_to_gtt_domain(obj, write_domain); > > else > > - err = i915_gem_object_set_to_cpu_domain(obj, write_domain); > > + i915_gem_object_set_to_cpu_domain(obj, write_domain); > > out_unpin: > > i915_gem_object_unpin_pages(obj); > > @@ -584,6 +514,11 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, > > out_unlock: > > i915_gem_object_unlock(obj); > > + err = i915_gem_object_wait(obj, > > + I915_WAIT_INTERRUPTIBLE | > > + I915_WAIT_PRIORITY | > > + (write_domain ? 
I915_WAIT_ALL : 0), > > + MAX_SCHEDULE_TIMEOUT); > > if (!err && write_domain) > > i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); > > @@ -608,26 +543,21 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE, > > - MAX_SCHEDULE_TIMEOUT); > > - if (ret) > > - return ret; > > - > > ret = i915_gem_object_pin_pages(obj); > > if (ret) > > return ret; > > if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || > > - !static_cpu_has(X86_FEATURE_CLFLUSH)) { > > - ret = i915_gem_object_set_to_cpu_domain(obj, false); > > - if (ret) > > - goto err_unpin; > > - else > > - goto out; > > - } > > + !static_cpu_has(X86_FEATURE_CLFLUSH)) > > + i915_gem_object_set_to_cpu_domain(obj, false); > > + else > > + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > + ret = i915_gem_object_wait(obj, > > + I915_WAIT_INTERRUPTIBLE, > > + MAX_SCHEDULE_TIMEOUT); > > + if (ret) > > + goto err_unpin; > > /* If we're not in the cpu read domain, set ourself into the gtt > > * read domain and manually flush cachelines (if required). 
This > > @@ -638,7 +568,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, > > !(obj->read_domains & I915_GEM_DOMAIN_CPU)) > > *needs_clflush = CLFLUSH_BEFORE; > > -out: > > /* return with the pages pinned */ > > return 0; > > @@ -658,27 +587,22 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, > > assert_object_held(obj); > > - ret = i915_gem_object_wait(obj, > > - I915_WAIT_INTERRUPTIBLE | > > - I915_WAIT_ALL, > > - MAX_SCHEDULE_TIMEOUT); > > - if (ret) > > - return ret; > > - > > ret = i915_gem_object_pin_pages(obj); > > if (ret) > > return ret; > > if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || > > - !static_cpu_has(X86_FEATURE_CLFLUSH)) { > > - ret = i915_gem_object_set_to_cpu_domain(obj, true); > > - if (ret) > > - goto err_unpin; > > - else > > - goto out; > > - } > > + !static_cpu_has(X86_FEATURE_CLFLUSH)) > > + i915_gem_object_set_to_cpu_domain(obj, true); > > + else > > + flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > - flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU); > > + ret = i915_gem_object_wait(obj, > > + I915_WAIT_INTERRUPTIBLE | > > + I915_WAIT_ALL, > > + MAX_SCHEDULE_TIMEOUT); > > + if (ret) > > + goto err_unpin; > > /* If we're not in the cpu write domain, set ourself into the > > * gtt write domain and manually flush cachelines (as required). 
> > @@ -696,7 +620,6 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, > > *needs_clflush |= CLFLUSH_BEFORE; > > } > > -out: > > i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); > > obj->mm.dirty = true; > > /* return with the pages pinned */ > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > > index 297143511f99..40fda9e81a78 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > > @@ -1212,9 +1212,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, > > if (use_cpu_reloc(cache, obj)) > > return NULL; > > - err = i915_gem_object_set_to_gtt_domain(obj, true); > > - if (err) > > - return ERR_PTR(err); > > + i915_gem_object_set_to_gtt_domain(obj, true); > > vma = i915_gem_object_ggtt_pin_ww(obj, &eb->ww, NULL, 0, 0, > > PIN_MAPPABLE | > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > index 2ebd79537aea..8bbc835e70ce 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > @@ -515,12 +515,12 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj, > > void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj); > > void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj); > > -int __must_check > > -i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write); > > -int __must_check > > -i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); > > -int __must_check > > -i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); > > +void i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, > > + bool write); > > +void i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, > > + bool write); > > +void i915_gem_object_set_to_cpu_domain(struct 
drm_i915_gem_object *obj, > > + bool write); > > struct i915_vma * __must_check > > i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > > struct i915_gem_ww_ctx *ww, > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > > index 0727d0c76aa0..b8f0413bc3b0 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h > > @@ -188,6 +188,12 @@ struct drm_i915_gem_object { > > unsigned int cache_coherent:2; > > #define I915_BO_CACHE_COHERENT_FOR_READ BIT(0) > > #define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1) > > + /* > > + * Note cache_dirty is only a guide; we know when we have written > > + * through the CPU cache, but we do not know when the CPU may have > > + * speculatively populated the cache. Before a read via the cache > > + * of GPU written memory, we have to cautiously invalidate the cache. > > + */ > > unsigned int cache_dirty:1; > > /** > > diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > > index 33dd4e2a1010..d85ca79ac433 100644 > > --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > > +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c > > @@ -972,14 +972,6 @@ static int gpu_write(struct intel_context *ce, > > u32 dw, > > u32 val) > > { > > - int err; > > - > > - i915_gem_object_lock(vma->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(vma->obj, true); > > - i915_gem_object_unlock(vma->obj); > > - if (err) > > - return err; > > - > > return igt_gpu_fill_dw(ce, vma, dw * sizeof(u32), > > vma->size >> PAGE_SHIFT, val); > > } > > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > > index e937b6629019..77ba6d1ef4e4 100644 > > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c > > @@ 
-90,8 +90,13 @@ static int gtt_set(struct context *ctx, unsigned long offset, u32 v) > > int err = 0; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); > > + i915_gem_object_set_to_gtt_domain(ctx->obj, true); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_ALL | > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -123,8 +128,12 @@ static int gtt_get(struct context *ctx, unsigned long offset, u32 *v) > > int err = 0; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, false); > > + i915_gem_object_set_to_gtt_domain(ctx->obj, false); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -155,8 +164,13 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v) > > int err; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_wc_domain(ctx->obj, true); > > + i915_gem_object_set_to_wc_domain(ctx->obj, true); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_ALL | > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -178,8 +192,12 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v) > > int err; > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_wc_domain(ctx->obj, false); > > + i915_gem_object_set_to_wc_domain(ctx->obj, false); > > i915_gem_object_unlock(ctx->obj); > > + > > + err = i915_gem_object_wait(ctx->obj, > > + I915_WAIT_INTERRUPTIBLE, > > + HZ / 2); > > if (err) > > return err; > > @@ -205,9 +223,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) > > return PTR_ERR(vma); > > i915_gem_object_lock(ctx->obj, NULL); > > - err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); > > - if 
(err) > > - goto out_unlock; > > + i915_gem_object_set_to_gtt_domain(ctx->obj, false); > > IIRC Daniel pointed out that this looks odd, since this now becomes > write=false for some reason. I think keep this as write=true, since it does > look like that is what gpu_set wants. -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 12/12] drm/i915/gem: Manage all set-domain waits explicitly
  2021-05-27 13:33 ` kernel test robot

From: kernel test robot @ 2021-05-27 13:33 UTC (permalink / raw)
To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5988 bytes --]

Hi Tvrtko,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-tip/drm-tip]
[cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next drm/drm-next v5.13-rc3 next-20210527]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Tvrtko-Ursulin/Catchup-with-a-few-dropped-patches/20210526-221712
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: x86_64-rhel-8.3-kbuiltin (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/6ec1bb3d9c72c92dca67e5dc0246034a83e7b0db
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Tvrtko-Ursulin/Catchup-with-a-few-dropped-patches/20210526-221712
        git checkout 6ec1bb3d9c72c92dca67e5dc0246034a83e7b0db
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/gvt/cmd_parser.c: In function 'shadow_indirect_ctx':
>> drivers/gpu/drm/i915/gvt/cmd_parser.c:3019:6: error: void value not ignored as it ought to be
    3019 |  ret = i915_gem_object_set_to_cpu_domain(obj, false);
         |      ^

vim +3019 drivers/gpu/drm/i915/gvt/cmd_parser.c

be1da7070aeaee Zhi Wang          2016-05-03  2991  
be1da7070aeaee Zhi Wang          2016-05-03  2992  static int shadow_indirect_ctx(struct intel_shadow_wa_ctx *wa_ctx)
be1da7070aeaee Zhi Wang          2016-05-03  2993  {
be1da7070aeaee Zhi Wang          2016-05-03  2994  	int ctx_size = wa_ctx->indirect_ctx.size;
be1da7070aeaee Zhi Wang          2016-05-03  2995  	unsigned long guest_gma = wa_ctx->indirect_ctx.guest_gma;
c10c12558c8bb3 Tina Zhang        2017-03-17  2996  	struct intel_vgpu_workload *workload = container_of(wa_ctx,
c10c12558c8bb3 Tina Zhang        2017-03-17  2997  					struct intel_vgpu_workload,
c10c12558c8bb3 Tina Zhang        2017-03-17  2998  					wa_ctx);
c10c12558c8bb3 Tina Zhang        2017-03-17  2999  	struct intel_vgpu *vgpu = workload->vgpu;
894cf7d1563469 Chris Wilson      2016-10-19  3000  	struct drm_i915_gem_object *obj;
be1da7070aeaee Zhi Wang          2016-05-03  3001  	int ret = 0;
bcd0aeded478f1 Chris Wilson      2016-10-19  3002  	void *map;
be1da7070aeaee Zhi Wang          2016-05-03  3003  
8fde41076f6df5 Chris Wilson      2020-03-04  3004  	obj = i915_gem_object_create_shmem(workload->engine->i915,
894cf7d1563469 Chris Wilson      2016-10-19  3005  					   roundup(ctx_size + CACHELINE_BYTES,
894cf7d1563469 Chris Wilson      2016-10-19  3006  						   PAGE_SIZE));
894cf7d1563469 Chris Wilson      2016-10-19  3007  	if (IS_ERR(obj))
894cf7d1563469 Chris Wilson      2016-10-19  3008  		return PTR_ERR(obj);
be1da7070aeaee Zhi Wang          2016-05-03  3009  
be1da7070aeaee Zhi Wang          2016-05-03  3010  	/* get the va of the shadow batch buffer */
bcd0aeded478f1 Chris Wilson      2016-10-19  3011  	map = i915_gem_object_pin_map(obj, I915_MAP_WB);
bcd0aeded478f1 Chris Wilson      2016-10-19  3012  	if (IS_ERR(map)) {
695fbc08d80f93 Tina Zhang        2017-03-10  3013  		gvt_vgpu_err("failed to vmap shadow indirect ctx\n");
bcd0aeded478f1 Chris Wilson      2016-10-19  3014  		ret = PTR_ERR(map);
bcd0aeded478f1 Chris Wilson      2016-10-19  3015  		goto put_obj;
be1da7070aeaee Zhi Wang          2016-05-03  3016  	}
be1da7070aeaee Zhi Wang          2016-05-03  3017  
80f0b679d6f068 Maarten Lankhorst 2020-08-19  3018  	i915_gem_object_lock(obj, NULL);
894cf7d1563469 Chris Wilson      2016-10-19 @3019  	ret = i915_gem_object_set_to_cpu_domain(obj, false);
6951e5893b4821 Chris Wilson      2019-05-28  3020  	i915_gem_object_unlock(obj);
be1da7070aeaee Zhi Wang          2016-05-03  3021  	if (ret) {
695fbc08d80f93 Tina Zhang        2017-03-10  3022  		gvt_vgpu_err("failed to set shadow indirect ctx to CPU\n");
be1da7070aeaee Zhi Wang          2016-05-03  3023  		goto unmap_src;
be1da7070aeaee Zhi Wang          2016-05-03  3024  	}
be1da7070aeaee Zhi Wang          2016-05-03  3025  
c10c12558c8bb3 Tina Zhang        2017-03-17  3026  	ret = copy_gma_to_hva(workload->vgpu,
c10c12558c8bb3 Tina Zhang        2017-03-17  3027  			      workload->vgpu->gtt.ggtt_mm,
bcd0aeded478f1 Chris Wilson      2016-10-19  3028  			      guest_gma, guest_gma + ctx_size,
bcd0aeded478f1 Chris Wilson      2016-10-19  3029  			      map);
8bcad07a45637f Zhenyu Wang       2017-03-29  3030  	if (ret < 0) {
695fbc08d80f93 Tina Zhang        2017-03-10  3031  		gvt_vgpu_err("fail to copy guest indirect ctx\n");
894cf7d1563469 Chris Wilson      2016-10-19  3032  		goto unmap_src;
be1da7070aeaee Zhi Wang          2016-05-03  3033  	}
be1da7070aeaee Zhi Wang          2016-05-03  3034  
894cf7d1563469 Chris Wilson      2016-10-19  3035  	wa_ctx->indirect_ctx.obj = obj;
bcd0aeded478f1 Chris Wilson      2016-10-19  3036  	wa_ctx->indirect_ctx.shadow_va = map;
be1da7070aeaee Zhi Wang          2016-05-03  3037  	return 0;
be1da7070aeaee Zhi Wang          2016-05-03  3038  
be1da7070aeaee Zhi Wang          2016-05-03  3039  unmap_src:
bcd0aeded478f1 Chris Wilson      2016-10-19  3040  	i915_gem_object_unpin_map(obj);
894cf7d1563469 Chris Wilson      2016-10-19  3041  put_obj:
ffeaf9aaf97b4b fred gao          2017-08-16  3042  	i915_gem_object_put(obj);
be1da7070aeaee Zhi Wang          2016-05-03  3043  	return ret;
be1da7070aeaee Zhi Wang          2016-05-03  3044  }
be1da7070aeaee Zhi Wang          2016-05-03  3045  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 40830 bytes --]
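The error above comes from patch 12 changing i915_gem_object_set_to_cpu_domain() to return void while the GVT call site still assigns and checks its result. A plausible fixup (a sketch only, not part of the posted series) would drop the assignment and the now-dead error path:

```diff
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
@@ static int shadow_indirect_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 	i915_gem_object_lock(obj, NULL);
-	ret = i915_gem_object_set_to_cpu_domain(obj, false);
+	i915_gem_object_set_to_cpu_domain(obj, false);
 	i915_gem_object_unlock(obj);
-	if (ret) {
-		gvt_vgpu_err("failed to set shadow indirect ctx to CPU\n");
-		goto unmap_src;
-	}
```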
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Catchup with a few dropped patches (rev2)
  2021-05-26 20:19 ` Patchwork

From: Patchwork @ 2021-05-26 20:19 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Catchup with a few dropped patches (rev2)
URL   : https://patchwork.freedesktop.org/series/90611/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
06eaaa77f454 drm/i915: Take rcu_read_lock for querying fence's driver/timeline names
699c8e356b0c drm/i915: Remove notion of GEM from i915_gem_shrinker_taints_mutex
794fd65790d7 drm/i915: Lift marking a lock as used to utils
0381b2ec34da drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper
-:25: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_ptr' - possible side-effects?
#25: FILE: drivers/gpu/drm/i915/i915_utils.h:482:
+#define try_cmpxchg64(_ptr, _pold, _new) \
+({ \
+	__typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
+	__typeof__(*(_ptr)) __old = *_old; \
+	__typeof__(*(_ptr)) __cur = cmpxchg64(_ptr, __old, _new); \
+	bool success = __cur == __old; \
+	if (unlikely(!success)) \
+		*_old = __cur; \
+	likely(success); \
+})

-:42: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_ptr' - possible side-effects?
#42: FILE: drivers/gpu/drm/i915/i915_utils.h:499:
+#define xchg64(_ptr, _new) \
+({ \
+	__typeof__(_ptr) __ptr = (_ptr); \
+	__typeof__(*(_ptr)) __old = *__ptr; \
+	while (!try_cmpxchg64(__ptr, &__old, (_new))) \
+		; \
+	__old; \
+})

total: 0 errors, 0 warnings, 2 checks, 36 lines checked
4c11f7a7ba2a drm/i915/selftests: Set cache status for huge_gem_object
78904915722c drm/i915/selftests: Use a coherent map to setup scratch batch buffers
4b1c2e4e65db drm/i915/selftests: Replace the unbounded set-domain with an explicit wait
b572875c1081 drm/i915/selftests: Remove redundant set-to-gtt-domain
9ac234560778 drm/i915/selftests: Replace unbound set-domain waits with explicit timeouts
cd286f3a5c04 drm/i915/selftests: Replace an unbounded set-domain wait with a timeout
bbfac81d2ec6 drm/i915/selftests: Remove redundant set-to-gtt-domain before batch submission
a26743a5500b drm/i915/gem: Manage all set-domain waits explicitly
* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Catchup with a few dropped patches (rev2)
  2021-05-26 20:21 ` Patchwork

From: Patchwork @ 2021-05-26 20:21 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Catchup with a few dropped patches (rev2)
URL   : https://patchwork.freedesktop.org/series/90611/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/intel_wakeref.c:137:19: warning: context imbalance in 'wakeref_auto_timeout' - unexpected unlock
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y
* [Intel-gfx] ✗ Fi.CI.BAT: failure for Catchup with a few dropped patches (rev2)
  2021-05-26 21:00 ` Patchwork

From: Patchwork @ 2021-05-26 21:00 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx

[-- Attachment #1.1: Type: text/plain, Size: 12882 bytes --]

== Series Details ==

Series: Catchup with a few dropped patches (rev2)
URL   : https://patchwork.freedesktop.org/series/90611/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10138 -> Patchwork_20206
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_20206 absolutely need to be
  verified manually.

  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20206, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_20206:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@workarounds:
    - fi-tgl-u2:          [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-tgl-u2/igt@i915_selftest@live@workarounds.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-tgl-u2/igt@i915_selftest@live@workarounds.html
    - fi-skl-6700k2:      [PASS][3] -> [INCOMPLETE][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-skl-6700k2/igt@i915_selftest@live@workarounds.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-skl-6700k2/igt@i915_selftest@live@workarounds.html
    - fi-icl-y:           [PASS][5] -> [INCOMPLETE][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-icl-y/igt@i915_selftest@live@workarounds.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-icl-y/igt@i915_selftest@live@workarounds.html
    - fi-kbl-x1275:       [PASS][7] -> [INCOMPLETE][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-x1275/igt@i915_selftest@live@workarounds.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-x1275/igt@i915_selftest@live@workarounds.html
    - fi-cfl-guc:         [PASS][9] -> [INCOMPLETE][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cfl-guc/igt@i915_selftest@live@workarounds.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cfl-guc/igt@i915_selftest@live@workarounds.html
    - fi-kbl-7567u:       [PASS][11] -> [INCOMPLETE][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-7567u/igt@i915_selftest@live@workarounds.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-7567u/igt@i915_selftest@live@workarounds.html
    - fi-tgl-y:           [PASS][13] -> [INCOMPLETE][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-tgl-y/igt@i915_selftest@live@workarounds.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-tgl-y/igt@i915_selftest@live@workarounds.html
    - fi-skl-6600u:       [PASS][15] -> [INCOMPLETE][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-skl-6600u/igt@i915_selftest@live@workarounds.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-skl-6600u/igt@i915_selftest@live@workarounds.html
    - fi-glk-dsi:         [PASS][17] -> [INCOMPLETE][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-glk-dsi/igt@i915_selftest@live@workarounds.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-glk-dsi/igt@i915_selftest@live@workarounds.html
    - fi-cfl-8700k:       [PASS][19] -> [INCOMPLETE][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cfl-8700k/igt@i915_selftest@live@workarounds.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cfl-8700k/igt@i915_selftest@live@workarounds.html
    - fi-kbl-r:           [PASS][21] -> [INCOMPLETE][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-r/igt@i915_selftest@live@workarounds.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-r/igt@i915_selftest@live@workarounds.html
    - fi-icl-u2:          [PASS][23] -> [INCOMPLETE][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-icl-u2/igt@i915_selftest@live@workarounds.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-icl-u2/igt@i915_selftest@live@workarounds.html
    - fi-cfl-8109u:       [PASS][25] -> [INCOMPLETE][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cfl-8109u/igt@i915_selftest@live@workarounds.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cfl-8109u/igt@i915_selftest@live@workarounds.html
    - fi-kbl-7500u:       [PASS][27] -> [INCOMPLETE][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-7500u/igt@i915_selftest@live@workarounds.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-7500u/igt@i915_selftest@live@workarounds.html
    - fi-kbl-guc:         [PASS][29] -> [INCOMPLETE][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-guc/igt@i915_selftest@live@workarounds.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-guc/igt@i915_selftest@live@workarounds.html
    - fi-kbl-soraka:      [PASS][31] -> [INCOMPLETE][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-soraka/igt@i915_selftest@live@workarounds.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-soraka/igt@i915_selftest@live@workarounds.html
    - fi-cml-u2:          [PASS][33] -> [INCOMPLETE][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cml-u2/igt@i915_selftest@live@workarounds.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cml-u2/igt@i915_selftest@live@workarounds.html

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@workarounds:
    - {fi-cml-drallion}:  [PASS][35] -> [INCOMPLETE][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cml-drallion/igt@i915_selftest@live@workarounds.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cml-drallion/igt@i915_selftest@live@workarounds.html
    - {fi-tgl-dsi}:       [PASS][37] -> [INCOMPLETE][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-tgl-dsi/igt@i915_selftest@live@workarounds.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-tgl-dsi/igt@i915_selftest@live@workarounds.html
    - {fi-rkl-11500t}:    [PASS][39] -> [INCOMPLETE][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-rkl-11500t/igt@i915_selftest@live@workarounds.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-rkl-11500t/igt@i915_selftest@live@workarounds.html
    - {fi-ehl-1}:         [PASS][41] -> [INCOMPLETE][42]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-ehl-1/igt@i915_selftest@live@workarounds.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-ehl-1/igt@i915_selftest@live@workarounds.html

Known issues
------------

  Here are the changes found in Patchwork_20206 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_chamelium@dp-crc-fast:
    - fi-kbl-7500u:       [PASS][43] -> [FAIL][44] ([i915#1372])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-7500u/igt@kms_chamelium@dp-crc-fast.html

#### Warnings ####

  * igt@runner@aborted:
    - fi-kbl-x1275:       [FAIL][45] ([i915#1436] / [i915#3363]) -> [FAIL][46] ([i915#1436] / [i915#2426] / [i915#3363])
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-x1275/igt@runner@aborted.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-x1275/igt@runner@aborted.html
    - fi-cfl-8700k:       [FAIL][47] ([i915#3363]) -> [FAIL][48] ([i915#2426] / [i915#3363])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cfl-8700k/igt@runner@aborted.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cfl-8700k/igt@runner@aborted.html
    - fi-skl-6600u:       [FAIL][49] ([i915#1436] / [i915#3363]) -> [FAIL][50] ([i915#1436] / [i915#2426] / [i915#3363])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-skl-6600u/igt@runner@aborted.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-skl-6600u/igt@runner@aborted.html
    - fi-cfl-8109u:       [FAIL][51] ([i915#3363]) -> [FAIL][52] ([i915#2426] / [i915#3363])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cfl-8109u/igt@runner@aborted.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cfl-8109u/igt@runner@aborted.html
    - fi-kbl-r:           [FAIL][53] ([i915#1436] / [i915#3363]) -> [FAIL][54] ([i915#1436] / [i915#2426] / [i915#3363])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-r/igt@runner@aborted.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-r/igt@runner@aborted.html
    - fi-kbl-7500u:       [FAIL][55] ([i915#1436] / [i915#3363]) -> [FAIL][56] ([i915#1436] / [i915#2426] / [i915#3363])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-7500u/igt@runner@aborted.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-7500u/igt@runner@aborted.html
    - fi-cfl-guc:         [FAIL][57] ([i915#3363]) -> [FAIL][58] ([i915#2426] / [i915#3363])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-cfl-guc/igt@runner@aborted.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-cfl-guc/igt@runner@aborted.html
    - fi-kbl-7567u:       [FAIL][59] ([i915#1436] / [i915#3363]) -> [FAIL][60] ([i915#1436] / [i915#2426] / [i915#3363])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-kbl-7567u/igt@runner@aborted.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-kbl-7567u/igt@runner@aborted.html
    - fi-skl-6700k2:      [FAIL][61] ([i915#1436] / [i915#3363]) -> [FAIL][62] ([i915#1436] / [i915#2426] / [i915#3363])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10138/fi-skl-6700k2/igt@runner@aborted.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/fi-skl-6700k2/igt@runner@aborted.html

  {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1222]: https://gitlab.freedesktop.org/drm/intel/issues/1222
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#2082]: https://gitlab.freedesktop.org/drm/intel/issues/2082
  [i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426
  [i915#2932]: https://gitlab.freedesktop.org/drm/intel/issues/2932
  [i915#2966]: https://gitlab.freedesktop.org/drm/intel/issues/2966
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#3462]: https://gitlab.freedesktop.org/drm/intel/issues/3462

Participating hosts (44 -> 41)
------------------------------

  Missing (3): fi-ilk-m540 fi-bdw-samus fi-hsw-4200u

Build changes
-------------

  * Linux: CI_DRM_10138 -> Patchwork_20206

  CI-20190529: 20190529
  CI_DRM_10138: 041f69e539b30565783cd1298842cc269f5005cb @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6094: f62d8953c0bc5ed68ea978662e62f9dbb46cf101 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_20206: a26743a5500b0ae723e62eb794f09db88abecf90 @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

a26743a5500b drm/i915/gem: Manage all set-domain waits explicitly
bbfac81d2ec6 drm/i915/selftests: Remove redundant set-to-gtt-domain before batch submission
cd286f3a5c04 drm/i915/selftests: Replace an unbounded set-domain wait with a timeout
9ac234560778 drm/i915/selftests: Replace unbound set-domain waits with explicit timeouts
b572875c1081 drm/i915/selftests: Remove redundant set-to-gtt-domain
4b1c2e4e65db drm/i915/selftests: Replace the unbounded set-domain with an explicit wait
78904915722c drm/i915/selftests: Use a coherent map to setup scratch batch buffers
4c11f7a7ba2a drm/i915/selftests: Set cache status for huge_gem_object
0381b2ec34da drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper
794fd65790d7 drm/i915: Lift marking a lock as used to utils
699c8e356b0c drm/i915: Remove notion of GEM from i915_gem_shrinker_taints_mutex
06eaaa77f454 drm/i915: Take rcu_read_lock for querying fence's driver/timeline names

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20206/index.html

[-- Attachment #1.2: Type: text/html, Size: 16057 bytes --]
end of thread, other threads:[~2021-05-27 13:33 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-26 14:14 [PATCH 00/12] Catchup with a few dropped patches Tvrtko Ursulin
2021-05-26 14:14 ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 01/12] drm/i915: Take rcu_read_lock for querying fence's driver/timeline names Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-27 10:46   ` Daniel Vetter
2021-05-27 10:46     ` Daniel Vetter
2021-05-26 14:14 ` [PATCH 02/12] drm/i915: Remove notion of GEM from i915_gem_shrinker_taints_mutex Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 03/12] drm/i915: Lift marking a lock as used to utils Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 04/12] drm/i915: Wrap cmpxchg64 with try_cmpxchg64() helper Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 05/12] drm/i915/selftests: Set cache status for huge_gem_object Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 06/12] drm/i915/selftests: Use a coherent map to setup scratch batch buffers Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 07/12] drm/i915/selftests: Replace the unbounded set-domain with an explicit wait Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 08/12] drm/i915/selftests: Remove redundant set-to-gtt-domain Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 09/12] drm/i915/selftests: Replace unbound set-domain waits with explicit timeouts Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 10/12] drm/i915/selftests: Replace an unbounded set-domain wait with a timeout Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 11/12] drm/i915/selftests: Remove redundant set-to-gtt-domain before batch submission Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:14 ` [PATCH 12/12] drm/i915/gem: Manage all set-domain waits explicitly Tvrtko Ursulin
2021-05-26 14:14   ` [Intel-gfx] " Tvrtko Ursulin
2021-05-26 14:30   ` Matthew Auld
2021-05-26 14:30     ` [Intel-gfx] " Matthew Auld
2021-05-26 14:37   ` [PATCH v2 " Tvrtko Ursulin
2021-05-26 14:37     ` [Intel-gfx] " Tvrtko Ursulin
2021-05-27 10:44   ` [PATCH " Daniel Vetter
2021-05-27 10:44     ` [Intel-gfx] " Daniel Vetter
2021-05-27 13:33   ` kernel test robot
2021-05-26 20:19 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Catchup with a few dropped patches (rev2) Patchwork
2021-05-26 20:21 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-05-26 21:00 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork