* [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
@ 2020-07-15 10:50 Chris Wilson
2020-07-15 12:06 ` Tvrtko Ursulin
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Chris Wilson @ 2020-07-15 10:50 UTC (permalink / raw)
To: intel-gfx; +Cc: Matthew Auld, Chris Wilson
Currently, we use i915_request_completed() directly in
i915_request_wait() and follow up with a manual invocation of
dma_fence_signal(). This appears to cause a large number of contentions
on i915_request.lock as when the process is woken up after the fence is
signaled by an interrupt, we will then try and call dma_fence_signal()
ourselves while the signaler is still holding the lock.
dma_fence_is_signaled() has the benefit of checking the
DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
avoids most of that contention.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_request.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0b2fe55e6194..bb4eb1a8780e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1640,7 +1640,7 @@ static bool busywait_stop(unsigned long timeout, unsigned int cpu)
return this_cpu != cpu;
}
-static bool __i915_spin_request(const struct i915_request * const rq, int state)
+static bool __i915_spin_request(struct i915_request * const rq, int state)
{
unsigned long timeout_ns;
unsigned int cpu;
@@ -1673,7 +1673,7 @@ static bool __i915_spin_request(const struct i915_request * const rq, int state)
timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
timeout_ns += local_clock_ns(&cpu);
do {
- if (i915_request_completed(rq))
+ if (dma_fence_is_signaled(&rq->fence))
return true;
if (signal_pending_state(state, current))
@@ -1766,10 +1766,8 @@ long i915_request_wait(struct i915_request *rq,
* duration, which we currently lack.
*/
if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
- __i915_spin_request(rq, state)) {
- dma_fence_signal(&rq->fence);
+ __i915_spin_request(rq, state))
goto out;
- }
/*
* This client is about to stall waiting for the GPU. In many cases
@@ -1796,10 +1794,8 @@ long i915_request_wait(struct i915_request *rq,
for (;;) {
set_current_state(state);
- if (i915_request_completed(rq)) {
- dma_fence_signal(&rq->fence);
+ if (dma_fence_is_signaled(&rq->fence))
break;
- }
intel_engine_flush_submission(rq->engine);
--
2.20.1
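The fast path the commit message relies on can be sketched as a small userspace model (names echo the kernel API but this is an illustrative sketch, not the upstream dma-fence code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified userspace model of the dma-fence fast path described above.
 * All names and fields are invented for illustration. */

#define SIGNALED_BIT 0u

struct model_fence {
	atomic_uint flags;      /* models dma_fence.flags */
	bool hw_completed;      /* models i915_request_completed() */
	int lock_acquisitions;  /* counts the contended-lock path, for the demo */
};

/* Models dma_fence_signal(): unconditionally takes the fence spinlock. */
static void model_signal(struct model_fence *f)
{
	f->lock_acquisitions++; /* spin_lock_irqsave(&f->lock, ...) upstream */
	atomic_fetch_or(&f->flags, 1u << SIGNALED_BIT);
}

/* Models dma_fence_is_signaled(): test the signaled bit first, so every
 * caller after the first signal returns without touching the lock. */
static bool model_is_signaled(struct model_fence *f)
{
	if (atomic_load(&f->flags) & (1u << SIGNALED_BIT))
		return true; /* fast path: no lock, no contention */
	if (f->hw_completed) {
		model_signal(f);
		return true;
	}
	return false;
}
```

In this model, repeated checks after completion take the lock exactly once; calling the signal path directly from every woken waiter, as the old wait loop did, would take it every time.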
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 10:50 [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait Chris Wilson
@ 2020-07-15 12:06 ` Tvrtko Ursulin
2020-07-15 12:26 ` Tvrtko Ursulin
2020-07-15 12:12 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for " Patchwork
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2020-07-15 12:06 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: Matthew Auld
On 15/07/2020 11:50, Chris Wilson wrote:
> Currently, we use i915_request_completed() directly in
> i915_request_wait() and follow up with a manual invocation of
> dma_fence_signal(). This appears to cause a large number of contentions
> on i915_request.lock as when the process is woken up after the fence is
> signaled by an interrupt, we will then try and call dma_fence_signal()
> ourselves while the signaler is still holding the lock.
> dma_fence_is_signaled() has the benefit of checking the
> DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
> avoids most of that contention.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_request.c | 12 ++++--------
> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 0b2fe55e6194..bb4eb1a8780e 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1640,7 +1640,7 @@ static bool busywait_stop(unsigned long timeout, unsigned int cpu)
> return this_cpu != cpu;
> }
>
> -static bool __i915_spin_request(const struct i915_request * const rq, int state)
> +static bool __i915_spin_request(struct i915_request * const rq, int state)
> {
> unsigned long timeout_ns;
> unsigned int cpu;
> @@ -1673,7 +1673,7 @@ static bool __i915_spin_request(const struct i915_request * const rq, int state)
> timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
> timeout_ns += local_clock_ns(&cpu);
> do {
> - if (i915_request_completed(rq))
> + if (dma_fence_is_signaled(&rq->fence))
> return true;
>
> if (signal_pending_state(state, current))
> @@ -1766,10 +1766,8 @@ long i915_request_wait(struct i915_request *rq,
> * duration, which we currently lack.
> */
> if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
> - __i915_spin_request(rq, state)) {
> - dma_fence_signal(&rq->fence);
> + __i915_spin_request(rq, state))
> goto out;
> - }
>
> /*
> * This client is about to stall waiting for the GPU. In many cases
> @@ -1796,10 +1794,8 @@ long i915_request_wait(struct i915_request *rq,
> for (;;) {
> set_current_state(state);
>
> - if (i915_request_completed(rq)) {
> - dma_fence_signal(&rq->fence);
> + if (dma_fence_is_signaled(&rq->fence))
> break;
> - }
>
> intel_engine_flush_submission(rq->engine);
>
>
In other words, this puts some latency back into the waiters, which is
probably okay, since sync waits are not our primary model.

I have a slight concern about the remaining value of busy spinning if
the i915_request_completed() check is removed from there as well. Of
course it doesn't make sense to have different completion criteria
between the two. We could spin a bit longer if the real check in the
busyspin said the request is actually completed, and just not signal it
ourselves but wait for the breadcrumbs to do it.
Regards,
Tvrtko
* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 10:50 [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait Chris Wilson
2020-07-15 12:06 ` Tvrtko Ursulin
@ 2020-07-15 12:12 ` Patchwork
2020-07-15 12:34 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-07-15 16:11 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
3 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2020-07-15 12:12 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Reduce i915_request.lock contention for i915_request_wait
URL : https://patchwork.freedesktop.org/series/79514/
State : warning
== Summary ==
$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
* Re: [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 12:06 ` Tvrtko Ursulin
@ 2020-07-15 12:26 ` Tvrtko Ursulin
2020-07-15 14:47 ` Chris Wilson
0 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2020-07-15 12:26 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: Matthew Auld
On 15/07/2020 13:06, Tvrtko Ursulin wrote:
>
> On 15/07/2020 11:50, Chris Wilson wrote:
>> Currently, we use i915_request_completed() directly in
>> i915_request_wait() and follow up with a manual invocation of
>> dma_fence_signal(). This appears to cause a large number of contentions
>> on i915_request.lock as when the process is woken up after the fence is
>> signaled by an interrupt, we will then try and call dma_fence_signal()
>> ourselves while the signaler is still holding the lock.
>> dma_fence_is_signaled() has the benefit of checking the
>> DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
>> avoids most of that contention.
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Matthew Auld <matthew.auld@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_request.c | 12 ++++--------
>> 1 file changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_request.c
>> b/drivers/gpu/drm/i915/i915_request.c
>> index 0b2fe55e6194..bb4eb1a8780e 100644
>> --- a/drivers/gpu/drm/i915/i915_request.c
>> +++ b/drivers/gpu/drm/i915/i915_request.c
>> @@ -1640,7 +1640,7 @@ static bool busywait_stop(unsigned long timeout,
>> unsigned int cpu)
>> return this_cpu != cpu;
>> }
>> -static bool __i915_spin_request(const struct i915_request * const rq,
>> int state)
>> +static bool __i915_spin_request(struct i915_request * const rq, int
>> state)
>> {
>> unsigned long timeout_ns;
>> unsigned int cpu;
>> @@ -1673,7 +1673,7 @@ static bool __i915_spin_request(const struct
>> i915_request * const rq, int state)
>> timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
>> timeout_ns += local_clock_ns(&cpu);
>> do {
>> - if (i915_request_completed(rq))
>> + if (dma_fence_is_signaled(&rq->fence))
>> return true;
>> if (signal_pending_state(state, current))
>> @@ -1766,10 +1766,8 @@ long i915_request_wait(struct i915_request *rq,
>> * duration, which we currently lack.
>> */
>> if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
>> - __i915_spin_request(rq, state)) {
>> - dma_fence_signal(&rq->fence);
>> + __i915_spin_request(rq, state))
>> goto out;
>> - }
>> /*
>> * This client is about to stall waiting for the GPU. In many cases
>> @@ -1796,10 +1794,8 @@ long i915_request_wait(struct i915_request *rq,
>> for (;;) {
>> set_current_state(state);
>> - if (i915_request_completed(rq)) {
>> - dma_fence_signal(&rq->fence);
>> + if (dma_fence_is_signaled(&rq->fence))
>> break;
>> - }
>> intel_engine_flush_submission(rq->engine);
>>
>
> In other words, this puts some latency back into the waiters, which is
> probably okay, since sync waits are not our primary model.
>
> I have a slight concern about the remaining value of busy spinning if
> the i915_request_completed() check is removed from there as well. Of
> course it doesn't make sense to have different completion criteria
> between the two. We could spin a bit longer if the real check in the
> busyspin said the request is actually completed, and just not signal it
> ourselves but wait for the breadcrumbs to do it.
What a load of nonsense.. :)
Okay, I think the only real question is i915_request_completed() vs
dma_fence_is_signaled() in __i915_spin_request(). Do we want to burn
CPU cycles waiting on the GPU and breadcrumb irq work, or just the GPU?
Regards,
Tvrtko
* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 10:50 [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait Chris Wilson
2020-07-15 12:06 ` Tvrtko Ursulin
2020-07-15 12:12 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for " Patchwork
@ 2020-07-15 12:34 ` Patchwork
2020-07-15 16:11 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
3 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2020-07-15 12:34 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Reduce i915_request.lock contention for i915_request_wait
URL : https://patchwork.freedesktop.org/series/79514/
State : success
== Summary ==
CI Bug Log - changes from CI_DRM_8749 -> Patchwork_18176
====================================================
Summary
-------
**SUCCESS**
No regressions found.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/index.html
Known issues
------------
Here are the changes found in Patchwork_18176 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_exec_suspend@basic-s0:
- fi-tgl-u2: [PASS][1] -> [FAIL][2] ([i915#1888])
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html
* igt@i915_pm_rpm@module-reload:
- fi-kbl-guc: [PASS][3] -> [SKIP][4] ([fdo#109271])
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-kbl-guc/igt@i915_pm_rpm@module-reload.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-kbl-guc/igt@i915_pm_rpm@module-reload.html
* igt@i915_selftest@live@gt_contexts:
- fi-snb-2520m: [PASS][5] -> [DMESG-FAIL][6] ([i915#541])
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-snb-2520m/igt@i915_selftest@live@gt_contexts.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-snb-2520m/igt@i915_selftest@live@gt_contexts.html
* igt@kms_force_connector_basic@force-connector-state:
- fi-tgl-y: [PASS][7] -> [DMESG-WARN][8] ([i915#1982]) +1 similar issue
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-tgl-y/igt@kms_force_connector_basic@force-connector-state.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-tgl-y/igt@kms_force_connector_basic@force-connector-state.html
* igt@prime_vgem@basic-write:
- fi-tgl-y: [PASS][9] -> [DMESG-WARN][10] ([i915#402]) +1 similar issue
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-tgl-y/igt@prime_vgem@basic-write.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-tgl-y/igt@prime_vgem@basic-write.html
#### Possible fixes ####
* igt@i915_pm_rpm@module-reload:
- fi-byt-j1900: [DMESG-WARN][11] ([i915#1982]) -> [PASS][12]
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-byt-j1900/igt@i915_pm_rpm@module-reload.html
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-byt-j1900/igt@i915_pm_rpm@module-reload.html
* igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
- fi-bsw-n3050: [DMESG-WARN][13] ([i915#1982]) -> [PASS][14]
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-bsw-n3050/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-bsw-n3050/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
- fi-bsw-kefka: [DMESG-WARN][15] ([i915#1982]) -> [PASS][16]
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
* igt@kms_flip@basic-plain-flip@d-dsi1:
- {fi-tgl-dsi}: [DMESG-WARN][17] ([i915#1982]) -> [PASS][18]
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-tgl-dsi/igt@kms_flip@basic-plain-flip@d-dsi1.html
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-tgl-dsi/igt@kms_flip@basic-plain-flip@d-dsi1.html
* igt@vgem_basic@create:
- fi-tgl-y: [DMESG-WARN][19] ([i915#402]) -> [PASS][20]
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-tgl-y/igt@vgem_basic@create.html
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-tgl-y/igt@vgem_basic@create.html
#### Warnings ####
* igt@kms_cursor_legacy@basic-flip-before-cursor-varying-size:
- fi-kbl-x1275: [DMESG-WARN][21] ([i915#62] / [i915#92]) -> [DMESG-WARN][22] ([i915#62] / [i915#92] / [i915#95]) +5 similar issues
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-before-cursor-varying-size.html
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-before-cursor-varying-size.html
* igt@kms_force_connector_basic@force-edid:
- fi-kbl-x1275: [DMESG-WARN][23] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][24] ([i915#62] / [i915#92]) +1 similar issue
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
[i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
[i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
[i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
[i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541
[i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
[i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
[i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95
Participating hosts (47 -> 40)
------------------------------
Missing (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus
Build changes
-------------
* Linux: CI_DRM_8749 -> Patchwork_18176
CI-20190529: 20190529
CI_DRM_8749: c704cccd64e81bb9a8d23c414fdcafff65697801 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5735: 21f8204e54c122e4a0f8ca4b59e4b2db8d1ba687 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_18176: e688f32314fbdcc1be26177cb1fb29f3be801120 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
e688f32314fb drm/i915: Reduce i915_request.lock contention for i915_request_wait
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/index.html
* Re: [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 12:26 ` Tvrtko Ursulin
@ 2020-07-15 14:47 ` Chris Wilson
2020-07-15 14:47 ` Chris Wilson
0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2020-07-15 14:47 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx; +Cc: Matthew Auld
Quoting Tvrtko Ursulin (2020-07-15 13:26:23)
>
> On 15/07/2020 13:06, Tvrtko Ursulin wrote:
> >
> > On 15/07/2020 11:50, Chris Wilson wrote:
> >> Currently, we use i915_request_completed() directly in
> >> i915_request_wait() and follow up with a manual invocation of
> >> dma_fence_signal(). This appears to cause a large number of contentions
> >> on i915_request.lock as when the process is woken up after the fence is
> >> signaled by an interrupt, we will then try and call dma_fence_signal()
> >> ourselves while the signaler is still holding the lock.
> >> dma_fence_is_signaled() has the benefit of checking the
> >> DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
> >> avoids most of that contention.
> >>
> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Matthew Auld <matthew.auld@intel.com>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >> drivers/gpu/drm/i915/i915_request.c | 12 ++++--------
> >> 1 file changed, 4 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_request.c
> >> b/drivers/gpu/drm/i915/i915_request.c
> >> index 0b2fe55e6194..bb4eb1a8780e 100644
> >> --- a/drivers/gpu/drm/i915/i915_request.c
> >> +++ b/drivers/gpu/drm/i915/i915_request.c
> >> @@ -1640,7 +1640,7 @@ static bool busywait_stop(unsigned long timeout,
> >> unsigned int cpu)
> >> return this_cpu != cpu;
> >> }
> >> -static bool __i915_spin_request(const struct i915_request * const rq,
> >> int state)
> >> +static bool __i915_spin_request(struct i915_request * const rq, int
> >> state)
> >> {
> >> unsigned long timeout_ns;
> >> unsigned int cpu;
> >> @@ -1673,7 +1673,7 @@ static bool __i915_spin_request(const struct
> >> i915_request * const rq, int state)
> >> timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
> >> timeout_ns += local_clock_ns(&cpu);
> >> do {
> >> - if (i915_request_completed(rq))
> >> + if (dma_fence_is_signaled(&rq->fence))
> >> return true;
> >> if (signal_pending_state(state, current))
> >> @@ -1766,10 +1766,8 @@ long i915_request_wait(struct i915_request *rq,
> >> * duration, which we currently lack.
> >> */
> >> if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
> >> - __i915_spin_request(rq, state)) {
> >> - dma_fence_signal(&rq->fence);
> >> + __i915_spin_request(rq, state))
> >> goto out;
> >> - }
> >> /*
> >> * This client is about to stall waiting for the GPU. In many cases
> >> @@ -1796,10 +1794,8 @@ long i915_request_wait(struct i915_request *rq,
> >> for (;;) {
> >> set_current_state(state);
> >> - if (i915_request_completed(rq)) {
> >> - dma_fence_signal(&rq->fence);
> >> + if (dma_fence_is_signaled(&rq->fence))
> >> break;
> >> - }
> >> intel_engine_flush_submission(rq->engine);
> >>
> >
> > In other words, this puts some latency back into the waiters, which is
> > probably okay, since sync waits are not our primary model.
> >
> > I have a slight concern about the remaining value of busy spinning if
> > the i915_request_completed() check is removed from there as well. Of
> > course it doesn't make sense to have different completion criteria
> > between the two. We could spin a bit longer if the real check in the
> > busyspin said the request is actually completed, and just not signal it
> > ourselves but wait for the breadcrumbs to do it.
>
> What a load of nonsense.. :)
>
> > Okay, I think the only real question is i915_request_completed() vs
> > dma_fence_is_signaled() in __i915_spin_request(). Do we want to burn
> > CPU cycles waiting on the GPU and breadcrumb irq work, or just the GPU?
dma_fence_is_signaled() {
if (test_bit(SIGNALED_BIT))
return true;
if (i915_request_completed()) {
dma_fence_signal();
return true;
}
return false;
}
with the indirection. So the question is whether the indirection is
worth the extra test_bit(). Purely looking at the i915_request.lock
contention suggests that it probably is. For the spinner, burning a few
extra CPU cycles on the *vfunc is not an issue; it's the wakeup latency
that matters, and since we currently call dma_fence_signal() upon wakeup,
we do take the spinlock without checking for an early return from the
SIGNALED_BIT. So I think it's a net positive. The alternative was to write
if (i915_request_completed()) {
if (!i915_request_is_signaled())
dma_fence_signal();
break;
}
but
if (dma_fence_is_signaled())
break;
does appear simpler, if only by virtue of hiding the details in an
inline.
-Chris
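The two spellings compared above are equivalent with respect to the lock, which a userspace sketch makes concrete (all names are invented for illustration, not the kernel code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model contrasting the open-coded check with the
 * dma_fence_is_signaled()-style wrapper discussed above. */

#define SIGNALED 1u

struct model_rq {
	atomic_uint fence_flags; /* models rq->fence.flags */
	bool completed;          /* models i915_request_completed(rq) */
	int signal_entries;      /* times the lock-taking signal path ran */
};

static void model_dma_fence_signal(struct model_rq *rq)
{
	rq->signal_entries++;    /* takes i915_request.lock upstream */
	atomic_fetch_or(&rq->fence_flags, SIGNALED);
}

/* The open-coded variant: completed first, then signal only if the
 * bit is not already set. */
static bool open_coded(struct model_rq *rq)
{
	if (rq->completed) {
		if (!(atomic_load(&rq->fence_flags) & SIGNALED))
			model_dma_fence_signal(rq);
		return true;
	}
	return false;
}

/* The wrapper variant: bit first, then completed + signal. */
static bool wrapper(struct model_rq *rq)
{
	if (atomic_load(&rq->fence_flags) & SIGNALED)
		return true;
	if (rq->completed) {
		model_dma_fence_signal(rq);
		return true;
	}
	return false;
}
```

Both variants enter the lock-taking signal path exactly once per request, so the choice between them really is, as the mail says, about which one reads more simply.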
* Re: [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 14:47 ` Chris Wilson
@ 2020-07-15 14:47 ` Chris Wilson
2020-07-16 8:41 ` Tvrtko Ursulin
0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2020-07-15 14:47 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx; +Cc: Matthew Auld
Quoting Chris Wilson (2020-07-15 15:47:15)
> Quoting Tvrtko Ursulin (2020-07-15 13:26:23)
> >
> > On 15/07/2020 13:06, Tvrtko Ursulin wrote:
> > >
> > > On 15/07/2020 11:50, Chris Wilson wrote:
> > >> Currently, we use i915_request_completed() directly in
> > >> i915_request_wait() and follow up with a manual invocation of
> > >> dma_fence_signal(). This appears to cause a large number of contentions
> > >> on i915_request.lock as when the process is woken up after the fence is
> > >> signaled by an interrupt, we will then try and call dma_fence_signal()
> > >> ourselves while the signaler is still holding the lock.
> > >> dma_fence_is_signaled() has the benefit of checking the
> > >> DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
> > >> avoids most of that contention.
> > >>
> > >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > >> Cc: Matthew Auld <matthew.auld@intel.com>
> > >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >> ---
> > >> drivers/gpu/drm/i915/i915_request.c | 12 ++++--------
> > >> 1 file changed, 4 insertions(+), 8 deletions(-)
> > >>
> > >> diff --git a/drivers/gpu/drm/i915/i915_request.c
> > >> b/drivers/gpu/drm/i915/i915_request.c
> > >> index 0b2fe55e6194..bb4eb1a8780e 100644
> > >> --- a/drivers/gpu/drm/i915/i915_request.c
> > >> +++ b/drivers/gpu/drm/i915/i915_request.c
> > >> @@ -1640,7 +1640,7 @@ static bool busywait_stop(unsigned long timeout,
> > >> unsigned int cpu)
> > >> return this_cpu != cpu;
> > >> }
> > >> -static bool __i915_spin_request(const struct i915_request * const rq,
> > >> int state)
> > >> +static bool __i915_spin_request(struct i915_request * const rq, int
> > >> state)
> > >> {
> > >> unsigned long timeout_ns;
> > >> unsigned int cpu;
> > >> @@ -1673,7 +1673,7 @@ static bool __i915_spin_request(const struct
> > >> i915_request * const rq, int state)
> > >> timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
> > >> timeout_ns += local_clock_ns(&cpu);
> > >> do {
> > >> - if (i915_request_completed(rq))
> > >> + if (dma_fence_is_signaled(&rq->fence))
> > >> return true;
> > >> if (signal_pending_state(state, current))
> > >> @@ -1766,10 +1766,8 @@ long i915_request_wait(struct i915_request *rq,
> > >> * duration, which we currently lack.
> > >> */
> > >> if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
> > >> - __i915_spin_request(rq, state)) {
> > >> - dma_fence_signal(&rq->fence);
> > >> + __i915_spin_request(rq, state))
> > >> goto out;
> > >> - }
> > >> /*
> > >> * This client is about to stall waiting for the GPU. In many cases
> > >> @@ -1796,10 +1794,8 @@ long i915_request_wait(struct i915_request *rq,
> > >> for (;;) {
> > >> set_current_state(state);
> > >> - if (i915_request_completed(rq)) {
> > >> - dma_fence_signal(&rq->fence);
> > >> + if (dma_fence_is_signaled(&rq->fence))
> > >> break;
> > >> - }
> > >> intel_engine_flush_submission(rq->engine);
> > >>
> > >
> > > In other words, this puts some latency back into the waiters, which is
> > > probably okay, since sync waits are not our primary model.
> > >
> > > I have a slight concern about the remaining value of busy spinning if
> > > the i915_request_completed() check is removed from there as well. Of
> > > course it doesn't make sense to have different completion criteria
> > > between the two. We could spin a bit longer if the real check in the
> > > busyspin said the request is actually completed, and just not signal it
> > > ourselves but wait for the breadcrumbs to do it.
> >
> > What a load of nonsense.. :)
> >
> > Okay, I think the only real question is i915_request_completed() vs
> > dma_fence_is_signaled() in __i915_spin_request(). Do we want to burn
> > CPU cycles waiting on the GPU and breadcrumb irq work, or just the GPU?
>
> dma_fence_is_signaled() {
> if (test_bit(SIGNALED_BIT))
> return true;
>
> if (i915_request_completed()) {
> dma_fence_signal();
> return true;
> }
>
> return false;
> }
>
> with the indirection. So the question is whether the indirection is
> worth the extra test_bit(). Purely looking at the i915_request.lock
> contention suggests that it probably is. For the spinner, burning a few
> extra CPU cycles on the *vfunc is not an issue; it's the wakeup latency
> that matters, and since we currently call dma_fence_signal() upon wakeup,
> we do take the spinlock without checking for an early return from the
> SIGNALED_BIT. So I think it's a net positive. The alternative was to write
>
> if (i915_request_completed()) {
> if (!i915_request_is_signaled())
> dma_fence_signal();
> break;
> }
>
> but
>
> if (dma_fence_is_signaled())
> break;
>
> does appear simpler, if only by virtue of hiding the details in an
> inline.
Or a patch to add a test_bit() early-exit to dma_fence_signal() itself,
but this looked better.
-Chris
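The alternative mentioned here, pushing the early test into the signal function itself, might look like this (a hypothetical userspace sketch with invented names and return values, not an actual upstream patch):

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical sketch of an early exit inside the signal function itself,
 * so callers would not need a wrapper; purely illustrative. */

#define SIGNALED_BIT 0u

struct sketch_fence {
	atomic_uint flags;
	int lock_taken; /* counts spinlock acquisitions in this model */
};

static int sketch_fence_signal(struct sketch_fence *f)
{
	/* the proposed test_bit() before taking the lock */
	if (atomic_load(&f->flags) & (1u << SIGNALED_BIT))
		return -1; /* already signaled: skip the lock entirely */
	f->lock_taken++; /* spin_lock_irqsave(&f->lock, ...) upstream */
	atomic_fetch_or(&f->flags, 1u << SIGNALED_BIT);
	return 0;
}
```

With this shape, a second signal attempt never touches the lock, which would give the wait loop the same contention relief without wrapping the check at every call site.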
* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 10:50 [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait Chris Wilson
` (2 preceding siblings ...)
2020-07-15 12:34 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2020-07-15 16:11 ` Patchwork
3 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2020-07-15 16:11 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Reduce i915_request.lock contention for i915_request_wait
URL : https://patchwork.freedesktop.org/series/79514/
State : success
== Summary ==
CI Bug Log - changes from CI_DRM_8749_full -> Patchwork_18176_full
====================================================
Summary
-------
**SUCCESS**
No regressions found.
Known issues
------------
Here are the changes found in Patchwork_18176_full that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_exec_balancer@bonded-early:
- shard-kbl: [PASS][1] -> [FAIL][2] ([i915#2079])
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-kbl3/igt@gem_exec_balancer@bonded-early.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-kbl7/igt@gem_exec_balancer@bonded-early.html
* igt@gem_exec_whisper@basic-queues-priority-all:
- shard-glk: [PASS][3] -> [DMESG-WARN][4] ([i915#118] / [i915#95])
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-glk7/igt@gem_exec_whisper@basic-queues-priority-all.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-glk5/igt@gem_exec_whisper@basic-queues-priority-all.html
* igt@gem_softpin@evict-active:
- shard-tglb: [PASS][5] -> [DMESG-WARN][6] ([i915#402]) +2 similar issues
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-tglb5/igt@gem_softpin@evict-active.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-tglb8/igt@gem_softpin@evict-active.html
* igt@kms_big_fb@y-tiled-64bpp-rotate-180:
- shard-glk: [PASS][7] -> [DMESG-FAIL][8] ([i915#118] / [i915#95])
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-glk9/igt@kms_big_fb@y-tiled-64bpp-rotate-180.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-glk8/igt@kms_big_fb@y-tiled-64bpp-rotate-180.html
* igt@kms_cursor_crc@pipe-b-cursor-64x64-sliding:
- shard-skl: [PASS][9] -> [FAIL][10] ([i915#54])
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl8/igt@kms_cursor_crc@pipe-b-cursor-64x64-sliding.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl1/igt@kms_cursor_crc@pipe-b-cursor-64x64-sliding.html
* igt@kms_flip@2x-flip-vs-expired-vblank@ab-hdmi-a1-hdmi-a2:
- shard-glk: [PASS][11] -> [FAIL][12] ([i915#79])
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-glk1/igt@kms_flip@2x-flip-vs-expired-vblank@ab-hdmi-a1-hdmi-a2.html
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-glk7/igt@kms_flip@2x-flip-vs-expired-vblank@ab-hdmi-a1-hdmi-a2.html
* igt@kms_flip@basic-plain-flip@a-edp1:
- shard-skl: [PASS][13] -> [DMESG-WARN][14] ([i915#1982]) +11 similar issues
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl7/igt@kms_flip@basic-plain-flip@a-edp1.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl4/igt@kms_flip@basic-plain-flip@a-edp1.html
* igt@kms_flip@flip-vs-suspend-interruptible@c-dp1:
- shard-kbl: [PASS][15] -> [DMESG-WARN][16] ([i915#165])
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-kbl2/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-kbl6/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html
* igt@kms_frontbuffer_tracking@fbcpsr-stridechange:
- shard-tglb: [PASS][17] -> [DMESG-WARN][18] ([i915#1982]) +2 similar issues
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-tglb7/igt@kms_frontbuffer_tracking@fbcpsr-stridechange.html
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-tglb2/igt@kms_frontbuffer_tracking@fbcpsr-stridechange.html
* igt@kms_hdr@bpc-switch-suspend:
- shard-kbl: [PASS][19] -> [DMESG-WARN][20] ([i915#180]) +3 similar issues
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-kbl3/igt@kms_hdr@bpc-switch-suspend.html
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-kbl7/igt@kms_hdr@bpc-switch-suspend.html
* igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:
- shard-skl: [PASS][21] -> [FAIL][22] ([fdo#108145] / [i915#265]) +1 similar issue
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl5/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl9/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html
* igt@kms_psr@psr2_primary_page_flip:
- shard-iclb: [PASS][23] -> [SKIP][24] ([fdo#109441])
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-iclb2/igt@kms_psr@psr2_primary_page_flip.html
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-iclb3/igt@kms_psr@psr2_primary_page_flip.html
* igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend:
- shard-skl: [PASS][25] -> [INCOMPLETE][26] ([i915#69])
[25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl1/igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend.html
[26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl6/igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend.html
* igt@prime_self_import@reimport-vs-gem_close-race:
- shard-iclb: [PASS][27] -> [DMESG-WARN][28] ([i915#1982])
[27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-iclb4/igt@prime_self_import@reimport-vs-gem_close-race.html
[28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-iclb2/igt@prime_self_import@reimport-vs-gem_close-race.html
#### Possible fixes ####
* igt@gem_exec_reloc@basic-concurrent0:
- shard-glk: [FAIL][29] ([i915#1930]) -> [PASS][30]
[29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-glk8/igt@gem_exec_reloc@basic-concurrent0.html
[30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-glk6/igt@gem_exec_reloc@basic-concurrent0.html
* igt@gem_exec_whisper@basic-queues-priority:
- shard-glk: [DMESG-WARN][31] ([i915#118] / [i915#95]) -> [PASS][32]
[31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-glk2/igt@gem_exec_whisper@basic-queues-priority.html
[32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-glk3/igt@gem_exec_whisper@basic-queues-priority.html
* igt@gen9_exec_parse@allowed-all:
- shard-apl: [DMESG-WARN][33] ([i915#1436] / [i915#1635] / [i915#716]) -> [PASS][34]
[33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-apl1/igt@gen9_exec_parse@allowed-all.html
[34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-apl8/igt@gen9_exec_parse@allowed-all.html
* igt@i915_suspend@fence-restore-tiled2untiled:
- shard-skl: [INCOMPLETE][35] ([i915#69]) -> [PASS][36]
[35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl8/igt@i915_suspend@fence-restore-tiled2untiled.html
[36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl3/igt@i915_suspend@fence-restore-tiled2untiled.html
* igt@kms_big_fb@y-tiled-64bpp-rotate-0:
- shard-glk: [DMESG-FAIL][37] ([i915#118] / [i915#95]) -> [PASS][38]
[37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-glk8/igt@kms_big_fb@y-tiled-64bpp-rotate-0.html
[38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-glk6/igt@kms_big_fb@y-tiled-64bpp-rotate-0.html
* igt@kms_cursor_crc@pipe-a-cursor-128x128-onscreen:
- shard-skl: [FAIL][39] ([i915#54]) -> [PASS][40] +1 similar issue
[39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl7/igt@kms_cursor_crc@pipe-a-cursor-128x128-onscreen.html
[40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl4/igt@kms_cursor_crc@pipe-a-cursor-128x128-onscreen.html
* igt@kms_draw_crc@draw-method-xrgb2101010-mmap-gtt-ytiled:
- shard-apl: [DMESG-WARN][41] ([i915#1635] / [i915#1982]) -> [PASS][42] +1 similar issue
[41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-apl2/igt@kms_draw_crc@draw-method-xrgb2101010-mmap-gtt-ytiled.html
[42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-apl1/igt@kms_draw_crc@draw-method-xrgb2101010-mmap-gtt-ytiled.html
* igt@kms_flip@2x-flip-vs-wf_vblank-interruptible@ab-vga1-hdmi-a1:
- shard-hsw: [DMESG-WARN][43] ([i915#1982]) -> [PASS][44]
[43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-hsw6/igt@kms_flip@2x-flip-vs-wf_vblank-interruptible@ab-vga1-hdmi-a1.html
[44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-hsw6/igt@kms_flip@2x-flip-vs-wf_vblank-interruptible@ab-vga1-hdmi-a1.html
* igt@kms_flip@flip-vs-panning-interruptible@a-edp1:
- shard-skl: [DMESG-WARN][45] ([i915#1982]) -> [PASS][46] +4 similar issues
[45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl10/igt@kms_flip@flip-vs-panning-interruptible@a-edp1.html
[46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl7/igt@kms_flip@flip-vs-panning-interruptible@a-edp1.html
* igt@kms_flip@flip-vs-suspend-interruptible@a-dp1:
- shard-kbl: [DMESG-WARN][47] ([i915#180]) -> [PASS][48] +2 similar issues
[47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-kbl2/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html
[48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-kbl6/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html
* igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-mmap-cpu:
- shard-tglb: [DMESG-WARN][49] ([i915#1982]) -> [PASS][50] +2 similar issues
[49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-tglb2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-mmap-cpu.html
[50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-tglb8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-mmap-cpu.html
* igt@kms_hdr@bpc-switch-suspend:
- shard-skl: [FAIL][51] ([i915#1188]) -> [PASS][52]
[51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl6/igt@kms_hdr@bpc-switch-suspend.html
[52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl6/igt@kms_hdr@bpc-switch-suspend.html
* igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
- shard-skl: [FAIL][53] ([fdo#108145] / [i915#265]) -> [PASS][54]
[53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-skl7/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
[54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-skl4/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
* igt@kms_psr@psr2_suspend:
- shard-iclb: [SKIP][55] ([fdo#109441]) -> [PASS][56] +2 similar issues
[55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-iclb4/igt@kms_psr@psr2_suspend.html
[56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-iclb2/igt@kms_psr@psr2_suspend.html
#### Warnings ####
* igt@i915_module_load@reload:
- shard-tglb: [DMESG-WARN][57] ([i915#1982]) -> [DMESG-WARN][58] ([i915#402])
[57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-tglb3/igt@i915_module_load@reload.html
[58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-tglb5/igt@i915_module_load@reload.html
* igt@i915_pm_dc@dc3co-vpb-simulation:
- shard-iclb: [SKIP][59] ([i915#658]) -> [SKIP][60] ([i915#588])
[59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-iclb4/igt@i915_pm_dc@dc3co-vpb-simulation.html
[60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-iclb2/igt@i915_pm_dc@dc3co-vpb-simulation.html
* igt@kms_cursor_crc@pipe-c-cursor-suspend:
- shard-kbl: [DMESG-WARN][61] ([i915#180]) -> [INCOMPLETE][62] ([i915#155])
[61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8749/shard-kbl7/igt@kms_cursor_crc@pipe-c-cursor-suspend.html
[62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/shard-kbl6/igt@kms_cursor_crc@pipe-c-cursor-suspend.html
[fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
[fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
[i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
[i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
[i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
[i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
[i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
[i915#165]: https://gitlab.freedesktop.org/drm/intel/issues/165
[i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
[i915#1930]: https://gitlab.freedesktop.org/drm/intel/issues/1930
[i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
[i915#2079]: https://gitlab.freedesktop.org/drm/intel/issues/2079
[i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
[i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
[i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
[i915#588]: https://gitlab.freedesktop.org/drm/intel/issues/588
[i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
[i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
[i915#716]: https://gitlab.freedesktop.org/drm/intel/issues/716
[i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
[i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95
Participating hosts (10 -> 10)
------------------------------
No changes in participating hosts
Build changes
-------------
* Linux: CI_DRM_8749 -> Patchwork_18176
CI-20190529: 20190529
CI_DRM_8749: c704cccd64e81bb9a8d23c414fdcafff65697801 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5735: 21f8204e54c122e4a0f8ca4b59e4b2db8d1ba687 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_18176: e688f32314fbdcc1be26177cb1fb29f3be801120 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18176/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-15 14:47 ` Chris Wilson
@ 2020-07-16 8:41 ` Tvrtko Ursulin
2020-07-16 8:47 ` Chris Wilson
0 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2020-07-16 8:41 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: Matthew Auld
On 15/07/2020 15:47, Chris Wilson wrote:
> Quoting Chris Wilson (2020-07-15 15:47:15)
>> Quoting Tvrtko Ursulin (2020-07-15 13:26:23)
>>>
>>> On 15/07/2020 13:06, Tvrtko Ursulin wrote:
>>>>
>>>> On 15/07/2020 11:50, Chris Wilson wrote:
>>>>> Currently, we use i915_request_completed() directly in
>>>>> i915_request_wait() and follow up with a manual invocation of
>>>>> dma_fence_signal(). This appears to cause a large number of contentions
>>>>> on i915_request.lock as when the process is woken up after the fence is
>>>>> signaled by an interrupt, we will then try and call dma_fence_signal()
>>>>> ourselves while the signaler is still holding the lock.
>>>>> dma_fence_is_signaled() has the benefit of checking the
>>>>> DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
>>>>> avoids most of that contention.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/i915_request.c | 12 ++++--------
>>>>> 1 file changed, 4 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>>> index 0b2fe55e6194..bb4eb1a8780e 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>> @@ -1640,7 +1640,7 @@ static bool busywait_stop(unsigned long timeout, unsigned int cpu)
>>>>>  	return this_cpu != cpu;
>>>>>  }
>>>>>
>>>>> -static bool __i915_spin_request(const struct i915_request * const rq, int state)
>>>>> +static bool __i915_spin_request(struct i915_request * const rq, int state)
>>>>>  {
>>>>>  	unsigned long timeout_ns;
>>>>>  	unsigned int cpu;
>>>>> @@ -1673,7 +1673,7 @@ static bool __i915_spin_request(const struct i915_request * const rq, int state)
>>>>>  	timeout_ns = READ_ONCE(rq->engine->props.max_busywait_duration_ns);
>>>>>  	timeout_ns += local_clock_ns(&cpu);
>>>>>  	do {
>>>>> -		if (i915_request_completed(rq))
>>>>> +		if (dma_fence_is_signaled(&rq->fence))
>>>>>  			return true;
>>>>>
>>>>>  		if (signal_pending_state(state, current))
>>>>> @@ -1766,10 +1766,8 @@ long i915_request_wait(struct i915_request *rq,
>>>>>  	 * duration, which we currently lack.
>>>>>  	 */
>>>>>  	if (IS_ACTIVE(CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT) &&
>>>>> -	    __i915_spin_request(rq, state)) {
>>>>> -		dma_fence_signal(&rq->fence);
>>>>> +	    __i915_spin_request(rq, state))
>>>>>  		goto out;
>>>>> -	}
>>>>>
>>>>>  	/*
>>>>>  	 * This client is about to stall waiting for the GPU. In many cases
>>>>> @@ -1796,10 +1794,8 @@ long i915_request_wait(struct i915_request *rq,
>>>>>  	for (;;) {
>>>>>  		set_current_state(state);
>>>>>
>>>>> -		if (i915_request_completed(rq)) {
>>>>> -			dma_fence_signal(&rq->fence);
>>>>> +		if (dma_fence_is_signaled(&rq->fence))
>>>>>  			break;
>>>>> -		}
>>>>>
>>>>>  		intel_engine_flush_submission(rq->engine);
>>>>
>>>> In other words putting some latency back into the waiters, which is
>>>> probably okay, since sync waits are not our primary model.
>>>>
>>>> I have a slight concern about the remaining value of busy spinning if
>>>> the i915_request_completed check is removed from there as well. Of
>>>> course it doesn't make sense to have different completion criteria
>>>> between the two. We could wait a bit longer if the real check in the
>>>> busyspin said the request is actually completed, just not signal it
>>>> ourselves but wait for the breadcrumbs to do it.
>>>
>>> What a load of nonsense.. :)
>>>
>>> Okay, I think the only real question is i915_request_completed vs
>>> dma_fence_is_signaled in __i915_spin_request. Do we want to burn CPU
>>> cycles waiting on the GPU and breadcrumb irq work, or just the GPU?
>>
>> dma_fence_is_signaled() {
>> 	if (test_bit(SIGNALED_BIT))
>> 		return true;
>>
>> 	if (i915_request_completed()) {
>> 		dma_fence_signal();
>> 		return true;
>> 	}
>>
>> 	return false;
>> }
>>
>> with the indirection. So the question is whether the indirection is
>> worth the extra test bit. Just purely looking at the i915_request.lock
>> contention suggests that it probably is. For the spinner, burning a few
>> extra CPU cycles for *vfunc is not an issue, it's the wakeup latency,
>> and since we are calling dma_fence_signal() upon wakeup we do take the
>> spinlock without checking for an early return from the SIGNALED_BIT.
>> So I think it's a net positive. The alternative was to write
>>
>> 	if (i915_request_completed()) {
>> 		if (!i915_request_is_signaled())
>> 			dma_fence_signal();
>> 		break;
>> 	}
>>
>> but
>>
>> 	if (dma_fence_is_signaled())
>> 		break;
>>
>> does appear simpler, if only by virtue of hiding the details in an
>> inline.
>
> Or a patch to add test_bit() to dma_fence_signal, but this looked
> better.
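[The contention-avoidance pattern described above, testing an "already signaled" bit before taking the lock that signalling requires, can be modelled outside the kernel. A minimal userspace sketch; mock_fence and its helpers are hypothetical stand-ins, not the real dma_fence API:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a fence: an atomic "signaled" bit plus
 * the lock that signalling must hold (a dma_fence uses a spinlock).
 */
struct mock_fence {
	atomic_bool signaled;
	atomic_flag lock;
	bool hw_complete;	/* stands in for i915_request_completed() */
};

static void mock_fence_init(struct mock_fence *f)
{
	atomic_init(&f->signaled, false);
	atomic_flag_clear(&f->lock);
	f->hw_complete = false;
}

/*
 * Slow path, modelling dma_fence_signal(): take the lock and latch
 * the signaled bit, so signalling happens exactly once.
 */
static void mock_fence_signal(struct mock_fence *f)
{
	while (atomic_flag_test_and_set(&f->lock))
		;	/* spinning here is the contention being avoided */
	atomic_store(&f->signaled, true);
	atomic_flag_clear(&f->lock);
}

/*
 * Fast path, modelling dma_fence_is_signaled(): the test_bit-style
 * check lets an already-woken waiter return without touching the
 * lock at all.
 */
static bool mock_fence_is_signaled(struct mock_fence *f)
{
	if (atomic_load(&f->signaled))
		return true;	/* early return, no lock taken */

	if (f->hw_complete) {
		mock_fence_signal(f);
		return true;
	}

	return false;
}
```

A waiter woken after the signaler has already set the bit takes the first branch and never contends on the lock, which is the behaviour the patch buys for i915_request_wait().]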
Right, I missed that dma_fence_is_signaled calls i915_request_completed.
In this case the remaining question is whether we care about the wait
ioctl potentially returning before the hypothetical sync fence for the
same request is signaled. This seems like a slight change in
user-observable behaviour.
Regards,
Tvrtko
* Re: [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-16 8:41 ` Tvrtko Ursulin
@ 2020-07-16 8:47 ` Chris Wilson
2020-07-16 9:02 ` Tvrtko Ursulin
0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2020-07-16 8:47 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx; +Cc: Matthew Auld
Quoting Tvrtko Ursulin (2020-07-16 09:41:17)
> Right, I missed that dma_fence_is_signaled calls i915_request_completed.
>
> In this case the remaining question is whether we care about the wait
> ioctl potentially returning before the hypothetical sync fence for the
> same request is signaled. This seems like a slight change in
> user-observable behaviour.
You would not be able to observe the difference after wakeup in state
between a sync_file and busy_ioctl/wait_ioctl, as a query after wakeup
will see the same signaled bit. So the only possible difference is a
change in wakeup order; but process wakeup/execution order is definitely
not defined by us :)
-Chris
* Re: [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait
2020-07-16 8:47 ` Chris Wilson
@ 2020-07-16 9:02 ` Tvrtko Ursulin
0 siblings, 0 replies; 11+ messages in thread
From: Tvrtko Ursulin @ 2020-07-16 9:02 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: Matthew Auld
On 16/07/2020 09:47, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-07-16 09:41:17)
>> Right, I missed that dma_fence_is_signaled calls i915_request_completed.
>>
>> In this case the remaining question is whether we care about the wait
>> ioctl potentially returning before the hypothetical sync fence for the
>> same request is signaled. This seems like a slight change in
>> user-observable behaviour.
>
> You would not be able to observe the difference after wakeup in state
> between a sync_file and busy_ioctl/wait_ioctl, as a query after wakeup
> will see the same signaled bit. So the only possible difference is a
> change in wakeup order; but process wakeup/execution order is definitely
> not defined by us :)
Doh.. true, same vfunc..
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
end of thread, other threads:[~2020-07-16 9:02 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-15 10:50 [Intel-gfx] [PATCH] drm/i915: Reduce i915_request.lock contention for i915_request_wait Chris Wilson
2020-07-15 12:06 ` Tvrtko Ursulin
2020-07-15 12:26 ` Tvrtko Ursulin
2020-07-15 14:47 ` Chris Wilson
2020-07-15 14:47 ` Chris Wilson
2020-07-16 8:41 ` Tvrtko Ursulin
2020-07-16 8:47 ` Chris Wilson
2020-07-16 9:02 ` Tvrtko Ursulin
2020-07-15 12:12 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for " Patchwork
2020-07-15 12:34 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-07-15 16:11 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork