From: Francisco Jerez <currojerez@riseup.net>
To: chris.p.wilson@intel.com, intel-gfx@lists.freedesktop.org,
	linux-pm@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"Pandruvada, Srinivas" <srinivas.pandruvada@intel.com>
Subject: Re: [Intel-gfx] [PATCH 02/10] drm/i915: Adjust PM QoS response frequency based on GPU load.
Date: Wed, 18 Mar 2020 12:42:11 -0700
Message-ID: <87k13h78mk.fsf@riseup.net>
In-Reply-To: <87r1xzafwn.fsf@riseup.net>



Francisco Jerez <currojerez@riseup.net> writes:

> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
>> Quoting Francisco Jerez (2020-03-10 21:41:55)
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> index b9b3f78f1324..a5d7a80b826d 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> @@ -1577,6 +1577,11 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>>>         /* we need to manually load the submit queue */
>>>         if (execlists->ctrl_reg)
>>>                 writel(EL_CTRL_LOAD, execlists->ctrl_reg);
>>> +
>>> +       if (execlists_num_ports(execlists) > 1 &&
>> pending[1] is always defined, the minimum submission is one slot, with
>> pending[1] as the sentinel NULL.
>>
>>> +           execlists->pending[1] &&
>>> +           !atomic_xchg(&execlists->overload, 1))
>>> +               intel_gt_pm_active_begin(&engine->i915->gt);
>>
>> engine->gt
>>
>
> Applied your suggestions above locally, will probably wait to have a few
> more changes batched up before sending a v2.
>
>>>  }
>>>  
>>>  static bool ctx_single_port_submission(const struct intel_context *ce)
>>> @@ -2213,6 +2218,12 @@ cancel_port_requests(struct intel_engine_execlists * const execlists)
>>>         clear_ports(execlists->inflight, ARRAY_SIZE(execlists->inflight));
>>>  
>>>         WRITE_ONCE(execlists->active, execlists->inflight);
>>> +
>>> +       if (atomic_xchg(&execlists->overload, 0)) {
>>> +               struct intel_engine_cs *engine =
>>> +                       container_of(execlists, typeof(*engine), execlists);
>>> +               intel_gt_pm_active_end(&engine->i915->gt);
>>> +       }
>>>  }
>>>  
>>>  static inline void
>>> @@ -2386,6 +2397,9 @@ static void process_csb(struct intel_engine_cs *engine)
>>>                         /* port0 completed, advanced to port1 */
>>>                         trace_ports(execlists, "completed", execlists->active);
>>>  
>>> +                       if (atomic_xchg(&execlists->overload, 0))
>>> +                               intel_gt_pm_active_end(&engine->i915->gt);
>>
>> So this loses track if we preempt a dual-ELSP submission with a
>> single-ELSP submission (and never go back to dual).
>>
>
> Yes, good point.  You're right that if a dual-ELSP submission gets
> preempted by a single-ELSP submission "overload" will remain signaled
> until the first completion interrupt arrives (e.g. from the preempting
> submission).
>
>> If you move this to the end of the loop and check
>>
>> if (!execlists->active[1] && atomic_xchg(&execlists->overload, 0))
>> 	intel_gt_pm_active_end(engine->gt);
>>
>> so that it covers both preemption/promotion and completion.
>>
>
> That sounds reasonable.
>
>> However, that will fluctuate quite rapidly. (And runs the risk of
>> exceeding the sentinel.)
>>
>> An alternative approach would be to couple along
>> schedule_in/schedule_out
>>
>> atomic_set(overload, -1);
>>
>> __execlists_schedule_in:
>> 	if (!atomic_fetch_inc(overload))
>> 		intel_gt_pm_active_begin(engine->gt);
>> __execlists_schedule_out:
>> 	if (!atomic_dec_return(overload))
>> 		intel_gt_pm_active_end(engine->gt);
>>
>> which would mean we are overloaded as soon as we try to submit an
>> overlapping ELSP.
>>
>
> That sounds good to me too, and AFAICT would have roughly the same
> behavior as this metric except for the preemption corner case you
> mention above.  I'll try this and verify that I get approximately the
> same performance numbers.
>

This suggestion seems to lead to some minor regressions, which I'm
currently investigating.  I will send a v2 as soon as I have something
along the lines of what you suggested that performs equivalently to
v1.


