All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: John Harrison <john.c.harrison@intel.com>,
	Intel-GFX@Lists.FreeDesktop.Org
Cc: DRI-Devel@Lists.FreeDesktop.Org
Subject: Re: [Intel-gfx] [PATCH 2/3] drm/i915/gt: Make the heartbeat play nice with long pre-emption timeouts
Date: Thu, 3 Mar 2022 09:55:31 +0000	[thread overview]
Message-ID: <f9daab55-6bda-b359-352d-8e585bce899c@linux.intel.com> (raw)
In-Reply-To: <389c16df-f579-81df-8405-376fcf8ce613@intel.com>



On 02/03/2022 17:55, John Harrison wrote:

>> I was assuming 2.5s tP is enough and basing all calculation on that. 
>> Heartbeat or timeslicing regardless. I thought we established neither 
>> of us knows how long is enough.
>>
>> Are you now saying 2.5s is definitely not enough? How is that usable 
>> for a default out of the box desktop?
> Show me your proof that 2.5s is enough.
> 
> 7.5s is what we have been using internally for a very long time. It has 
> approval from all relevant parties. If you wish to pick a new random 
> number then please provide data to back it up along with buy in from all 
> UMD teams and project management.

And upstream disabled preemption has acks from compute. "Internally" is 
as far away from out of the box desktop experiences we have been arguing 
about. In fact you have argued compute disables the hearbeat anyway.

Lets jump to the end of this reply please for actions.

>> And I don't have a problem with extending the last pulse. It is 
>> fundamentally correct to do regardless of the backend. I just raised 
>> the question of whether to extend all heartbeats to account for 
>> preemption (and scheduling delays). (What is the point of bumping 
>> their priority and re-scheduling if we didn't give enough time to the 
>> engine to react? So opposite of the question you raise.)
> The point is that it we are giving enough time to react. Raising the 
> priority of a pre-emption that has already been triggered will have no 
> effect. So as long as the total time from when the pre-emption is 
> triggered (prio becomes sufficiently high) to the point when the reset 
> is decided is longer than the pre-emption timeout then it works. Given 
> that, it is unnecessary to increase the intermediate periods. It has no 
> advantage and has the disadvantage of making the total time unreasonably 
> long.
> 
> So again, what is the point of making every period longer? What benefit 
> does it *actually* give?

Less special casing and pointless prio bumps ahead of giving time to 
engine to even react. You wouldn't have to have the last pulse 2 * tP 
but normal tH + tP. So again, it is nicer for me to derive all heartbeat 
pulses from the same input parameters.

The whole "it is very long" argument is IMO moot because now proposed 
7.5s preempt period is I suspect wholly impractical for desktop. 
Combined with the argument that real compute disables heartbeats anyway 
even extra so.

> Fine. "tP(RCS) = 7500;" can I merge the patch now?
I could live with setting preempt timeout to 7.5s. The downside is 
slower time to restoring frozen desktops. Worst case today 5 * 2.5s, 
with changes 4 * 2.5s + 2 * 7.5s; so from 12.5s to 25s, doubling..

Actions:

1)
Get a number from compute/OpenCL people for what they say is minimum 
preempt timeout for default out of the box Linux desktop experience.

This does not mean them running some tests and can't be bothered to 
setup up the machine for the extreme use cases, but workloads average 
users can realistically be expected to run.

Say for instance some image manipulation software which is OpenCL 
accelerated or similar. How long unpreemptable sections are expected 
there. Or similar. I am not familiar what all OpenCL accelerated use 
cases there are on Linux.

And this number should be purely about minimum preempt timeout, not 
considering heartbeats. This is because preempt timeout may kick in 
sooner than stopped heartbeat if the user workload is low priority.

2)
Commit message should explain the effect on the worst case time until 
engine reset.

3)
OpenCL/compute should ack the change publicly as well since they acked 
the disabling of preemption.

4)
I really want overflows_type in the first patch.

My position is that with the above satisfied it is okay to merge.

Regards,

Tvrtko

  reply	other threads:[~2022-03-03  9:55 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-18 21:33 [PATCH 0/3] Improve anti-pre-emption w/a for compute workloads John.C.Harrison
2022-02-18 21:33 ` [Intel-gfx] " John.C.Harrison
2022-02-18 21:33 ` [PATCH 1/3] drm/i915/guc: Limit scheduling properties to avoid overflow John.C.Harrison
2022-02-18 21:33   ` [Intel-gfx] " John.C.Harrison
2022-02-22  9:52   ` Tvrtko Ursulin
2022-02-22 10:39     ` Tvrtko Ursulin
2022-02-23  2:11     ` John Harrison
2022-02-23 12:13       ` Tvrtko Ursulin
2022-02-23 19:03         ` John Harrison
2022-02-24  9:59           ` Tvrtko Ursulin
2022-02-24 19:19             ` John Harrison
2022-02-24 19:51               ` John Harrison
2022-02-25 17:44                 ` Tvrtko Ursulin
2022-02-25 17:06               ` Tvrtko Ursulin
2022-02-25 17:39                 ` John Harrison
2022-02-28 16:11                   ` Tvrtko Ursulin
2022-02-28 18:32                     ` John Harrison
2022-03-01 10:50                       ` Tvrtko Ursulin
2022-03-01 19:57                         ` John Harrison
2022-03-02  9:20                           ` Tvrtko Ursulin
2022-03-02 18:07                             ` John Harrison
2022-02-23  0:52   ` Ceraolo Spurio, Daniele
2022-02-23  2:15     ` John Harrison
2022-02-18 21:33 ` [PATCH 2/3] drm/i915/gt: Make the heartbeat play nice with long pre-emption timeouts John.C.Harrison
2022-02-18 21:33   ` [Intel-gfx] " John.C.Harrison
2022-02-22 11:19   ` Tvrtko Ursulin
2022-02-23  2:45     ` John Harrison
2022-02-23 13:58       ` Tvrtko Ursulin
2022-02-23 20:00         ` John Harrison
2022-02-24 11:41           ` Tvrtko Ursulin
2022-02-24 19:45             ` John Harrison
2022-02-25 18:14               ` Tvrtko Ursulin
2022-02-25 18:48                 ` John Harrison
2022-02-28 17:12                   ` Tvrtko Ursulin
2022-02-28 18:55                     ` John Harrison
2022-03-01 12:09                       ` Tvrtko Ursulin
2022-03-01 20:59                         ` John Harrison
2022-03-02 11:07                           ` Tvrtko Ursulin
2022-03-02 17:55                             ` John Harrison
2022-03-03  9:55                               ` Tvrtko Ursulin [this message]
2022-03-03 19:09                                 ` John Harrison
2022-03-04 12:36                                   ` Tvrtko Ursulin
2022-02-18 21:33 ` [PATCH 3/3] drm/i915: Improve long running OCL w/a for GuC submission John.C.Harrison
2022-02-18 21:33   ` [Intel-gfx] " John.C.Harrison
2022-02-19  2:54 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Improve anti-pre-emption w/a for compute workloads Patchwork
2022-02-19  3:33 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2022-02-22  9:53 ` [Intel-gfx] [PATCH 0/3] " Tvrtko Ursulin
2022-02-23  2:22   ` John Harrison
2022-02-23 12:00     ` Tvrtko Ursulin
2022-02-24 20:02       ` John Harrison
2022-02-25 16:36         ` Tvrtko Ursulin
2022-02-25 17:11           ` John Harrison
2022-02-25 17:39             ` Tvrtko Ursulin
2022-02-25 18:01               ` John Harrison
2022-02-25 18:29                 ` Tvrtko Ursulin
2022-02-25 19:03                   ` John Harrison
2022-02-28 15:32                     ` Tvrtko Ursulin
2022-02-28 19:17                       ` John Harrison
2022-03-02 11:21                         ` Tvrtko Ursulin
2022-03-02 17:40                           ` John Harrison

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f9daab55-6bda-b359-352d-8e585bce899c@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=DRI-Devel@Lists.FreeDesktop.Org \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=john.c.harrison@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.