From: Chris Wilson <chris@chris-wilson.co.uk>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
Date: Tue, 19 Nov 2019 16:42:28 +0000	[thread overview]
Message-ID: <157418174819.12093.10574958764232498040@skylake-alporthouse-com> (raw)
In-Reply-To: <f8d09a9a-b45a-7960-d584-3315ca0c80f3@linux.intel.com>

Quoting Tvrtko Ursulin (2019-11-19 16:33:18)
>
> On 19/11/2019 16:20, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-19 15:04:46)
> >>
> >> On 18/11/2019 23:02, Chris Wilson wrote:
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> index 33ce258d484f..f7c8fec436a9 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> @@ -142,6 +142,7 @@
> >>>   #include "intel_engine_pm.h"
> >>>   #include "intel_gt.h"
> >>>   #include "intel_gt_pm.h"
> >>> +#include "intel_gt_requests.h"
> >>>   #include "intel_lrc_reg.h"
> >>>   #include "intel_mocs.h"
> >>>   #include "intel_reset.h"
> >>> @@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
> >>>   		if (timeout && preempt_timeout(engine))
> >>>   			preempt_reset(engine);
> >>>   	}
> >>> +
> >>> +	/*
> >>> +	 * If the GPU is currently idle, retire the outstanding completed
> >>> +	 * requests. This will allow us to enter soft-rc6 as soon as possible,
> >>> +	 * albeit at the cost of running the retire worker much more frequently
> >>> +	 * (over the entire GT not just this engine) and emitting more idle
> >>> +	 * barriers (i.e. kernel context switches unpinning all that went
> >>> +	 * before) which may add some extra latency.
> >>> +	 */
> >>> +	if (intel_engine_pm_is_awake(engine) &&
> >>> +	    !execlists_active(&engine->execlists))
> >>> +		intel_gt_schedule_retire_requests(engine->gt);
> >>
> >> I am still not a fan of doing this for all platforms.
> >
> > I understand. I think it makes a fair amount of sense to do early
> > retires, and wish to pursue that if I can show there is no harm.
>
> It's also a bit of a layering problem.

Them's fighting words! :)

> >> It's not just the cost of retirement, but there is also
> >> intel_engine_flush_submission on all engines in there, which we
> >> cannot avoid triggering from this path.
> >>
> >> Would it be worth experimenting with additional per-engine retire
> >> workers? Most of the code could be shared, just a little bit of
> >> specialization to filter on engine.
> >
> > I haven't sketched out anything more than peeking at the last request on
> > the timeline and doing a rq->engine == engine filter. Walking the global
> > timeline.active_list in that case is also a nuisance.
>
> That together with:
>
> flush_submission(gt, engine ? engine->mask : ALL_ENGINES);
>
> Might be enough? At least to satisfy my concern.

Aye, flushing all the others when we know we only care about this engine
being idle is definitely a weak point of the current scheme.

> Apart from that, the layering is still bad... And I'd still limit it to
> when the RC6 WA is active, unless it can be shown there is no perf/power
> impact across GPU/CPU to do this everywhere.

Bah, keep tuning until it's a win for everyone!

> At which point it becomes easier to just limit it, because we have to
> have it there.
>
> I also wonder if the current flush_submission wasn't the reason for the
> performance regression you were seeing with this? It makes this tasklet
> wait for all other engines, if they are busy. But not sure... perhaps it
> is work which would be done anyway.

I haven't finished yet; but the baseline took a big nose dive, so it
might be enough to hide a lot of evil. Too bad I don't have an Icelake
with which to cross check against an unaffected platform.

> > There's definitely scope here for us using some more information from
> > process_csb() about which context completed and limiting work to that
> > timeline.

Hmm, something along those lines maybe...

> But you want to retire all timelines which have work on this particular
> physical engine. Otherwise it doesn't get parked, no?

There I was suggesting being even more proactive, and say keeping an
llist of completed timelines. Nothing concrete yet; plenty of existing
races found already that need fixing.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx