* [PATCH] drm/i915/execlists: Micro-optimise "idle" context switch
@ 2018-08-17 12:24 Chris Wilson
From: Chris Wilson @ 2018-08-17 12:24 UTC (permalink / raw)
To: intel-gfx
On gen9, we see an effect whereby, when we perform an element switch just
as the first context completes execution, that switch takes twice as
long, as if the hardware first reloads the completed context. That is, we
observe the cost of

context1 -> idle -> context1 -> context2

as being twice the cost of the same operation on gen8. The impact of
this is incredibly rare outside of microbenchmarks focused on
assessing the throughput of context switches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
I think this is a microbenchmark too far, as there is no real-world
impact: both the likelihood of submission at that precise point in time,
and the context switch being a significant fraction of the batch
runtime, make the effect minuscule in practice.
It is also not foolproof, even for gem_ctx_switch:

kbl: ctx1 -> idle -> ctx2: ~25us
     ctx1 -> idle -> ctx1 -> ctx2 (unpatched): ~53us
     ctx1 -> idle -> ctx1 -> ctx2 (patched): 30-40us

bxt: ctx1 -> idle -> ctx2: ~40us
     ctx1 -> idle -> ctx1 -> ctx2 (unpatched): ~80us
     ctx1 -> idle -> ctx1 -> ctx2 (patched): 60-70us
So consider this more of a plea for ideas: why does bdw behave better?
Are we missing a flag, a fox or a chicken?
-Chris
---
drivers/gpu/drm/i915/intel_lrc.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 36050f085071..682268d4249d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -711,6 +711,24 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
GEM_BUG_ON(last->hw_context == rq->hw_context);
+ /*
+ * Avoid reloading the previous context if we
+ * know it has just completed and we want
+ * to switch over to a new context. The CS
+ * interrupt is likely waiting for us to
+ * release the local irq lock and so we will
+ * proceed with the submission momentarily,
+ * which is quicker than reloading the context
+ * on the gpu.
+ */
+ if (!submit &&
+ intel_engine_signaled(engine,
+ last->global_seqno)) {
+ GEM_BUG_ON(!list_is_first(&rq->sched.link,
+ &p->requests));
+ return;
+ }
+
if (submit)
port_assign(port, last);
port++;
--
2.18.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* ✓ Fi.CI.BAT: success for drm/i915/execlists: Micro-optimise "idle" context switch
From: Patchwork @ 2018-08-17 12:47 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/execlists: Micro-optimise "idle" context switch
URL : https://patchwork.freedesktop.org/series/48392/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4685 -> Patchwork_9969 =
== Summary - SUCCESS ==
No regressions found.
External URL: https://patchwork.freedesktop.org/api/1.0/series/48392/revisions/1/mbox/
== Known issues ==
Here are the changes found in Patchwork_9969 that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@kms_pipe_crc_basic@read-crc-pipe-a:
{fi-byt-clapper}: PASS -> FAIL (fdo#107362) +1
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
fi-skl-guc: PASS -> FAIL (fdo#103191)
{igt@pm_rpm@module-reload}:
fi-byt-j1900: NOTRUN -> WARN (fdo#107602)
==== Possible fixes ====
igt@drv_selftest@live_hangcheck:
fi-skl-guc: DMESG-FAIL (fdo#107174) -> PASS
igt@gem_exec_basic@readonly-blt:
fi-byt-n2820: FAIL (fdo#105900) -> PASS
igt@kms_frontbuffer_tracking@basic:
fi-hsw-peppy: DMESG-FAIL (fdo#102614) -> PASS
{igt@kms_psr@primary_mmap_gtt}:
fi-cnl-psr: DMESG-WARN (fdo#107372) -> PASS
igt@prime_vgem@basic-fence-flip:
fi-ivb-3770: FAIL (fdo#104008) -> PASS
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008
fdo#105900 https://bugs.freedesktop.org/show_bug.cgi?id=105900
fdo#107174 https://bugs.freedesktop.org/show_bug.cgi?id=107174
fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362
fdo#107372 https://bugs.freedesktop.org/show_bug.cgi?id=107372
fdo#107602 https://bugs.freedesktop.org/show_bug.cgi?id=107602
== Participating hosts (52 -> 49) ==
Additional (1): fi-byt-j1900
Missing (4): fi-ctg-p8600 fi-ilk-m540 fi-byt-squawks fi-hsw-4200u
== Build changes ==
* Linux: CI_DRM_4685 -> Patchwork_9969
CI_DRM_4685: df7e8eddc3830216d3fec15e2c7d0b6ec97e7bae @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4606: 38a44003774e35c587c67c8766b35e75dbb993b8 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_9969: 5b94bc4048d887a036b23884b8aa18b99a3ba891 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
5b94bc4048d8 drm/i915/execlists: Micro-optimise "idle" context switch
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9969/issues.html
* Re: [PATCH] drm/i915/execlists: Micro-optimise "idle" context switch
From: kbuild test robot @ 2018-08-17 13:43 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, kbuild-all
Hi Chris,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.18 next-20180817]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-execlists-Micro-optimise-idle-context-switch/20180817-205726
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x005-201832 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
drivers/gpu/drm/i915/intel_lrc.c: In function 'execlists_dequeue':
>> drivers/gpu/drm/i915/intel_lrc.c:730:9: error: implicit declaration of function 'intel_engine_signaled'; did you mean 'intel_engine_is_idle'? [-Werror=implicit-function-declaration]
intel_engine_signaled(engine,
^~~~~~~~~~~~~~~~~~~~~
intel_engine_is_idle
cc1: some warnings being treated as errors
vim +730 drivers/gpu/drm/i915/intel_lrc.c
580
581 static void execlists_dequeue(struct intel_engine_cs *engine)
582 {
583 struct intel_engine_execlists * const execlists = &engine->execlists;
584 struct execlist_port *port = execlists->port;
585 const struct execlist_port * const last_port =
586 &execlists->port[execlists->port_mask];
587 struct i915_request *last = port_request(port);
588 struct rb_node *rb;
589 bool submit = false;
590
591 /*
592 * Hardware submission is through 2 ports. Conceptually each port
593 * has a (RING_START, RING_HEAD, RING_TAIL) tuple. RING_START is
594 * static for a context, and unique to each, so we only execute
595 * requests belonging to a single context from each ring. RING_HEAD
596 * is maintained by the CS in the context image, it marks the place
597 * where it got up to last time, and through RING_TAIL we tell the CS
598 * where we want to execute up to this time.
599 *
600 * In this list the requests are in order of execution. Consecutive
601 * requests from the same context are adjacent in the ringbuffer. We
602 * can combine these requests into a single RING_TAIL update:
603 *
604 * RING_HEAD...req1...req2
605 * ^- RING_TAIL
606 * since to execute req2 the CS must first execute req1.
607 *
608 * Our goal then is to point each port to the end of a consecutive
609 * sequence of requests as being the most optimal (fewest wake ups
610 * and context switches) submission.
611 */
612
613 if (last) {
614 /*
615 * Don't resubmit or switch until all outstanding
616 * preemptions (lite-restore) are seen. Then we
617 * know the next preemption status we see corresponds
618 * to this ELSP update.
619 */
620 GEM_BUG_ON(!execlists_is_active(execlists,
621 EXECLISTS_ACTIVE_USER));
622 GEM_BUG_ON(!port_count(&port[0]));
623
624 /*
625 * If we write to ELSP a second time before the HW has had
626 * a chance to respond to the previous write, we can confuse
627 * the HW and hit "undefined behaviour". After writing to ELSP,
628 * we must then wait until we see a context-switch event from
629 * the HW to indicate that it has had a chance to respond.
630 */
631 if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK))
632 return;
633
634 if (need_preempt(engine, last, execlists->queue_priority)) {
635 inject_preempt_context(engine);
636 return;
637 }
638
639 /*
640 * In theory, we could coalesce more requests onto
641 * the second port (the first port is active, with
642 * no preemptions pending). However, that means we
643 * then have to deal with the possible lite-restore
644 * of the second port (as we submit the ELSP, there
645 * may be a context-switch) but also we may complete
646 * the resubmission before the context-switch. Ergo,
647 * coalescing onto the second port will cause a
648 * preemption event, but we cannot predict whether
649 * that will affect port[0] or port[1].
650 *
651 * If the second port is already active, we can wait
652 * until the next context-switch before contemplating
653 * new requests. The GPU will be busy and we should be
654 * able to resubmit the new ELSP before it idles,
655 * avoiding pipeline bubbles (momentary pauses where
656 * the driver is unable to keep up the supply of new
657 * work). However, we have to double check that the
658 * priorities of the ports haven't been switched.
659 */
660 if (port_count(&port[1]))
661 return;
662
663 /*
664 * WaIdleLiteRestore:bdw,skl
665 * Apply the wa NOOPs to prevent
666 * ring:HEAD == rq:TAIL as we resubmit the
667 * request. See gen8_emit_breadcrumb() for
668 * where we prepare the padding after the
669 * end of the request.
670 */
671 last->tail = last->wa_tail;
672 }
673
674 while ((rb = rb_first_cached(&execlists->queue))) {
675 struct i915_priolist *p = to_priolist(rb);
676 struct i915_request *rq, *rn;
677
678 list_for_each_entry_safe(rq, rn, &p->requests, sched.link) {
679 /*
680 * Can we combine this request with the current port?
681 * It has to be the same context/ringbuffer and not
682 * have any exceptions (e.g. GVT saying never to
683 * combine contexts).
684 *
685 * If we can combine the requests, we can execute both
686 * by updating the RING_TAIL to point to the end of the
687 * second request, and so we never need to tell the
688 * hardware about the first.
689 */
690 if (last &&
691 !can_merge_ctx(rq->hw_context, last->hw_context)) {
692 /*
693 * If we are on the second port and cannot
694 * combine this request with the last, then we
695 * are done.
696 */
697 if (port == last_port) {
698 __list_del_many(&p->requests,
699 &rq->sched.link);
700 goto done;
701 }
702
703 /*
704 * If GVT overrides us we only ever submit
705 * port[0], leaving port[1] empty. Note that we
706 * also have to be careful that we don't queue
707 * the same context (even though a different
708 * request) to the second port.
709 */
710 if (ctx_single_port_submission(last->hw_context) ||
711 ctx_single_port_submission(rq->hw_context)) {
712 __list_del_many(&p->requests,
713 &rq->sched.link);
714 goto done;
715 }
716
717 GEM_BUG_ON(last->hw_context == rq->hw_context);
718
719 /*
720 * Avoid reloading the previous context if we
721 * know it has just completed and we want
722 * to switch over to a new context. The CS
723 * interrupt is likely waiting for us to
724 * release the local irq lock and so we will
725 * proceed with the submission momentarily,
726 * which is quicker than reloading the context
727 * on the gpu.
728 */
729 if (!submit &&
> 730 intel_engine_signaled(engine,
731 last->global_seqno)) {
732 GEM_BUG_ON(!list_is_first(&rq->sched.link,
733 &p->requests));
734 return;
735 }
736
737 if (submit)
738 port_assign(port, last);
739 port++;
740
741 GEM_BUG_ON(port_isset(port));
742 }
743
744 INIT_LIST_HEAD(&rq->sched.link);
745 __i915_request_submit(rq);
746 trace_i915_request_in(rq, port_index(port, execlists));
747 last = rq;
748 submit = true;
749 }
750
751 rb_erase_cached(&p->node, &execlists->queue);
752 INIT_LIST_HEAD(&p->requests);
753 if (p->priority != I915_PRIORITY_NORMAL)
754 kmem_cache_free(engine->i915->priorities, p);
755 }
756
757 done:
758 /*
759 * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
760 *
761 * We choose queue_priority such that if we add a request of greater
762 * priority than this, we kick the submission tasklet to decide on
763 * the right order of submitting the requests to hardware. We must
764 * also be prepared to reorder requests as they are in-flight on the
765 * HW. We derive the queue_priority then as the first "hole" in
766 * the HW submission ports and if there are no available slots,
767 * the priority of the lowest executing request, i.e. last.
768 *
769 * When we do receive a higher priority request ready to run from the
770 * user, see queue_request(), the queue_priority is bumped to that
771 * request triggering preemption on the next dequeue (or subsequent
772 * interrupt for secondary ports).
773 */
774 execlists->queue_priority =
775 port != execlists->port ? rq_prio(last) : INT_MIN;
776
777 if (submit) {
778 port_assign(port, last);
779 execlists_submit_ports(engine);
780 }
781
782 /* We must always keep the beast fed if we have work piled up */
783 GEM_BUG_ON(rb_first_cached(&execlists->queue) &&
784 !port_isset(execlists->port));
785
786 /* Re-evaluate the executing context setup after each preemptive kick */
787 if (last)
788 execlists_user_begin(execlists, execlists->port);
789
790 /* If the engine is now idle, so should be the flag; and vice versa. */
791 GEM_BUG_ON(execlists_is_active(&engine->execlists,
792 EXECLISTS_ACTIVE_USER) ==
793 !port_isset(engine->execlists.port));
794 }
795
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
* Re: [PATCH] drm/i915/execlists: Micro-optimise "idle" context switch
From: kbuild test robot @ 2018-08-17 14:23 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, kbuild-all
Hi Chris,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.18 next-20180817]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-execlists-Micro-optimise-idle-context-switch/20180817-205726
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-s0-201832 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
drivers/gpu//drm/i915/intel_lrc.c: In function 'execlists_dequeue':
>> drivers/gpu//drm/i915/intel_lrc.c:730:9: error: implicit declaration of function 'intel_engine_signaled' [-Werror=implicit-function-declaration]
intel_engine_signaled(engine,
^~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/intel_engine_signaled +730 drivers/gpu//drm/i915/intel_lrc.c
* ✗ Fi.CI.IGT: failure for drm/i915/execlists: Micro-optimise "idle" context switch
From: Patchwork @ 2018-08-17 15:57 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/execlists: Micro-optimise "idle" context switch
URL : https://patchwork.freedesktop.org/series/48392/
State : failure
== Summary ==
= CI Bug Log - changes from CI_DRM_4685_full -> Patchwork_9969_full =
== Summary - FAILURE ==
Serious unknown changes coming with Patchwork_9969_full absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_9969_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
== Possible new issues ==
Here are the unknown changes that may have been introduced in Patchwork_9969_full:
=== IGT changes ===
==== Possible regressions ====
igt@gem_eio@reset-stress:
shard-glk: PASS -> FAIL
shard-kbl: PASS -> FAIL
shard-apl: PASS -> FAIL
== Known issues ==
Here are the changes found in Patchwork_9969_full that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@drv_suspend@shrink:
shard-kbl: PASS -> INCOMPLETE (fdo#106886, fdo#103665)
igt@gem_ppgtt@blt-vs-render-ctxn:
shard-kbl: PASS -> INCOMPLETE (fdo#106023, fdo#103665)
igt@kms_flip@2x-flip-vs-expired-vblank:
shard-glk: PASS -> FAIL (fdo#105363)
igt@perf@blocking:
shard-hsw: PASS -> FAIL (fdo#102252)
igt@perf_pmu@rc6-runtime-pm-long:
shard-kbl: PASS -> FAIL (fdo#105010)
==== Possible fixes ====
igt@gem_ctx_isolation@rcs0-s3:
shard-glk: FAIL (fdo#103375) -> PASS +1
fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
fdo#105010 https://bugs.freedesktop.org/show_bug.cgi?id=105010
fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
fdo#106023 https://bugs.freedesktop.org/show_bug.cgi?id=106023
fdo#106886 https://bugs.freedesktop.org/show_bug.cgi?id=106886
== Participating hosts (5 -> 5) ==
No changes in participating hosts
== Build changes ==
* Linux: CI_DRM_4685 -> Patchwork_9969
CI_DRM_4685: df7e8eddc3830216d3fec15e2c7d0b6ec97e7bae @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4606: 38a44003774e35c587c67c8766b35e75dbb993b8 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_9969: 5b94bc4048d887a036b23884b8aa18b99a3ba891 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9969/shards.html