All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Remove overzealous fence warn on runtime suspend
@ 2017-02-03 12:57 Chris Wilson
  2017-02-03 14:56 ` ✓ Fi.CI.BAT: success for " Patchwork
  2017-02-07 14:24 ` [PATCH] " Joonas Lahtinen
  0 siblings, 2 replies; 4+ messages in thread
From: Chris Wilson @ 2017-02-03 12:57 UTC (permalink / raw)
  To: intel-gfx; +Cc: Imre Deak, Daniel Vetter, # v4 . 10-rc1+

The goal of the WARN was to catch when we are still actively using the
fence as we go into the runtime suspend. However, the reg->pin_count is
too coarse as it does not distinguish between exclusive ownership of the
fence register from activity.

I've not improved on the WARN, nor have we captured this WARN in an
exact igt, but it is showing up regularly in the wild:

[ 1915.935332] WARNING: CPU: 1 PID: 10861 at drivers/gpu/drm/i915/i915_gem.c:2022 i915_gem_runtime_suspend+0x116/0x130 [i915]
[ 1915.935383] WARN_ON(reg->pin_count)[ 1915.935399] Modules linked in:
 snd_hda_intel i915 drm_kms_helper vgem netconsole scsi_transport_iscsi fuse vfat fat x86_pkg_temp_thermal coretemp intel_cstate intel_uncore snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mei_me mei serio_raw intel_rapl_perf intel_pch_thermal soundcore wmi acpi_pad i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops drm r8169 mii video [last unloaded: drm_kms_helper]
[ 1915.935785] CPU: 1 PID: 10861 Comm: kworker/1:0 Tainted: G     U  W       4.9.0-rc5+ #170
[ 1915.935799] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015
[ 1915.935822] Workqueue: pm pm_runtime_work
[ 1915.935845]  ffffc900044fbbf0 ffffffffac3220bc ffffc900044fbc40 0000000000000000
[ 1915.935890]  ffffc900044fbc30 ffffffffac059bcb 000007e6044fbc60 ffff8801626e3198
[ 1915.935937]  ffff8801626e0000 0000000000000002 ffffffffc05e5d4e 0000000000000000
[ 1915.935985] Call Trace:
[ 1915.936013]  [<ffffffffac3220bc>] dump_stack+0x4f/0x73
[ 1915.936038]  [<ffffffffac059bcb>] __warn+0xcb/0xf0
[ 1915.936060]  [<ffffffffac059c4f>] warn_slowpath_fmt+0x5f/0x80
[ 1915.936158]  [<ffffffffc052d916>] i915_gem_runtime_suspend+0x116/0x130 [i915]
[ 1915.936251]  [<ffffffffc04f1c74>] intel_runtime_suspend+0x64/0x280 [i915]
[ 1915.936277]  [<ffffffffac0926f1>] ? dequeue_entity+0x241/0xbc0
[ 1915.936298]  [<ffffffffac36bb85>] pci_pm_runtime_suspend+0x55/0x180
[ 1915.936317]  [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
[ 1915.936339]  [<ffffffffac4514e2>] __rpm_callback+0x32/0x70
[ 1915.936356]  [<ffffffffac451544>] rpm_callback+0x24/0x80
[ 1915.936375]  [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
[ 1915.936392]  [<ffffffffac45222d>] rpm_suspend+0x12d/0x680
[ 1915.936415]  [<ffffffffac69f6d7>] ? _raw_spin_unlock_irq+0x17/0x30
[ 1915.936435]  [<ffffffffac0810b8>] ? finish_task_switch+0x88/0x220
[ 1915.936455]  [<ffffffffac4534bf>] pm_runtime_work+0x6f/0xb0
[ 1915.936477]  [<ffffffffac074353>] process_one_work+0x1f3/0x4d0
[ 1915.936501]  [<ffffffffac074678>] worker_thread+0x48/0x4e0
[ 1915.936523]  [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
[ 1915.936542]  [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
[ 1915.936559]  [<ffffffffac07a2c9>] kthread+0xd9/0xf0
[ 1915.936580]  [<ffffffffac07a1f0>] ? kthread_park+0x60/0x60
[ 1915.936600]  [<ffffffffac69fe62>] ret_from_fork+0x22/0x30

In the case the register is pinned, it should be present and we will
need to invalidate them to be restored upon resume as we cannot expect
the owner of the pin to call get_fence prior to use after resume.

Fixes: 7c108fd8feac ("drm/i915: Move fence cancellation to runtime suspend")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98804
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Imre Deak <imre.deak@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+
---
 drivers/gpu/drm/i915/i915_drv.c |  1 +
 drivers/gpu/drm/i915/i915_gem.c | 12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index dacbe981f182..320d4e4f9db9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2488,6 +2488,7 @@ static int intel_runtime_resume(struct device *kdev)
 	 * we can do is to hope that things will still work (and disable RPM).
 	 */
 	i915_gem_init_swizzling(dev_priv);
+	i915_gem_restore_fences(dev_priv);
 
 	intel_runtime_pm_enable_interrupts(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 64ddf7a0da76..48a37e185542 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2024,8 +2024,16 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
 	for (i = 0; i < dev_priv->num_fence_regs; i++) {
 		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
 
-		if (WARN_ON(reg->pin_count))
-			continue;
+		/* Ideally we want to assert that the fence register is not
+		 * live at this point (i.e. that no piece of code will is
+		 * trying to write through fence + GTT, as that both violates
+		 * our tracking of activity and associated locking/barriers,
+		 * but also is illegal given that the hw is powered down).
+		 *
+		 * Previously we used reg->pin_count as a "liveness" indicator.
+		 * That is not sufficient, and we need a more fine-grained
+		 * tool if we want to have a sanity check here.
+		 */
 
 		if (!reg->vma)
 			continue;
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Remove overzealous fence warn on runtime suspend
  2017-02-03 12:57 [PATCH] drm/i915: Remove overzealous fence warn on runtime suspend Chris Wilson
@ 2017-02-03 14:56 ` Patchwork
  2017-02-07 14:24 ` [PATCH] " Joonas Lahtinen
  1 sibling, 0 replies; 4+ messages in thread
From: Patchwork @ 2017-02-03 14:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Remove overzealous fence warn on runtime suspend
URL   : https://patchwork.freedesktop.org/series/19055/
State : success

== Summary ==

Series 19055v1 drm/i915: Remove overzealous fence warn on runtime suspend
https://patchwork.freedesktop.org/api/1.0/series/19055/revisions/1/mbox/


fi-bdw-5557u     total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050     total:247  pass:208  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700     total:78   pass:65   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:247  pass:220  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:247  pass:216  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:247  pass:228  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r     total:247  pass:228  dwarn:0   dfail:0   fail:0   skip:19 
fi-ilk-650       total:14   pass:13   dwarn:0   dfail:0   fail:0   skip:0  
fi-ivb-3520m     total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770      total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u     total:247  pass:224  dwarn:0   dfail:0   fail:2   skip:21 
fi-skl-6260u     total:247  pass:234  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq    total:247  pass:227  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:247  pass:222  dwarn:4   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:247  pass:234  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m     total:247  pass:216  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600      total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 

e7f4379a8f59bebb1bdefe0584e128dfdd27a86a drm-tip: 2017y-02m-03d-13h-03m-49s UTC integration manifest
5988188 drm/i915: Remove overzealous fence warn on runtime suspend

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3691/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/i915: Remove overzealous fence warn on runtime suspend
  2017-02-03 12:57 [PATCH] drm/i915: Remove overzealous fence warn on runtime suspend Chris Wilson
  2017-02-03 14:56 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-02-07 14:24 ` Joonas Lahtinen
  2017-02-07 14:39   ` Chris Wilson
  1 sibling, 1 reply; 4+ messages in thread
From: Joonas Lahtinen @ 2017-02-07 14:24 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Imre Deak, Daniel Vetter, # v4 . 10-rc1+

On pe, 2017-02-03 at 12:57 +0000, Chris Wilson wrote:
> The goal of the WARN was to catch when we are still actively using the
> fence as we go into the runtime suspend. However, the reg->pin_count is
> too coarse as it does not distinguish between exclusive ownership of the
> fence register from activity.
> 
> I've not improved on the WARN, nor have we captured this WARN in an
> exact igt, but it is showing up regularly in the wild:
> 
> [ 1915.935332] WARNING: CPU: 1 PID: 10861 at drivers/gpu/drm/i915/i915_gem.c:2022 i915_gem_runtime_suspend+0x116/0x130 [i915]
> [ 1915.935383] WARN_ON(reg->pin_count)[ 1915.935399] Modules linked in:
>  snd_hda_intel i915 drm_kms_helper vgem netconsole scsi_transport_iscsi fuse vfat fat x86_pkg_temp_thermal coretemp intel_cstate intel_uncore snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mei_me mei serio_raw intel_rapl_perf intel_pch_thermal soundcore wmi acpi_pad i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops drm r8169 mii video [last unloaded: drm_kms_helper]
> [ 1915.935785] CPU: 1 PID: 10861 Comm: kworker/1:0 Tainted: G     U  W       4.9.0-rc5+ #170
> [ 1915.935799] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015
> [ 1915.935822] Workqueue: pm pm_runtime_work
> [ 1915.935845]  ffffc900044fbbf0 ffffffffac3220bc ffffc900044fbc40 0000000000000000
> [ 1915.935890]  ffffc900044fbc30 ffffffffac059bcb 000007e6044fbc60 ffff8801626e3198
> [ 1915.935937]  ffff8801626e0000 0000000000000002 ffffffffc05e5d4e 0000000000000000
> [ 1915.935985] Call Trace:
> [ 1915.936013]  [<ffffffffac3220bc>] dump_stack+0x4f/0x73
> [ 1915.936038]  [<ffffffffac059bcb>] __warn+0xcb/0xf0
> [ 1915.936060]  [<ffffffffac059c4f>] warn_slowpath_fmt+0x5f/0x80
> [ 1915.936158]  [<ffffffffc052d916>] i915_gem_runtime_suspend+0x116/0x130 [i915]
> [ 1915.936251]  [<ffffffffc04f1c74>] intel_runtime_suspend+0x64/0x280 [i915]
> [ 1915.936277]  [<ffffffffac0926f1>] ? dequeue_entity+0x241/0xbc0
> [ 1915.936298]  [<ffffffffac36bb85>] pci_pm_runtime_suspend+0x55/0x180
> [ 1915.936317]  [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
> [ 1915.936339]  [<ffffffffac4514e2>] __rpm_callback+0x32/0x70
> [ 1915.936356]  [<ffffffffac451544>] rpm_callback+0x24/0x80
> [ 1915.936375]  [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
> [ 1915.936392]  [<ffffffffac45222d>] rpm_suspend+0x12d/0x680
> [ 1915.936415]  [<ffffffffac69f6d7>] ? _raw_spin_unlock_irq+0x17/0x30
> [ 1915.936435]  [<ffffffffac0810b8>] ? finish_task_switch+0x88/0x220
> [ 1915.936455]  [<ffffffffac4534bf>] pm_runtime_work+0x6f/0xb0
> [ 1915.936477]  [<ffffffffac074353>] process_one_work+0x1f3/0x4d0
> [ 1915.936501]  [<ffffffffac074678>] worker_thread+0x48/0x4e0
> [ 1915.936523]  [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
> [ 1915.936542]  [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
> [ 1915.936559]  [<ffffffffac07a2c9>] kthread+0xd9/0xf0
> [ 1915.936580]  [<ffffffffac07a1f0>] ? kthread_park+0x60/0x60
> [ 1915.936600]  [<ffffffffac69fe62>] ret_from_fork+0x22/0x30
> 
> In the case the register is pinned, it should be present and we will
> need to invalidate them to be restored upon resume as we cannot expect
> the owner of the pin to call get_fence prior to use after resume.
> 
> Fixes: 7c108fd8feac ("drm/i915: Move fence cancellation to runtime suspend")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98804
> Reported-by: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Imre Deak <imre.deak@linux.intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.10-rc1+

<SNIP>

> @@ -2024,8 +2024,16 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
>  	for (i = 0; i < dev_priv->num_fence_regs; i++) {
>  		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
>  
> -		if (WARN_ON(reg->pin_count))
> -			continue;
> +		/* Ideally we want to assert that the fence register is not
> +		 * live at this point (i.e. that no piece of code will is

s/will is/will be/

With that,

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/i915: Remove overzealous fence warn on runtime suspend
  2017-02-07 14:24 ` [PATCH] " Joonas Lahtinen
@ 2017-02-07 14:39   ` Chris Wilson
  0 siblings, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2017-02-07 14:39 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: Imre Deak, Daniel Vetter, intel-gfx, # v4 . 10-rc1+

On Tue, Feb 07, 2017 at 04:24:25PM +0200, Joonas Lahtinen wrote:
> On pe, 2017-02-03 at 12:57 +0000, Chris Wilson wrote:
> > @@ -2024,8 +2024,16 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
> >  	for (i = 0; i < dev_priv->num_fence_regs; i++) {
> >  		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
> >  
> > -		if (WARN_ON(reg->pin_count))
> > -			continue;
> > +		/* Ideally we want to assert that the fence register is not
> > +		 * live at this point (i.e. that no piece of code will is
> 
> s/will is/will be/
> 
> With that,
> 
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Fixed up and applied, thanks.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-02-07 14:39 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-03 12:57 [PATCH] drm/i915: Remove overzealous fence warn on runtime suspend Chris Wilson
2017-02-03 14:56 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-02-07 14:24 ` [PATCH] " Joonas Lahtinen
2017-02-07 14:39   ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.