* [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort
@ 2018-07-26  8:50 Chris Wilson
  2018-07-26  8:50 ` [PATCH 2/3] drm/i915: Restore sane defaults for KMS on GEM error load Chris Wilson
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Chris Wilson @ 2018-07-26  8:50 UTC (permalink / raw)
  To: intel-gfx

Prevent
[  397.873143] general protection fault: 0000 [#1] PREEMPT SMP PTI
[  397.873154] CPU: 4 PID: 4799 Comm: drv_module_relo Tainted: G     U            4.18.0-rc6-CI-CI_DRM_4534+ #1
[  397.873162] Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.10 12/28/2017
[  397.873175] RIP: 0010:__lock_acquire+0xf6/0x1b50
[  397.873179] Code: 85 c0 4c 8b 9d 40 ff ff ff 8b 8d 38 ff ff ff 44 8b 8d 30 ff ff ff 4c 8b 85 28 ff ff ff 44 8b 95 24 ff ff ff 0f 84 54 03 00 00 <f0> ff 80 38 01 00 00 8b 15 45 8c 59 02 45 8b bc 24 70 08 00 00 85
[  397.873240] RSP: 0018:ffffc90000497b40 EFLAGS: 00010002
[  397.873246] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000001 RCX: 0000000000000000
[  397.873252] RDX: 0000000000000046 RSI: 0000000000000000 RDI: 0000000000000000
[  397.873258] RBP: ffffc90000497c20 R08: ffffffff810a25e9 R09: 0000000000000000
[  397.873264] R10: 0000000000000000 R11: ffff880255c63c28 R12: ffff8801093b2840
[  397.873270] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000246
[  397.873277] FS:  00007faf88d71980(0000) GS:ffff880266300000(0000) knlGS:0000000000000000
[  397.873284] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  397.873289] CR2: 000055d866c9ca10 CR3: 000000025472e006 CR4: 00000000003606e0
[  397.873295] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  397.873301] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  397.873308] Call Trace:
[  397.873318]  ? lock_acquire+0xa6/0x210
[  397.873323]  lock_acquire+0xa6/0x210
[  397.873331]  ? drain_workqueue+0x19/0x180
[  397.873339]  __mutex_lock+0x89/0x980
[  397.873346]  ? drain_workqueue+0x19/0x180
[  397.873352]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
[  397.873359]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[  397.873365]  ? drain_workqueue+0x19/0x180
[  397.873373]  ? debug_object_active_state+0x127/0x150
[  397.873381]  ? drain_workqueue+0x19/0x180
[  397.873387]  drain_workqueue+0x19/0x180
[  397.873395]  destroy_workqueue+0x12/0x1f0
[  397.873476]  intel_guc_fini_misc+0x36/0x90 [i915]
[  397.873540]  i915_gem_fini+0x91/0x100 [i915]
[  397.873588]  i915_driver_unload+0xd2/0x110 [i915]
[  397.873638]  i915_pci_remove+0x19/0x30 [i915]
[  397.873646]  pci_device_remove+0x36/0xb0
[  397.873653]  device_release_driver_internal+0x185/0x250
[  397.873660]  driver_detach+0x35/0x70
[  397.873668]  bus_remove_driver+0x53/0xd0
[  397.873675]  pci_unregister_driver+0x25/0xa0
[  397.873683]  __se_sys_delete_module+0x162/0x210
[  397.873691]  ? do_syscall_64+0xd/0x190
[  397.873697]  do_syscall_64+0x55/0x190
[  397.873704]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  397.873710] RIP: 0033:0x7faf884231b7
[  397.873714] Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48
[  397.873775] RSP: 002b:00007ffda4e98cf8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  397.873784] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faf884231b7
[  397.873790] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055fbb18f1bd8
[  397.873796] RBP: 000055fbb18f1b70 R08: 000055fbb18f1bdc R09: 00007ffda4e98d38
[  397.873802] R10: 00007ffda4e97cf4 R11: 0000000000000206 R12: 000055fbb0d32470
[  397.873808] R13: 00007ffda4e992e0 R14: 0000000000000000 R15: 0000000000000000

v2: It's use-after-free; not a NULL pointer.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/intel_guc.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 846d693ecb53..3082d7670f05 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -128,13 +128,15 @@ static int guc_init_wq(struct intel_guc *guc)
 
 static void guc_fini_wq(struct intel_guc *guc)
 {
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct workqueue_struct *wq;
 
-	if (HAS_LOGICAL_RING_PREEMPTION(dev_priv) &&
-	    USES_GUC_SUBMISSION(dev_priv))
-		destroy_workqueue(guc->preempt_wq);
+	wq = fetch_and_zero(&guc->preempt_wq);
+	if (wq)
+		destroy_workqueue(wq);
 
-	destroy_workqueue(guc->log.relay.flush_wq);
+	wq = fetch_and_zero(&guc->log.relay.flush_wq);
+	if (wq)
+		destroy_workqueue(wq);
 }
 
 int intel_guc_init_misc(struct intel_guc *guc)
-- 
2.18.0
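
Editorial sketch (not part of the patch): in the oops above, RAX holds
6b6b6b6b6b6b6b6b, which matches the slab free-poison byte 0x6b
(POISON_FREE), consistent with the v2 note that this is a use-after-free
rather than a NULL dereference. The userspace model below illustrates, under
simplifying assumptions, why reading and clearing the pointer before
destroying makes the teardown safe to call twice (once on the load-abort
path, once on unload). All names here (fake_wq, fake_guc, take_wq, and so
on) are hypothetical stand-ins, and take_wq() is a plain-function analogue
of the driver's fetch_and_zero() macro, not its actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct workqueue_struct. */
struct fake_wq {
	int id;
};

/* Hypothetical stand-in for the two workqueue pointers kept in the GuC state. */
struct fake_guc {
	struct fake_wq *preempt_wq;
	struct fake_wq *flush_wq;
};

/* Counts how many queues were actually destroyed. */
int destroy_calls;

struct fake_wq *fake_alloc_wq(int id)
{
	struct fake_wq *wq = malloc(sizeof(*wq));

	if (wq)
		wq->id = id;
	return wq;
}

void fake_destroy_wq(struct fake_wq *wq)
{
	assert(wq != NULL);	/* callers must filter out empty slots */
	destroy_calls++;
	free(wq);
}

/* Return the current pointer and leave NULL behind in the slot. */
struct fake_wq *take_wq(struct fake_wq **slot)
{
	struct fake_wq *wq = *slot;

	*slot = NULL;
	return wq;
}

/* Mirrors the shape of the patched guc_fini_wq(): skip empty slots. */
void fake_guc_fini_wq(struct fake_guc *guc)
{
	struct fake_wq *wq;

	wq = take_wq(&guc->preempt_wq);
	if (wq)
		fake_destroy_wq(wq);

	wq = take_wq(&guc->flush_wq);
	if (wq)
		fake_destroy_wq(wq);
}
```

Calling fake_guc_fini_wq() a second time on the same structure finds both
slots already NULL and does nothing, whereas an unconditional destroy would
hand a stale pointer to the destructor.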

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 2/3] drm/i915: Restore sane defaults for KMS on GEM error load
  2018-07-26  8:50 [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort Chris Wilson
@ 2018-07-26  8:50 ` Chris Wilson
  2018-07-26  8:50 ` [PATCH 3/3] drm/i915: Don't disable the GPU for older gen on wedging Chris Wilson
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2018-07-26  8:50 UTC (permalink / raw)
  To: intel-gfx

If we fail during GEM initialisation, we scrub the HW state by
performing a device level GPU reset. However, we want to leave the
system in a usable state (with functioning KMS but no GEM) so after
scrubbing the HW state, we need to restore some sane defaults and
re-enable the low-level common parts of the GPU (such as the GMCH).

v2: Restore GTT entries.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a4031fab57b0..d9cc4820d224 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5598,6 +5598,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		i915_gem_cleanup_userptr(dev_priv);
 
 	if (ret == -EIO) {
+		mutex_lock(&dev_priv->drm.struct_mutex);
+
 		/*
 		 * Allow engine initialisation to fail by marking the GPU as
 		 * wedged. But we only want to do this where the GPU is angry,
@@ -5608,7 +5610,14 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 					"Failed to initialize GPU, declaring it wedged!\n");
 			i915_gem_set_wedged(dev_priv);
 		}
-		ret = 0;
+
+		/* Minimal basic recovery for KMS */
+		ret = i915_ggtt_enable_hw(dev_priv);
+		i915_gem_restore_gtt_mappings(dev_priv);
+		i915_gem_restore_fences(dev_priv);
+		intel_init_clock_gating(dev_priv);
+
+		mutex_unlock(&dev_priv->drm.struct_mutex);
 	}
 
 	i915_gem_drain_freed_objects(dev_priv);
-- 
2.18.0


* [PATCH 3/3] drm/i915: Don't disable the GPU for older gen on wedging
  2018-07-26  8:50 [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort Chris Wilson
  2018-07-26  8:50 ` [PATCH 2/3] drm/i915: Restore sane defaults for KMS on GEM error load Chris Wilson
@ 2018-07-26  8:50 ` Chris Wilson
  2018-07-26  9:43 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Protect guc_fini_wq() against module load abort Patchwork
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2018-07-26  8:50 UTC (permalink / raw)
  To: intel-gfx

If we issue a device level GPU reset on older generations (gen4 and
earlier, per the check below), it will disable key components of the GMCH
and the display engine. The purpose of wedging is simply to prevent further
GEM usage without disabling KMS, so we need to be careful about when we
issue the reset on wedging.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9cc4820d224..1c82d5a61c7e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3329,7 +3329,8 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
 	i915->caps.scheduler = 0;
 
 	/* Even if the GPU reset fails, it should still stop the engines */
-	intel_gpu_reset(i915, ALL_ENGINES);
+	if (INTEL_GEN(i915) >= 5)
+		intel_gpu_reset(i915, ALL_ENGINES);
 
 	/*
 	 * Make sure no one is running the old callback before we proceed with
-- 
2.18.0
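
Editorial sketch (not part of the patch): a minimal userspace model of the
generation gate added above. It only illustrates that wedging still happens
on every generation while the full reset is skipped below gen5. The struct
and function names are hypothetical, and the real i915_gem_set_wedged() does
considerably more than this.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device state touched by wedging. */
struct fake_gpu {
	int gen;		/* hardware generation, as INTEL_GEN() would report */
	bool wedged;		/* GEM submission disabled */
	int reset_calls;	/* how many full-device resets were issued */
};

/* Stand-in for a device level reset of all engines. */
void fake_gpu_reset(struct fake_gpu *gpu)
{
	gpu->reset_calls++;
}

/* Wedge the GPU; only issue the device reset where it leaves KMS usable. */
void fake_set_wedged(struct fake_gpu *gpu)
{
	if (gpu->gen >= 5)
		fake_gpu_reset(gpu);

	gpu->wedged = true;
}
```

With gen set to 4 the model ends up wedged with no reset issued; with gen
set to 5 or later it is wedged after exactly one reset.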


* ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Protect guc_fini_wq() against module load abort
  2018-07-26  8:50 [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort Chris Wilson
  2018-07-26  8:50 ` [PATCH 2/3] drm/i915: Restore sane defaults for KMS on GEM error load Chris Wilson
  2018-07-26  8:50 ` [PATCH 3/3] drm/i915: Don't disable the GPU for older gen on wedging Chris Wilson
@ 2018-07-26  9:43 ` Patchwork
  2018-07-26 10:32 ` ✓ Fi.CI.IGT: " Patchwork
  2018-07-26 11:44 ` [PATCH 1/3] " Michał Winiarski
  4 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2018-07-26  9:43 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Protect guc_fini_wq() against module load abort
URL   : https://patchwork.freedesktop.org/series/47272/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4548 -> Patchwork_9777 =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_9777 need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9777, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/47272/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9777:

  === IGT changes ===

    ==== Warnings ====

    igt@drv_selftest@live_gtt:
      fi-kbl-guc:         SKIP -> PASS +28

    igt@drv_selftest@mock_timelines:
      fi-cfl-guc:         SKIP -> PASS +28

    
== Known issues ==

  Here are the changes found in Patchwork_9777 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_selftest@live_guc:
      fi-cfl-guc:         SKIP -> DMESG-WARN (fdo#107258)
      fi-kbl-guc:         SKIP -> DMESG-WARN (fdo#107258)

    
    ==== Possible fixes ====

    igt@drv_module_reload@basic-reload:
      fi-glk-j4005:       DMESG-WARN (fdo#106248, fdo#106725) -> PASS

    igt@drv_module_reload@basic-reload-inject:
      fi-kbl-guc:         DMESG-FAIL -> PASS
      fi-bwr-2160:        INCOMPLETE -> PASS
      fi-gdg-551:         INCOMPLETE -> PASS
      fi-glk-j4005:       DMESG-WARN (fdo#105719) -> PASS
      fi-cfl-guc:         DMESG-FAIL -> PASS
      fi-blb-e6850:       INCOMPLETE -> PASS

    igt@drv_selftest@live_workarounds:
      {fi-bsw-kefka}:     DMESG-FAIL (fdo#107292) -> PASS

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-glk-j4005:       FAIL (fdo#100368) -> PASS

    {igt@kms_psr@primary_mmap_gtt}:
      fi-cnl-psr:         DMESG-WARN (fdo#107372) -> PASS

    
    ==== Warnings ====

    igt@drv_selftest@live_workarounds:
      fi-cnl-psr:         DMESG-WARN (fdo#105395) -> DMESG-FAIL (fdo#107292)

    
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#105395 https://bugs.freedesktop.org/show_bug.cgi?id=105395
  fdo#105719 https://bugs.freedesktop.org/show_bug.cgi?id=105719
  fdo#106248 https://bugs.freedesktop.org/show_bug.cgi?id=106248
  fdo#106725 https://bugs.freedesktop.org/show_bug.cgi?id=106725
  fdo#107258 https://bugs.freedesktop.org/show_bug.cgi?id=107258
  fdo#107292 https://bugs.freedesktop.org/show_bug.cgi?id=107292
  fdo#107372 https://bugs.freedesktop.org/show_bug.cgi?id=107372


== Participating hosts (51 -> 43) ==

  Missing    (8): fi-ilk-m540 fi-hsw-4200u fi-skl-guc fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


== Build changes ==

    * Linux: CI_DRM_4548 -> Patchwork_9777

  CI_DRM_4548: 1ccdb8d0bf55621006a4ac04e8e5e964480382ef @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4576: bcb37a9b20eeec97f15fac2222408cc2e0b77631 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9777: ede9acf7135daa0ccf9720edf079f3510b2178fc @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ede9acf7135d drm/i915: Don't disable the GPU for older gen on wedging
763456b885f4 drm/i915: Restore sane defaults for KMS on GEM error load
f0c7645b883b drm/i915: Protect guc_fini_wq() against module load abort

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9777/issues.html

* ✓ Fi.CI.IGT: success for series starting with [1/3] drm/i915: Protect guc_fini_wq() against module load abort
  2018-07-26  8:50 [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort Chris Wilson
                   ` (2 preceding siblings ...)
  2018-07-26  9:43 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Protect guc_fini_wq() against module load abort Patchwork
@ 2018-07-26 10:32 ` Patchwork
  2018-07-26 11:44 ` [PATCH 1/3] " Michał Winiarski
  4 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2018-07-26 10:32 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Protect guc_fini_wq() against module load abort
URL   : https://patchwork.freedesktop.org/series/47272/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4548_full -> Patchwork_9777_full =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_9777_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9777_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9777_full:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_mocs_settings@mocs-rc6-bsd2:
      shard-kbl:          PASS -> SKIP

    
== Known issues ==

  Here are the changes found in Patchwork_9777_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes:
      shard-kbl:          PASS -> INCOMPLETE (fdo#103665)

    igt@kms_setmode@basic:
      shard-glk:          PASS -> FAIL (fdo#99912)

    igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend:
      shard-glk:          PASS -> FAIL (fdo#103375)

    
    ==== Possible fixes ====

    igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
      shard-hsw:          FAIL (fdo#102887) -> PASS

    igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes:
      shard-apl:          FAIL (fdo#103375) -> PASS

    
  fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
  fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4548 -> Patchwork_9777

  CI_DRM_4548: 1ccdb8d0bf55621006a4ac04e8e5e964480382ef @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4576: bcb37a9b20eeec97f15fac2222408cc2e0b77631 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9777: ede9acf7135daa0ccf9720edf079f3510b2178fc @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9777/shards.html

* Re: [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort
  2018-07-26  8:50 [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort Chris Wilson
                   ` (3 preceding siblings ...)
  2018-07-26 10:32 ` ✓ Fi.CI.IGT: " Patchwork
@ 2018-07-26 11:44 ` Michał Winiarski
  2018-07-26 12:21   ` Chris Wilson
  4 siblings, 1 reply; 7+ messages in thread
From: Michał Winiarski @ 2018-07-26 11:44 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Thu, Jul 26, 2018 at 09:50:31AM +0100, Chris Wilson wrote:
> Prevent
> [oops trace snipped, identical to the original posting above]
> 
> v2: It's use-after-free; not a NULL pointer.
> 
> Testcase: igt/drv_module_reload/basic-reload-inject
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>

Whole series:

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

-Michał

> ---
>  drivers/gpu/drm/i915/intel_guc.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)

* Re: [PATCH 1/3] drm/i915: Protect guc_fini_wq() against module load abort
  2018-07-26 11:44 ` [PATCH 1/3] " Michał Winiarski
@ 2018-07-26 12:21   ` Chris Wilson
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2018-07-26 12:21 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

Quoting Michał Winiarski (2018-07-26 12:44:08)
> On Thu, Jul 26, 2018 at 09:50:31AM +0100, Chris Wilson wrote:
> > Prevent
> > [oops trace snipped, identical to the original posting above]
> > 
> > v2: It's use-after-free; not a NULL pointer.
> > 
> > Testcase: igt/drv_module_reload/basic-reload-inject
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> 
> Whole series:
> 
> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

And with that, the band aids should be in place. Hopefully the code will
heal miraculously as we take them off again...
-Chris

