* Oops with i915
From: Sudip Mukherjee @ 2018-06-07 10:06 UTC
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi; +Cc: intel-gfx

Hi All,

We are running the v4.14.47 kernel, and in one of our recent test cycles
we saw the trace below. I know this is not the usual way to raise a
bug report, but since it was seen only once, in one of the automated
test cycles, I do not have anything else apart from this trace.
Is this a known issue? I would appreciate any help in understanding what
the problem might be.

[ 1176.909543] BUG: unable to handle kernel paging request at 8298fb0a
[ 1176.916565] IP: queued_spin_lock_slowpath+0xfc/0x142
[ 1176.922111] *pdpt = 000000003367a001 *pde = 0000000000000000
[ 1176.928534] Oops: 0002 [#1] PREEMPT SMP
[ 1177.002434] CPU: 2 PID: 24688 Comm: kworker/u8:4 Tainted: G     U     O    4.14.47-20180606-a6b8390e8cc1de032b8314d1a5b193fe9e21f325 #1
[ 1177.024120] Workqueue: events_unbound intel_atomic_commit_work
[ 1177.030630] task: ef2ee200 task.stack: efbf4000
[ 1177.035685] EIP: queued_spin_lock_slowpath+0xfc/0x142
[ 1177.041327] EFLAGS: 00010087 CPU: 2
[ 1177.045212] EAX: 8298fb0a EBX: 00003ba0 ECX: ee82489c EDX: f4656fc0
[ 1177.052215] ESI: 000c0000 EDI: 00000001 EBP: efbf5e88 ESP: efbf5e78
[ 1177.059217]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 1177.065239] CR0: 80050033 CR2: 8298fb0a CR3: 2e8ed320 CR4: 001006f0
[ 1177.072240] Call Trace:
[ 1177.074973]  _raw_spin_lock_irqsave+0x28/0x2d
[ 1177.079840]  complete_all+0x12/0x36
[ 1177.083737]  drm_atomic_helper_commit_hw_done+0x3c/0x43
[ 1177.089576]  intel_atomic_commit_tail+0xa5f/0xbd9
[ 1177.094832]  ? wait_woken+0x5a/0x5a
[ 1177.098727]  ? wait_woken+0x5a/0x5a
[ 1177.102622]  intel_atomic_commit_work+0xb/0xd
[ 1177.107489]  ? intel_atomic_commit_work+0xb/0xd
[ 1177.112551]  process_one_work+0x109/0x1ee
[ 1177.117029]  worker_thread+0x1a4/0x257
[ 1177.121215]  kthread+0xee/0xf3
[ 1177.124625]  ? rescuer_thread+0x207/0x207
[ 1177.129103]  ? kthread_create_on_node+0x1a/0x1a
[ 1177.134165]  ret_from_fork+0x2e/0x38
[ 1177.138156] Code: 12 09 de 89 f0 89 75 f0 c1 e8 10 66 87 41 02 89 c3 c1 e3 10 74 51 83 e0 03 c1 eb 12 6b c0 0c 05 c0 1f 7e c1 03 04 9d d8 b1 6c c1 <89> 10 8b 42 04 85 c0 75 04 f3 90 eb f5 8b 1a 85 db 74 03 0f 0d
[ 1177.159204] EIP: queued_spin_lock_slowpath+0xfc/0x142 SS:ESP: 0068:efbf5e78
[ 1177.166983] CR2: 000000008298fb0a
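
For readers decoding the trace: the value after "Oops:" is the x86
page-fault error code. The sketch below decodes its low five bits. The
function name and dictionary keys are illustrative assumptions, not
kernel identifiers, and the register values are copied from the trace.

```python
def decode_x86_pf_error(code: int) -> dict:
    """Decode the low five bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x01),       # 0 means the page was not mapped
        "write": bool(code & 0x02),              # 1 means the access was a write
        "user_mode": bool(code & 0x04),          # 0 means the fault came from kernel mode
        "reserved_bit": bool(code & 0x08),       # reserved bit set in a paging entry
        "instruction_fetch": bool(code & 0x10),  # fault on an instruction fetch
    }


# "Oops: 0002" from the trace.
info = decode_x86_pf_error(0x0002)

# CR2 holds the faulting address; EAX is copied from the register dump.
cr2 = 0x8298FB0A
eax = 0x8298FB0A
fault_is_store_through_eax = info["write"] and cr2 == eax
```

Read this way, 0002 is a kernel-mode write to a page that is not mapped,
which agrees with "*pde = 0000000000000000". The marked bytes "<89> 10"
in the Code: line encode "mov %edx,(%eax)", and EAX equals CR2, so the
faulting access is a store through the pointer held in EAX.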


--
Regards
Sudip
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: Oops with i915
From: Sudip Mukherjee @ 2018-06-18  9:39 UTC
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi; +Cc: intel-gfx

On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> Hi All,
> 
> We are running the v4.14.47 kernel, and in one of our recent test cycles
> we saw the trace below. I know this is not the usual way to raise a
> bug report, but since it was seen only once, in one of the automated
> test cycles, I do not have anything else apart from this trace.
> Is this a known issue? I would appreciate any help in understanding what
> the problem might be.
> 
> [ 1176.909543] BUG: unable to handle kernel paging request at 8298fb0a
> [ 1176.916565] IP: queued_spin_lock_slowpath+0xfc/0x142
> [ 1176.922111] *pdpt = 000000003367a001 *pde = 0000000000000000
> [ 1176.928534] Oops: 0002 [#1] PREEMPT SMP
> [ 1177.002434] CPU: 2 PID: 24688 Comm: kworker/u8:4 Tainted: G     U     O    4.14.47-20180606-a6b8390e8cc1de032b8314d1a5b193fe9e21f325 #1
> [ 1177.024120] Workqueue: events_unbound intel_atomic_commit_work
> [ 1177.030630] task: ef2ee200 task.stack: efbf4000
> [ 1177.035685] EIP: queued_spin_lock_slowpath+0xfc/0x142
> [ 1177.041327] EFLAGS: 00010087 CPU: 2
> [ 1177.045212] EAX: 8298fb0a EBX: 00003ba0 ECX: ee82489c EDX: f4656fc0
> [ 1177.052215] ESI: 000c0000 EDI: 00000001 EBP: efbf5e88 ESP: efbf5e78
> [ 1177.059217]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 1177.065239] CR0: 80050033 CR2: 8298fb0a CR3: 2e8ed320 CR4: 001006f0
> [ 1177.072240] Call Trace:
> [ 1177.074973]  _raw_spin_lock_irqsave+0x28/0x2d
> [ 1177.079840]  complete_all+0x12/0x36
> [ 1177.083737]  drm_atomic_helper_commit_hw_done+0x3c/0x43
> [ 1177.089576]  intel_atomic_commit_tail+0xa5f/0xbd9
> [ 1177.094832]  ? wait_woken+0x5a/0x5a
> [ 1177.098727]  ? wait_woken+0x5a/0x5a
> [ 1177.102622]  intel_atomic_commit_work+0xb/0xd
> [ 1177.107489]  ? intel_atomic_commit_work+0xb/0xd
> [ 1177.112551]  process_one_work+0x109/0x1ee
> [ 1177.117029]  worker_thread+0x1a4/0x257
> [ 1177.121215]  kthread+0xee/0xf3
> [ 1177.124625]  ? rescuer_thread+0x207/0x207
> [ 1177.129103]  ? kthread_create_on_node+0x1a/0x1a
> [ 1177.134165]  ret_from_fork+0x2e/0x38
> [ 1177.138156] Code: 12 09 de 89 f0 89 75 f0 c1 e8 10 66 87 41 02 89 c3 c1 e3 10 74 51 83 e0 03 c1 eb 12 6b c0 0c 05 c0 1f 7e c1 03 04 9d d8 b1 6c c1 <89> 10 8b 42 04 85 c0 75 04 f3 90 eb f5 8b 1a 85 db 74 03 0f 0d
> [ 1177.159204] EIP: queued_spin_lock_slowpath+0xfc/0x142 SS:ESP: 0068:efbf5e78
> [ 1177.166983] CR2: 000000008298fb0a

A gentle ping on this issue. Can anyone please help me with this?

--
Regards
Sudip

* Re: Oops with i915
From: Ville Syrjälä @ 2018-06-18 12:09 UTC
  To: Sudip Mukherjee; +Cc: intel-gfx, Rodrigo Vivi

On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> Hi All,
> 
> We are running the v4.14.47 kernel, and in one of our recent test cycles
> we saw the trace below. I know this is not the usual way to raise a
> bug report, but since it was seen only once, in one of the automated
> test cycles, I do not have anything else apart from this trace.
> Is this a known issue? I would appreciate any help in understanding what
> the problem might be.
> 
> [ 1176.909543] BUG: unable to handle kernel paging request at 8298fb0a
> [ 1176.916565] IP: queued_spin_lock_slowpath+0xfc/0x142
> [ 1176.922111] *pdpt = 000000003367a001 *pde = 0000000000000000
> [ 1176.928534] Oops: 0002 [#1] PREEMPT SMP
> [ 1177.002434] CPU: 2 PID: 24688 Comm: kworker/u8:4 Tainted: G     U     O    4.14.47-20180606-a6b8390e8cc1de032b8314d1a5b193fe9e21f325 #1
> [ 1177.024120] Workqueue: events_unbound intel_atomic_commit_work
> [ 1177.030630] task: ef2ee200 task.stack: efbf4000
> [ 1177.035685] EIP: queued_spin_lock_slowpath+0xfc/0x142
> [ 1177.041327] EFLAGS: 00010087 CPU: 2
> [ 1177.045212] EAX: 8298fb0a EBX: 00003ba0 ECX: ee82489c EDX: f4656fc0
> [ 1177.052215] ESI: 000c0000 EDI: 00000001 EBP: efbf5e88 ESP: efbf5e78
> [ 1177.059217]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 1177.065239] CR0: 80050033 CR2: 8298fb0a CR3: 2e8ed320 CR4: 001006f0
> [ 1177.072240] Call Trace:
> [ 1177.074973]  _raw_spin_lock_irqsave+0x28/0x2d
> [ 1177.079840]  complete_all+0x12/0x36
> [ 1177.083737]  drm_atomic_helper_commit_hw_done+0x3c/0x43
> [ 1177.089576]  intel_atomic_commit_tail+0xa5f/0xbd9
> [ 1177.094832]  ? wait_woken+0x5a/0x5a
> [ 1177.098727]  ? wait_woken+0x5a/0x5a
> [ 1177.102622]  intel_atomic_commit_work+0xb/0xd
> [ 1177.107489]  ? intel_atomic_commit_work+0xb/0xd
> [ 1177.112551]  process_one_work+0x109/0x1ee
> [ 1177.117029]  worker_thread+0x1a4/0x257
> [ 1177.121215]  kthread+0xee/0xf3
> [ 1177.124625]  ? rescuer_thread+0x207/0x207
> [ 1177.129103]  ? kthread_create_on_node+0x1a/0x1a
> [ 1177.134165]  ret_from_fork+0x2e/0x38
> [ 1177.138156] Code: 12 09 de 89 f0 89 75 f0 c1 e8 10 66 87 41 02 89 c3 c1 e3 10 74 51 83 e0 03 c1 eb 12 6b c0 0c 05 c0 1f 7e c1 03 04 9d d8 b1 6c c1 <89> 10 8b 42 04 85 c0 75 04 f3 90 eb f5 8b 1a 85 db 74 03 0f 0d
> [ 1177.159204] EIP: queued_spin_lock_slowpath+0xfc/0x142 SS:ESP: 0068:efbf5e78
> [ 1177.166983] CR2: 000000008298fb0a

Presumably a use after free in atomic. Possibly 21a01abbe32a
("drm/atomic: Fix freeing connector/plane state too early by tracking
commits, v3.") But there may have been other similar fixes.
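
One hedged way to see why the trace is consistent with a use after free
is to decode the queued spinlock tail values visible in the register
dump. The sketch below assumes the qspinlock layout used when NR_CPUS is
below 16K (tail index in bits 16-17, tail CPU plus one in bits 18-31),
and it assumes, from a reading of the Code: bytes rather than anything
stated in this thread, that ESI holds the tail this CPU published and
EBX holds the previous tail already shifted right by 18. The helper name
is hypothetical.

```python
def decode_tail(lock_val: int) -> tuple:
    """Split a 32-bit qspinlock value into (cpu, idx) under the assumed layout."""
    idx = (lock_val >> 16) & 0x3   # assumed: per-CPU MCS node index, bits 16-17
    cpu = (lock_val >> 18) - 1     # assumed: CPU number stored as cpu + 1, bits 18-31
    return cpu, idx


# ESI from the trace, read as the tail written by the faulting CPU.
new_cpu, new_idx = decode_tail(0x000C0000)

# EBX from the trace, read as the previous tail's CPU field (already >> 18).
old_cpu = 0x3BA0 - 1
```

Under those assumptions the new tail decodes to CPU 2, matching "CPU: 2"
in the trace, while the previous tail names a CPU number in the
thousands. A value like that suggests the lock word no longer held valid
spinlock state when complete_all() tried to take it, which is what
freed and reused memory would look like.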

-- 
Ville Syrjälä
Intel

* Re: Oops with i915
From: Sudip Mukherjee @ 2018-06-18 12:29 UTC
  To: Ville Syrjälä; +Cc: intel-gfx, Rodrigo Vivi

Hi Ville,

On Mon, Jun 18, 2018 at 03:09:15PM +0300, Ville Syrjälä wrote:
> On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> > Hi All,
> > 
> > We are running the v4.14.47 kernel, and in one of our recent test cycles
> > we saw the trace below. I know this is not the usual way to raise a
> > bug report, but since it was seen only once, in one of the automated
> > test cycles, I do not have anything else apart from this trace.
> > Is this a known issue? I would appreciate any help in understanding what
> > the problem might be.
> > 
> > [ 1176.909543] BUG: unable to handle kernel paging request at 8298fb0a
> > [ 1176.916565] IP: queued_spin_lock_slowpath+0xfc/0x142
> > [ 1176.922111] *pdpt = 000000003367a001 *pde = 0000000000000000
> > [ 1176.928534] Oops: 0002 [#1] PREEMPT SMP
> > [ 1177.002434] CPU: 2 PID: 24688 Comm: kworker/u8:4 Tainted: G     U     O    4.14.47-20180606-a6b8390e8cc1de032b8314d1a5b193fe9e21f325 #1
> > [ 1177.024120] Workqueue: events_unbound intel_atomic_commit_work
> > [ 1177.030630] task: ef2ee200 task.stack: efbf4000
> > [ 1177.035685] EIP: queued_spin_lock_slowpath+0xfc/0x142
> > [ 1177.041327] EFLAGS: 00010087 CPU: 2
> > [ 1177.045212] EAX: 8298fb0a EBX: 00003ba0 ECX: ee82489c EDX: f4656fc0
> > [ 1177.052215] ESI: 000c0000 EDI: 00000001 EBP: efbf5e88 ESP: efbf5e78
> > [ 1177.059217]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > [ 1177.065239] CR0: 80050033 CR2: 8298fb0a CR3: 2e8ed320 CR4: 001006f0
> > [ 1177.072240] Call Trace:
> > [ 1177.074973]  _raw_spin_lock_irqsave+0x28/0x2d
> > [ 1177.079840]  complete_all+0x12/0x36
> > [ 1177.083737]  drm_atomic_helper_commit_hw_done+0x3c/0x43
> > [ 1177.089576]  intel_atomic_commit_tail+0xa5f/0xbd9
> > [ 1177.094832]  ? wait_woken+0x5a/0x5a
> > [ 1177.098727]  ? wait_woken+0x5a/0x5a
> > [ 1177.102622]  intel_atomic_commit_work+0xb/0xd
> > [ 1177.107489]  ? intel_atomic_commit_work+0xb/0xd
> > [ 1177.112551]  process_one_work+0x109/0x1ee
> > [ 1177.117029]  worker_thread+0x1a4/0x257
> > [ 1177.121215]  kthread+0xee/0xf3
> > [ 1177.124625]  ? rescuer_thread+0x207/0x207
> > [ 1177.129103]  ? kthread_create_on_node+0x1a/0x1a
> > [ 1177.134165]  ret_from_fork+0x2e/0x38
> > [ 1177.138156] Code: 12 09 de 89 f0 89 75 f0 c1 e8 10 66 87 41 02 89 c3 c1 e3 10 74 51 83 e0 03 c1 eb 12 6b c0 0c 05 c0 1f 7e c1 03 04 9d d8 b1 6c c1 <89> 10 8b 42 04 85 c0 75 04 f3 90 eb f5 8b 1a 85 db 74 03 0f 0d
> > [ 1177.159204] EIP: queued_spin_lock_slowpath+0xfc/0x142 SS:ESP: 0068:efbf5e78
> > [ 1177.166983] CR2: 000000008298fb0a
> 
> Presumably a use after free in atomic. Possibly 21a01abbe32a
> ("drm/atomic: Fix freeing connector/plane state too early by tracking
> commits, v3.") But there may have been other similar fixes.

Thanks for your reply. I also thought so, as the stack trace showed it was
using invalid memory for the old_state. So I applied:
21a01abbe32a ("drm/atomic: Fix freeing connector/plane state too early by tracking commits, v3.")
on top of v4.14.47. It also needed:
1) f46640b931e5 ("drm/atomic: Return commit in drm_crtc_commit_get for better annotation")
2) 163bcc2c74a2 ("drm/atomic: Move drm_crtc_commit to drm_crtc_state, v4.")

to apply cleanly. But after that the occurrence rate increased.
Did I miss something else?
I would appreciate your help in finding a fix for this.

--
Regards
Sudip

* Re: Oops with i915
From: Ville Syrjälä @ 2018-06-18 12:34 UTC
  To: Sudip Mukherjee; +Cc: intel-gfx, Rodrigo Vivi

On Mon, Jun 18, 2018 at 01:29:02PM +0100, Sudip Mukherjee wrote:
> Hi Ville,
> 
> On Mon, Jun 18, 2018 at 03:09:15PM +0300, Ville Syrjälä wrote:
> > On Thu, Jun 07, 2018 at 11:06:33AM +0100, Sudip Mukherjee wrote:
> > > Hi All,
> > > 
> > > We are running the v4.14.47 kernel, and in one of our recent test cycles
> > > we saw the trace below. I know this is not the usual way to raise a
> > > bug report, but since it was seen only once, in one of the automated
> > > test cycles, I do not have anything else apart from this trace.
> > > Is this a known issue? I would appreciate any help in understanding what
> > > the problem might be.
> > > 
> > > [ 1176.909543] BUG: unable to handle kernel paging request at 8298fb0a
> > > [ 1176.916565] IP: queued_spin_lock_slowpath+0xfc/0x142
> > > [ 1176.922111] *pdpt = 000000003367a001 *pde = 0000000000000000
> > > [ 1176.928534] Oops: 0002 [#1] PREEMPT SMP
> > > [ 1177.002434] CPU: 2 PID: 24688 Comm: kworker/u8:4 Tainted: G     U     O    4.14.47-20180606-a6b8390e8cc1de032b8314d1a5b193fe9e21f325 #1
> > > [ 1177.024120] Workqueue: events_unbound intel_atomic_commit_work
> > > [ 1177.030630] task: ef2ee200 task.stack: efbf4000
> > > [ 1177.035685] EIP: queued_spin_lock_slowpath+0xfc/0x142
> > > [ 1177.041327] EFLAGS: 00010087 CPU: 2
> > > [ 1177.045212] EAX: 8298fb0a EBX: 00003ba0 ECX: ee82489c EDX: f4656fc0
> > > [ 1177.052215] ESI: 000c0000 EDI: 00000001 EBP: efbf5e88 ESP: efbf5e78
> > > [ 1177.059217]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > > [ 1177.065239] CR0: 80050033 CR2: 8298fb0a CR3: 2e8ed320 CR4: 001006f0
> > > [ 1177.072240] Call Trace:
> > > [ 1177.074973]  _raw_spin_lock_irqsave+0x28/0x2d
> > > [ 1177.079840]  complete_all+0x12/0x36
> > > [ 1177.083737]  drm_atomic_helper_commit_hw_done+0x3c/0x43
> > > [ 1177.089576]  intel_atomic_commit_tail+0xa5f/0xbd9
> > > [ 1177.094832]  ? wait_woken+0x5a/0x5a
> > > [ 1177.098727]  ? wait_woken+0x5a/0x5a
> > > [ 1177.102622]  intel_atomic_commit_work+0xb/0xd
> > > [ 1177.107489]  ? intel_atomic_commit_work+0xb/0xd
> > > [ 1177.112551]  process_one_work+0x109/0x1ee
> > > [ 1177.117029]  worker_thread+0x1a4/0x257
> > > [ 1177.121215]  kthread+0xee/0xf3
> > > [ 1177.124625]  ? rescuer_thread+0x207/0x207
> > > [ 1177.129103]  ? kthread_create_on_node+0x1a/0x1a
> > > [ 1177.134165]  ret_from_fork+0x2e/0x38
> > > [ 1177.138156] Code: 12 09 de 89 f0 89 75 f0 c1 e8 10 66 87 41 02 89 c3 c1 e3 10 74 51 83 e0 03 c1 eb 12 6b c0 0c 05 c0 1f 7e c1 03 04 9d d8 b1 6c c1 <89> 10 8b 42 04 85 c0 75 04 f3 90 eb f5 8b 1a 85 db 74 03 0f 0d
> > > [ 1177.159204] EIP: queued_spin_lock_slowpath+0xfc/0x142 SS:ESP: 0068:efbf5e78
> > > [ 1177.166983] CR2: 000000008298fb0a
> > 
> > Presumably a use after free in atomic. Possibly 21a01abbe32a
> > ("drm/atomic: Fix freeing connector/plane state too early by tracking
> > commits, v3.") But there may have been other similar fixes.
> 
> Thanks for your reply. I also thought so, as the stack trace showed it was
> using invalid memory for the old_state. So I applied:
> 21a01abbe32a ("drm/atomic: Fix freeing connector/plane state too early by tracking commits, v3.")
> on top of v4.14.47. It also needed:
> 1) f46640b931e5 ("drm/atomic: Return commit in drm_crtc_commit_get for better annotation")
> 2) 163bcc2c74a2 ("drm/atomic: Move drm_crtc_commit to drm_crtc_state, v4.")
> 
> to apply cleanly. But after that the occurrence rate increased.
> Did I miss something else?

No idea. I suggest a reverse bisect to find out when it got fixed in
upstream.
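
A reverse bisect is an ordinary bisection with the roles swapped: the
older end of the range is known to be broken, the newer end is known to
be fixed, and the search looks for the first fixed commit. With git this
is usually expressed through custom terms, for example
"git bisect start --term-old=broken --term-new=fixed". The sketch below
shows only the search logic on a made-up linear history; the commit
names and the is_fixed predicate are hypothetical stand-ins for building
each kernel and running the test cycle long enough to trust the result.

```python
from typing import Callable, Sequence


def first_fixed(commits: Sequence[str], is_fixed: Callable[[str], bool]) -> str:
    """Return the first commit for which is_fixed() is true.

    Assumes commits[0] is broken, commits[-1] is fixed, and there is a
    single broken-to-fixed transition in between.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] broken, commits[hi] fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid   # the fix landed at mid or earlier
        else:
            lo = mid   # still broken here, the fix is later
    return commits[hi]


# Hypothetical history c0..c9 in which the fix lands at c6.
history = [f"c{i}" for i in range(10)]
result = first_fixed(history, lambda c: int(c[1:]) >= 6)
```

Because the oops was intermittent, a "fixed" verdict at any step is only
as reliable as the length of the test run behind it, so each step may
need several cycles before marking a commit.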

-- 
Ville Syrjälä
Intel