linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RT] i915 sleeping function from atomic in gen6_reset_engines()
@ 2023-01-23 16:04 Juri Lelli
  2023-07-10  7:21 ` Juri Lelli
  0 siblings, 1 reply; 4+ messages in thread
From: Juri Lelli @ 2023-01-23 16:04 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, LKML

Hi,

I've just noticed the following while testing v6.2-rc3-rt1.

# lspci -k | grep -EA3 'VGA|3D|Display'
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3 Processor Integrated Graphics Controller (rev 06)
        DeviceName:  Onboard IGD
        Subsystem: Dell Device 0620
        Kernel driver in use: i915
 
---

[   11.960670] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
[   11.960672] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 429, name: systemd-udevd
[   11.960673] preempt_count: 1, expected: 0
[   11.960674] RCU nest depth: 0, expected: 0
[   11.960674] 2 locks held by systemd-udevd/429:
[   11.960675]  #0: ffff8e1f022b71b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0xcb/0x1d0
[   11.960684]  #1: ffff8e1e27d32ed8 (&uncore->lock){+.+.}-{2:2}, at: gen6_reset_engines+0x1b/0x50 [i915]
[   11.960897] Preemption disabled at:
[   11.960898] [<ffffffffc07d5d62>] __intel_gt_reset+0x92/0x100 [i915]
[   11.961024] CPU: 3 PID: 429 Comm: systemd-udevd Not tainted 6.2.0-rc3-rt1 #1
[   11.961025] Hardware name: Dell Inc. PowerEdge T20/0VD5HY, BIOS A03 11/25/2013
[   11.961026] Call Trace:
[   11.961027]  <TASK>
[   11.961029]  ? __intel_gt_reset+0x92/0x100 [i915]
[   11.961155]  dump_stack_lvl+0x57/0x81
[   11.961160]  __might_resched.cold+0xf4/0x12f
[   11.961165]  rt_spin_lock+0x4a/0xe0
[   11.961169]  ? gen6_reset_engines+0x1b/0x50 [i915]
[   11.961294]  ? __pfx_gen6_reset_engines+0x10/0x10 [i915]
[   11.961453]  gen6_reset_engines+0x1b/0x50 [i915]
[   11.961579]  __intel_gt_reset+0xa0/0x100 [i915]
[   11.961708]  i915_driver_mmio_probe+0x126/0x220 [i915]
[   11.961815]  ? intel_gt_probe_all+0x70/0x280 [i915]
[   11.961939]  i915_driver_probe+0xd4/0x3c0 [i915]
[   11.962047]  ? i915_pci_probe+0x87/0x1b0 [i915]
[   11.962160]  local_pci_probe+0x45/0x80
[   11.962167]  pci_call_probe+0x56/0x170
[   11.962173]  pci_device_probe+0x79/0xf0
[   11.962175]  ? driver_sysfs_add+0x73/0xd0
[   11.962179]  really_probe+0xe1/0x390
[   11.962181]  ? pm_runtime_barrier+0x50/0x90
[   11.962187]  __driver_probe_device+0x78/0x180
[   11.962191]  driver_probe_device+0x1e/0x90
[   11.962195]  __driver_attach+0xd6/0x1d0
[   11.962197]  ? __pfx___driver_attach+0x10/0x10
[   11.962201]  bus_for_each_dev+0x7b/0xc0
[   11.962206]  bus_add_driver+0x1b2/0x200
[   11.962212]  driver_register+0x8f/0xf0
[   11.962215]  i915_init+0x20/0x80 [i915]
[   11.962337]  ? __pfx_init_module+0x10/0x10 [i915]
[   11.962461]  do_one_initcall+0x47/0x190
[   11.962467]  ? rcu_read_lock_sched_held+0x43/0x80
[   11.962471]  ? trace_kmalloc+0x3c/0x160
[   11.962473]  ? kmalloc_trace+0x44/0x60
[   11.962476]  do_init_module+0x4c/0x200
[   11.962480]  __do_sys_finit_module+0xb4/0x130
[   11.962497]  do_syscall_64+0x5c/0x90
[   11.962505]  ? do_syscall_64+0x69/0x90
[   11.962507]  ? lockdep_hardirqs_on+0x79/0x100
[   11.962511]  ? do_syscall_64+0x69/0x90
[   11.962513]  ? do_syscall_64+0x69/0x90
[   11.962515]  ? lockdep_hardirqs_on+0x79/0x100
[   11.962518]  ? do_syscall_64+0x69/0x90
[   11.962520]  ? lockdep_hardirqs_on+0x79/0x100
[   11.962523]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   11.962526] RIP: 0033:0x7fa35d020e5d
[   11.962529] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89
                     d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
		     01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
[   11.962530] RSP: 002b:00007ffd62f23a78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   11.962532] RAX: ffffffffffffffda RBX: 000056058a1429e0 RCX: 00007fa35d020e5d
[   11.962533] RDX: 0000000000000000 RSI: 00007fa35d25232c RDI: 0000000000000017
[   11.962534] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
[   11.962534] R10: 0000000000000017 R11: 0000000000000246 R12: 00007fa35d25232c
[   11.962535] R13: 000056058a147360 R14: 0000000000000007 R15: 000056058a12b790
[   11.962548]  </TASK>
[   11.964024] i915 0000:00:02.0: vgaarb: deactivate vga console
[   11.974095] Console: switching to colour dummy device 80x25
[   11.981308] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[   12.041808] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0

---

Looks like this discussion might be relevant [1], but I'm not sure if
progress has been made after that.

Thanks!
Juri

1 - https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RT] i915 sleeping function from atomic in gen6_reset_engines()
  2023-01-23 16:04 [RT] i915 sleeping function from atomic in gen6_reset_engines() Juri Lelli
@ 2023-07-10  7:21 ` Juri Lelli
  2023-07-10 10:09   ` Mike Galbraith
  0 siblings, 1 reply; 4+ messages in thread
From: Juri Lelli @ 2023-07-10  7:21 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, LKML

Hi!

On 23/01/23 17:04, Juri Lelli wrote:
> Hi,
> 
> I've just noticed the following while testing v6.2-rc3-rt1.

I'm still seeing the following on v6.4-rt6.

I believe 20211006164628.s2mtsdd2jdbfyf7g@linutronix.de should cure it,
but I don't think it did go anywhere?

> # lspci -k | grep -EA3 'VGA|3D|Display'
> 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3 Processor Integrated Graphics Controller (rev 06)
>         DeviceName:  Onboard IGD
>         Subsystem: Dell Device 0620
>         Kernel driver in use: i915
>  
> ---
> 
> [   11.960670] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
> [   11.960672] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 429, name: systemd-udevd
> [   11.960673] preempt_count: 1, expected: 0
> [   11.960674] RCU nest depth: 0, expected: 0
> [   11.960674] 2 locks held by systemd-udevd/429:
> [   11.960675]  #0: ffff8e1f022b71b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0xcb/0x1d0
> [   11.960684]  #1: ffff8e1e27d32ed8 (&uncore->lock){+.+.}-{2:2}, at: gen6_reset_engines+0x1b/0x50 [i915]
> [   11.960897] Preemption disabled at:
> [   11.960898] [<ffffffffc07d5d62>] __intel_gt_reset+0x92/0x100 [i915]
> [   11.961024] CPU: 3 PID: 429 Comm: systemd-udevd Not tainted 6.2.0-rc3-rt1 #1
> [   11.961025] Hardware name: Dell Inc. PowerEdge T20/0VD5HY, BIOS A03 11/25/2013
> [   11.961026] Call Trace:
> [   11.961027]  <TASK>
> [   11.961029]  ? __intel_gt_reset+0x92/0x100 [i915]
> [   11.961155]  dump_stack_lvl+0x57/0x81
> [   11.961160]  __might_resched.cold+0xf4/0x12f
> [   11.961165]  rt_spin_lock+0x4a/0xe0
> [   11.961169]  ? gen6_reset_engines+0x1b/0x50 [i915]
> [   11.961294]  ? __pfx_gen6_reset_engines+0x10/0x10 [i915]
> [   11.961453]  gen6_reset_engines+0x1b/0x50 [i915]
> [   11.961579]  __intel_gt_reset+0xa0/0x100 [i915]
> [   11.961708]  i915_driver_mmio_probe+0x126/0x220 [i915]
> [   11.961815]  ? intel_gt_probe_all+0x70/0x280 [i915]
> [   11.961939]  i915_driver_probe+0xd4/0x3c0 [i915]
> [   11.962047]  ? i915_pci_probe+0x87/0x1b0 [i915]
> [   11.962160]  local_pci_probe+0x45/0x80
> [   11.962167]  pci_call_probe+0x56/0x170
> [   11.962173]  pci_device_probe+0x79/0xf0
> [   11.962175]  ? driver_sysfs_add+0x73/0xd0
> [   11.962179]  really_probe+0xe1/0x390
> [   11.962181]  ? pm_runtime_barrier+0x50/0x90
> [   11.962187]  __driver_probe_device+0x78/0x180
> [   11.962191]  driver_probe_device+0x1e/0x90
> [   11.962195]  __driver_attach+0xd6/0x1d0
> [   11.962197]  ? __pfx___driver_attach+0x10/0x10
> [   11.962201]  bus_for_each_dev+0x7b/0xc0
> [   11.962206]  bus_add_driver+0x1b2/0x200
> [   11.962212]  driver_register+0x8f/0xf0
> [   11.962215]  i915_init+0x20/0x80 [i915]
> [   11.962337]  ? __pfx_init_module+0x10/0x10 [i915]
> [   11.962461]  do_one_initcall+0x47/0x190
> [   11.962467]  ? rcu_read_lock_sched_held+0x43/0x80
> [   11.962471]  ? trace_kmalloc+0x3c/0x160
> [   11.962473]  ? kmalloc_trace+0x44/0x60
> [   11.962476]  do_init_module+0x4c/0x200
> [   11.962480]  __do_sys_finit_module+0xb4/0x130
> [   11.962497]  do_syscall_64+0x5c/0x90
> [   11.962505]  ? do_syscall_64+0x69/0x90
> [   11.962507]  ? lockdep_hardirqs_on+0x79/0x100
> [   11.962511]  ? do_syscall_64+0x69/0x90
> [   11.962513]  ? do_syscall_64+0x69/0x90
> [   11.962515]  ? lockdep_hardirqs_on+0x79/0x100
> [   11.962518]  ? do_syscall_64+0x69/0x90
> [   11.962520]  ? lockdep_hardirqs_on+0x79/0x100
> [   11.962523]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [   11.962526] RIP: 0033:0x7fa35d020e5d
> [   11.962529] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89
>                      d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
> 		     01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
> [   11.962530] RSP: 002b:00007ffd62f23a78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> [   11.962532] RAX: ffffffffffffffda RBX: 000056058a1429e0 RCX: 00007fa35d020e5d
> [   11.962533] RDX: 0000000000000000 RSI: 00007fa35d25232c RDI: 0000000000000017
> [   11.962534] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
> [   11.962534] R10: 0000000000000017 R11: 0000000000000246 R12: 00007fa35d25232c
> [   11.962535] R13: 000056058a147360 R14: 0000000000000007 R15: 000056058a12b790
> [   11.962548]  </TASK>
> [   11.964024] i915 0000:00:02.0: vgaarb: deactivate vga console
> [   11.974095] Console: switching to colour dummy device 80x25
> [   11.981308] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [   12.041808] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
> 
> ---

Thanks,
Juri


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RT] i915 sleeping function from atomic in gen6_reset_engines()
  2023-07-10  7:21 ` Juri Lelli
@ 2023-07-10 10:09   ` Mike Galbraith
  2023-07-10 11:56     ` Juri Lelli
  0 siblings, 1 reply; 4+ messages in thread
From: Mike Galbraith @ 2023-07-10 10:09 UTC (permalink / raw)
  To: Juri Lelli, Sebastian Andrzej Siewior; +Cc: linux-rt-users, LKML

On Mon, 2023-07-10 at 09:21 +0200, Juri Lelli wrote:
> Hi!
>
> On 23/01/23 17:04, Juri Lelli wrote:
> > Hi,
> >
> > I've just noticed the following while testing v6.2-rc3-rt1.
>
> I'm still seeing the following on v6.4-rt6.
>
> I believe 20211006164628.s2mtsdd2jdbfyf7g@linutronix.de should cure it,
> but I don't think it did go anywhere?

The raw lock cyclictest deltas don't look great.  I shut it up with the
below.  For my i915 equipped lappy, a raw lock fix would be just fine,
its cyclictest numbers already being well south of 'oh dear'.

Subject: drm,i915: Don't disable preemption in __intel_gt_reset() for PREEMPT_RT
From: Mike Galbraith <efault@gmx.de>
Date: Wed Dec  7 10:13:21 CET 2022

 BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 330, name: systemd-udevd
 preempt_count: 1, expected: 0
 RCU nest depth: 0, expected: 0
 Preemption disabled at:
 [<ffffffffa090e08e>] __intel_gt_reset+0x6e/0xe0 [i915]
 CPU: 0 PID: 330 Comm: systemd-udevd Tainted: G          I E      6.1.0.g8ed710d-master-rt #27
 Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
 Call Trace:
  <TASK>
  dump_stack_lvl+0x33/0x46
  ? __intel_gt_reset+0x6e/0xe0 [i915]
  __might_resched+0x162/0x1b0
  ? __gen11_reset_engines.isra.21+0x2d0/0x2d0 [i915]
  rt_spin_lock+0x2d/0x70
  gen8_reset_engines+0x33/0x220 [i915]
  ? __gen11_reset_engines.isra.21+0x2d0/0x2d0 [i915]
  __intel_gt_reset+0x79/0xe0 [i915]
  sanitize_gpu.part.17+0x2d/0x40 [i915]
  i915_driver_probe+0x7b8/0xf20 [i915]
  ...

Replace the preempt_disable()/enable() with a local_lock()/unlock().

Signed-off-by: Mike Galbraith <efault@gmx.de>
---
 drivers/gpu/drm/i915/gt/intel_reset.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -6,6 +6,7 @@
 #include <linux/sched/mm.h>
 #include <linux/stop_machine.h>
 #include <linux/string_helpers.h>
+#include <linux/local_lock.h>

 #include "display/intel_display_reset.h"
 #include "display/intel_overlay.h"
@@ -766,6 +767,14 @@ wa_14015076503_end(struct intel_gt *gt,
 			 HECI_H_GS1_ER_PREP, 0);
 }

+struct reset_lock {
+	local_lock_t lock;
+};
+
+static DEFINE_PER_CPU(struct reset_lock, reset_lock) = {
+	.lock	= INIT_LOCAL_LOCK(lock),
+};
+
 int __intel_gt_reset(struct intel_gt *gt, intel_engine_mask_t engine_mask)
 {
 	const int retries = engine_mask == ALL_ENGINES ? RESET_MAX_RETRIES : 1;
@@ -788,9 +797,9 @@ int __intel_gt_reset(struct intel_gt *gt
 		reset_mask = wa_14015076503_start(gt, engine_mask, !retry);

 		GT_TRACE(gt, "engine_mask=%x\n", reset_mask);
-		preempt_disable();
+		local_lock(&reset_lock.lock);
 		ret = reset(gt, reset_mask, retry);
-		preempt_enable();
+		local_unlock(&reset_lock.lock);

 		wa_14015076503_end(gt, reset_mask);
 	}


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RT] i915 sleeping function from atomic in gen6_reset_engines()
  2023-07-10 10:09   ` Mike Galbraith
@ 2023-07-10 11:56     ` Juri Lelli
  0 siblings, 0 replies; 4+ messages in thread
From: Juri Lelli @ 2023-07-10 11:56 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML

On 10/07/23 12:09, Mike Galbraith wrote:
> On Mon, 2023-07-10 at 09:21 +0200, Juri Lelli wrote:
> > Hi!
> >
> > On 23/01/23 17:04, Juri Lelli wrote:
> > > Hi,
> > >
> > > I've just noticed the following while testing v6.2-rc3-rt1.
> >
> > I'm still seeing the following on v6.4-rt6.
> >
> > I believe 20211006164628.s2mtsdd2jdbfyf7g@linutronix.de should cure it,
> > but I don't think it did go anywhere?
> 
> The raw lock cyclictest deltas don't look great.  I shut it up with the
> below.  For my i915 equipped lappy, a raw lock fix would be just fine,
> its cyclictest numbers already being well south of 'oh dear'.

Right, thanks for sharing!

Looks like Sebastian is on holiday, so I guess we'll need to wait till
he's back to see which solution he prefer.

Best,
Juri


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-07-10 11:57 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-23 16:04 [RT] i915 sleeping function from atomic in gen6_reset_engines() Juri Lelli
2023-07-10  7:21 ` Juri Lelli
2023-07-10 10:09   ` Mike Galbraith
2023-07-10 11:56     ` Juri Lelli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).