linux-rt-users.vger.kernel.org archive mirror
* [ANNOUNCE] v5.9-rc3-rt3
@ 2020-09-02 15:55 Sebastian Andrzej Siewior
  2020-09-05  4:47 ` v5.9-rc3-rt3 boot time networking lockdep splat Mike Galbraith
  2020-09-09  3:12 ` [ANNOUNCE] v5.9-rc3-rt3 Mike Galbraith
  0 siblings, 2 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-09-02 15:55 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, linux-rt-users, Steven Rostedt

Dear RT folks!

I'm pleased to announce the v5.9-rc3-rt3 patch set. 

Changes since v5.9-rc3-rt2:

  - Correct a compile issue in the i915 driver. Reported by Carsten Emde
    and Daniel Wagner.

  - Mark Marshall reported a crash on PowerPC. The reason for the crash
    is a race in exec_mmap() vs a context switch and is not limited to
    PowerPC. This race has been present since v5.4.3-rt1 and is addressed
    in two changes:

    - commit 38cf307c1f201 ("mm: fix kthread_use_mm() vs TLB invalidate")
      which is part of v5.9-rc1.

    - patch "mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race"
      by Nicholas Piggin which has been posted for review and is not yet
      merged upstream.

Known issues
     - It has been pointed out that due to changes to the printk code the
       internal buffer representation changed. This is only an issue if tools
       like `crash' are used to extract the printk buffer from a kernel memory
       image.

The delta patch against v5.9-rc3-rt2 is appended below and can be found here:
 
     https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/incr/patch-5.9-rc3-rt2-rt3.patch.xz

You can get this release via the git tree at:

    git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git v5.9-rc3-rt3

The RT patch against v5.9-rc3 can be found here:

    https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/older/patch-5.9-rc3-rt3.patch.xz

The split quilt queue is available at:

    https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/older/patches-5.9-rc3-rt3.tar.xz

Sebastian

diff --git a/arch/Kconfig b/arch/Kconfig
index 222e553f3cf50..5c8e173dc7c2b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -415,6 +415,13 @@ config MMU_GATHER_NO_GATHER
 	bool
 	depends on MMU_GATHER_TABLE_FREE
 
+config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
+	bool
+	help
+	  Temporary select until all architectures can be converted to have
+	  irqs disabled over activate_mm. Architectures that do IPI based TLB
+	  shootdowns should enable this.
+
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
 
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
index c5700f44422ec..e8f809161c75f 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -29,7 +29,6 @@
 #include <linux/async.h>
 #include <linux/i2c.h>
 #include <linux/sched/clock.h>
-#include <linux/local_lock.h>
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_crtc.h>
@@ -1150,7 +1149,6 @@ struct intel_crtc {
 #ifdef CONFIG_DEBUG_FS
 	struct intel_pipe_crc pipe_crc;
 #endif
-	local_lock_t pipe_update_lock;
 };
 
 struct intel_plane {
diff --git a/drivers/gpu/drm/i915/display/intel_sprite.c b/drivers/gpu/drm/i915/display/intel_sprite.c
index 62b8248d2ee79..1b9d5e690a9f0 100644
--- a/drivers/gpu/drm/i915/display/intel_sprite.c
+++ b/drivers/gpu/drm/i915/display/intel_sprite.c
@@ -118,7 +118,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 			"PSR idle timed out 0x%x, atomic update may fail\n",
 			psr_status);
 
-	local_lock_irq(&crtc->pipe_update_lock);
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_disable();
 
 	crtc->debug.min_vbl = min;
 	crtc->debug.max_vbl = max;
@@ -143,11 +144,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 			break;
 		}
 
-		local_unlock_irq(&crtc->pipe_update_lock);
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+			local_irq_enable();
 
 		timeout = schedule_timeout(timeout);
 
-		local_lock_irq(&crtc->pipe_update_lock);
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+			local_irq_disable();
 	}
 
 	finish_wait(wq, &wait);
@@ -180,7 +183,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 	return;
 
 irq_disable:
-	local_lock_irq(&crtc->pipe_update_lock);
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_disable();
 }
 
 /**
@@ -218,7 +222,8 @@ void intel_pipe_update_end(struct intel_crtc_state *new_crtc_state)
 		new_crtc_state->uapi.event = NULL;
 	}
 
-	local_unlock_irq(&crtc->pipe_update_lock);
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_enable();
 
 	if (intel_vgpu_active(dev_priv))
 		return;
diff --git a/fs/exec.c b/fs/exec.c
index a91003e28eaae..d4fb18baf1fb1 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1130,11 +1130,24 @@ static int exec_mmap(struct mm_struct *mm)
 	}
 
 	task_lock(tsk);
-	active_mm = tsk->active_mm;
 	membarrier_exec_mmap(mm);
-	tsk->mm = mm;
+
+	local_irq_disable();
+	active_mm = tsk->active_mm;
 	tsk->active_mm = mm;
+	tsk->mm = mm;
+	/*
+	 * This prevents preemption while active_mm is being loaded and
+	 * it and mm are being updated, which could cause problems for
+	 * lazy tlb mm refcounting when these are updated by context
+	 * switches. Not all architectures can handle irqs off over
+	 * activate_mm yet.
+	 */
+	if (!IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
+		local_irq_enable();
 	activate_mm(active_mm, mm);
+	if (IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
+		local_irq_enable();
 	tsk->mm->vmacache_seqnum = 0;
 	vmacache_flush(tsk);
 	task_unlock(tsk);
diff --git a/localversion-rt b/localversion-rt
index c3054d08a1129..1445cd65885cd 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt2
+-rt3


* v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-02 15:55 [ANNOUNCE] v5.9-rc3-rt3 Sebastian Andrzej Siewior
@ 2020-09-05  4:47 ` Mike Galbraith
  2020-09-05  5:19   ` Mike Galbraith
  2020-09-08 12:19   ` Sebastian Andrzej Siewior
  2020-09-09  3:12 ` [ANNOUNCE] v5.9-rc3-rt3 Mike Galbraith
  1 sibling, 2 replies; 17+ messages in thread
From: Mike Galbraith @ 2020-09-05  4:47 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: LKML, linux-rt-users, Steven Rostedt

[   22.004225] r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
[   22.004450] br0: port 1(eth0) entered blocking state
[   22.004473] br0: port 1(eth0) entered forwarding state
[   22.006411] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready

[   22.024936] ======================================================
[   22.024936] WARNING: possible circular locking dependency detected
[   22.024937] 5.9.0.gc70672d-rt3-rt #8 Tainted: G            E
[   22.024938] ------------------------------------------------------
[   22.024939] ksoftirqd/0/10 is trying to acquire lock:
[   22.024941] ffff983475521278 (&sch->q.lock){+...}-{0:0}, at: sch_direct_xmit+0x81/0x2f0
[   22.024947]
               but task is already holding lock:
[   22.024947] ffff9834755212b8 (&s->seqcount#9){+...}-{0:0}, at: br_dev_queue_push_xmit+0x7d/0x180 [bridge]
[   22.024959]
               which lock already depends on the new lock.

[   22.024960]
               the existing dependency chain (in reverse order) is:
[   22.024961]
               -> #1 (&s->seqcount#9){+...}-{0:0}:
[   22.024963]        lock_acquire+0x92/0x3f0
[   22.024967]        __dev_queue_xmit+0xce7/0xe30
[   22.024969]        br_dev_queue_push_xmit+0x7d/0x180 [bridge]
[   22.024974]        br_forward_finish+0x10a/0x1b0 [bridge]
[   22.024980]        __br_forward+0x17d/0x300 [bridge]
[   22.024984]        br_dev_xmit+0x442/0x570 [bridge]
[   22.024990]        dev_hard_start_xmit+0xc5/0x3f0
[   22.024992]        __dev_queue_xmit+0x9db/0xe30
[   22.024993]        ip6_finish_output2+0x26a/0x990
[   22.024995]        ip6_output+0x6d/0x260
[   22.024996]        mld_sendpack+0x1d9/0x360
[   22.024999]        mld_ifc_timer_expire+0x1f7/0x370
[   22.025000]        call_timer_fn+0xa0/0x390
[   22.025003]        run_timer_softirq+0x59a/0x720
[   22.025004]        __do_softirq+0xc1/0x5b2
[   22.025006]        run_ksoftirqd+0x47/0x70
[   22.025007]        smpboot_thread_fn+0x266/0x320
[   22.025009]        kthread+0x171/0x190
[   22.025010]        ret_from_fork+0x1f/0x30
[   22.025013]
               -> #0 (&sch->q.lock){+...}-{0:0}:
[   22.025015]        validate_chain+0xa81/0x1230
[   22.025016]        __lock_acquire+0x880/0xbf0
[   22.025017]        lock_acquire+0x92/0x3f0
[   22.025018]        rt_spin_lock+0x78/0xd0
[   22.025020]        sch_direct_xmit+0x81/0x2f0
[   22.025022]        __dev_queue_xmit+0xd38/0xe30
[   22.025023]        br_dev_queue_push_xmit+0x7d/0x180 [bridge]
[   22.025029]        br_forward_finish+0x10a/0x1b0 [bridge]
[   22.025033]        __br_forward+0x17d/0x300 [bridge]
[   22.025039]        br_dev_xmit+0x442/0x570 [bridge]
[   22.025043]        dev_hard_start_xmit+0xc5/0x3f0
[   22.025044]        __dev_queue_xmit+0x9db/0xe30
[   22.025046]        ip6_finish_output2+0x26a/0x990
[   22.025047]        ip6_output+0x6d/0x260
[   22.025049]        mld_sendpack+0x1d9/0x360
[   22.025050]        mld_ifc_timer_expire+0x1f7/0x370
[   22.025052]        call_timer_fn+0xa0/0x390
[   22.025053]        run_timer_softirq+0x59a/0x720
[   22.025054]        __do_softirq+0xc1/0x5b2
[   22.025055]        run_ksoftirqd+0x47/0x70
[   22.025056]        smpboot_thread_fn+0x266/0x320
[   22.025058]        kthread+0x171/0x190
[   22.025059]        ret_from_fork+0x1f/0x30
[   22.025060]
               other info that might help us debug this:

[   22.025061]  Possible unsafe locking scenario:

[   22.025061]        CPU0                    CPU1
[   22.025061]        ----                    ----
[   22.025062]   lock(&s->seqcount#9);
[   22.025064]                                lock(&sch->q.lock);
[   22.025065]                                lock(&s->seqcount#9);
[   22.025065]   lock(&sch->q.lock);
[   22.025066]
                *** DEADLOCK ***

[   22.025066] 20 locks held by ksoftirqd/0/10:
[   22.025067]  #0: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x5/0xd0
[   22.025071]  #1: ffff98351ec1a6d0 (per_cpu_ptr(&bh_lock.l.lock, cpu)){....}-{3:3}, at: __local_bh_disable_ip+0xbf/0x230
[   22.025074]  #2: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0xfb/0x230
[   22.025077]  #3: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x5/0xd0
[   22.025080]  #4: ffff98351ec1b338 (&base->expiry_lock){+...}-{0:0}, at: run_timer_softirq+0x3e6/0x720
[   22.025083]  #5: ffffb32e8007bd68 ((&idev->mc_ifc_timer)){+...}-{0:0}, at: call_timer_fn+0x5/0x390
[   22.025086]  #6: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: mld_sendpack+0x5/0x360
[   22.025090]  #7: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0xfb/0x230
[   22.025093]  #8: ffffffff9a4c7100 (rcu_read_lock_bh){....}-{1:3}, at: ip6_finish_output2+0x73/0x990
[   22.025096]  #9: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0xfb/0x230
[   22.025097]  #10: ffffffff9a4c7100 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x63/0xe30
[   22.025100]  #11: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: br_dev_xmit+0x5/0x570 [bridge]
[   22.025108]  #12: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0xfb/0x230
[   22.025110]  #13: ffffffff9a4c7100 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x63/0xe30
[   22.025113]  #14: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x5/0xd0
[   22.025116]  #15: ffff9834755215f0 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{0:0}, at: __dev_queue_xmit+0x8a4/0xe30
[   22.025119]  #16: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x5/0xd0
[   22.025121]  #17: ffff983475521398 (dev->qdisc_running_key ?: &qdisc_running_key){+...}-{0:0}, at: __dev_queue_xmit+0xca6/0xe30
[   22.025124]  #18: ffff9834755212b8 (&s->seqcount#9){+...}-{0:0}, at: br_dev_queue_push_xmit+0x7d/0x180 [bridge]
[   22.025132]  #19: ffffffff9a4c7140 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x5/0xd0
[   22.025134]
               stack backtrace:
[   22.025134] CPU: 0 PID: 10 Comm: ksoftirqd/0 Kdump: loaded Tainted: G            E     5.9.0.gc70672d-rt3-rt #8
[   22.025135] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[   22.025136] Call Trace:
[   22.025138]  dump_stack+0x77/0x9b
[   22.025143]  check_noncircular+0x148/0x160
[   22.025147]  ? validate_chain+0xa81/0x1230
[   22.025148]  validate_chain+0xa81/0x1230
[   22.025153]  __lock_acquire+0x880/0xbf0
[   22.025157]  lock_acquire+0x92/0x3f0
[   22.025158]  ? sch_direct_xmit+0x81/0x2f0
[   22.025160]  ? rt_spin_unlock+0x39/0x90
[   22.025162]  rt_spin_lock+0x78/0xd0
[   22.025164]  ? sch_direct_xmit+0x81/0x2f0
[   22.025166]  sch_direct_xmit+0x81/0x2f0
[   22.025169]  __dev_queue_xmit+0xd38/0xe30
[   22.025173]  ? find_held_lock+0x2d/0x90
[   22.025176]  ? br_dev_queue_push_xmit+0x7d/0x180 [bridge]
[   22.025182]  br_dev_queue_push_xmit+0x7d/0x180 [bridge]
[   22.025190]  br_forward_finish+0x10a/0x1b0 [bridge]
[   22.025196]  ? __br_forward+0x151/0x300 [bridge]
[   22.025204]  __br_forward+0x17d/0x300 [bridge]
[   22.025211]  ? br_flood+0x98/0x120 [bridge]
[   22.025216]  br_dev_xmit+0x442/0x570 [bridge]
[   22.025224]  dev_hard_start_xmit+0xc5/0x3f0
[   22.025226]  ? netif_skb_features+0xb0/0x230
[   22.025228]  __dev_queue_xmit+0x9db/0xe30
[   22.025231]  ? eth_header+0x25/0xc0
[   22.025235]  ? ip6_finish_output2+0x26a/0x990
[   22.025236]  ip6_finish_output2+0x26a/0x990
[   22.025239]  ? ip6_mtu+0x135/0x1b0
[   22.025241]  ? ip6_output+0x6d/0x260
[   22.025243]  ip6_output+0x6d/0x260
[   22.025246]  ? __ip6_finish_output+0x210/0x210
[   22.025249]  mld_sendpack+0x1d9/0x360
[   22.025252]  ? mld_ifc_timer_expire+0x119/0x370
[   22.025254]  mld_ifc_timer_expire+0x1f7/0x370
[   22.025256]  ? mld_dad_timer_expire+0xb0/0xb0
[   22.025258]  ? mld_dad_timer_expire+0xb0/0xb0
[   22.025260]  call_timer_fn+0xa0/0x390
[   22.025263]  ? mld_dad_timer_expire+0xb0/0xb0
[   22.025264]  run_timer_softirq+0x59a/0x720
[   22.025268]  ? lock_acquire+0x92/0x3f0
[   22.025272]  __do_softirq+0xc1/0x5b2
[   22.025274]  ? smpboot_thread_fn+0x28/0x320
[   22.025276]  ? smpboot_thread_fn+0x28/0x320
[   22.025278]  ? smpboot_thread_fn+0x70/0x320
[   22.025279]  run_ksoftirqd+0x47/0x70
[   22.025281]  smpboot_thread_fn+0x266/0x320
[   22.025284]  ? smpboot_register_percpu_thread+0xe0/0xe0
[   22.025286]  kthread+0x171/0x190
[   22.025287]  ? kthread_park+0x90/0x90
[   22.025288]  ret_from_fork+0x1f/0x30
[   22.176416] NET: Registered protocol family 17



* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-05  4:47 ` v5.9-rc3-rt3 boot time networking lockdep splat Mike Galbraith
@ 2020-09-05  5:19   ` Mike Galbraith
  2020-09-08 15:12     ` Sebastian Andrzej Siewior
  2020-09-08 12:19   ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2020-09-05  5:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: LKML, linux-rt-users, Steven Rostedt

Lappy, which does not use bridge, boots clean... but lock leakage
pretty darn quickly inspires lockdep to crap its drawers.

[  209.001111] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[  209.001113] turning off the locking correctness validator.
[  209.001114] CPU: 2 PID: 3773 Comm: Socket Thread Tainted: G S        I E     5.9.0.gc70672d-rt3-rt #8
[  209.001117] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
[  209.001118] Call Trace:
[  209.001123]  dump_stack+0x77/0x9b
[  209.001129]  validate_chain+0xf60/0x1230
[  209.001135]  __lock_acquire+0x880/0xbf0
[  209.001139]  lock_acquire+0x92/0x3f0
[  209.001142]  ? rcu_note_context_switch+0x118/0x550
[  209.001146]  ? update_load_avg+0x5cc/0x6d0
[  209.001150]  _raw_spin_lock+0x2f/0x40
[  209.001153]  ? rcu_note_context_switch+0x118/0x550
[  209.001155]  rcu_note_context_switch+0x118/0x550
[  209.001157]  ? lockdep_hardirqs_off+0x6e/0xd0
[  209.001161]  __schedule+0xbe/0xb50
[  209.001163]  ? mark_held_locks+0x2d/0x80
[  209.001166]  preempt_schedule_irq+0x44/0xb0
[  209.001168]  irqentry_exit+0x5b/0x80
[  209.001170]  asm_sysvec_reschedule_ipi+0x12/0x20
[  209.001173] RIP: 0010:debug_lockdep_rcu_enabled+0x23/0x30
[  209.001175] Code: 0f 0b e9 6d ff ff ff 8b 05 0a a0 c5 00 85 c0 74 21 8b 05 cc da c5 00 85 c0 74 17 65 48 8b 04 25 c0 91 01 00 8b 80 8c 0a 00 00 <85> c0 0f 94 c0 0f b6 c0 f3 c3 cc cc cc 65 48 8b 04 25 c0 91 01 00
[  209.001178] RSP: 0018:ffffa00202a0f998 EFLAGS: 00000202
[  209.001179] RAX: 0000000000000000 RBX: ffff90a8a6d1da20 RCX: 0000000000000001
[  209.001180] RDX: 0000000000000002 RSI: ffffffff971308fc RDI: ffffffff9710b092
[  209.001181] RBP: 0000000000000048 R08: 0000000000000001 R09: 0000000000000001
[  209.001181] R10: ffff90a8a6d1da38 R11: 0000000000000006 R12: ffffffff97405280
[  209.001182] R13: 0000000000000008 R14: ffffffff97405240 R15: 0000000000000100
[  209.001188]  rt_spin_unlock+0x2c/0x90
[  209.001191]  __do_softirq+0xc1/0x5b2
[  209.001194]  ? ip_finish_output2+0x264/0xa10
[  209.001197]  __local_bh_enable_ip+0x230/0x290
[  209.001200]  ip_finish_output2+0x288/0xa10
[  209.001201]  ? rcu_read_lock_held+0x32/0x40
[  209.001206]  ? ip_output+0x70/0x200
[  209.001207]  ip_output+0x70/0x200
[  209.001210]  ? __ip_finish_output+0x320/0x320
[  209.001212]  __ip_queue_xmit+0x1f0/0x5d0
[  209.001216]  __tcp_transmit_skb+0xa7f/0xc70
[  209.001219]  ? __alloc_skb+0x7b/0x1b0
[  209.001222]  ? __kmalloc_node_track_caller+0x252/0x330
[  209.001230]  tcp_rcv_established+0x365/0x6d0
[  209.001233]  tcp_v4_do_rcv+0x7e/0x1b0
[  209.001236]  __release_sock+0x89/0x130
[  209.001239]  release_sock+0x3c/0xd0
[  209.001241]  tcp_recvmsg+0x2b9/0xa90
[  209.001247]  inet_recvmsg+0x6b/0x210
[  209.001252]  __sys_recvfrom+0xb8/0x110
[  209.001256]  ? poll_select_finish+0x1f0/0x1f0
[  209.001261]  ? syscall_enter_from_user_mode+0x37/0x340
[  209.001263]  ? syscall_enter_from_user_mode+0x3c/0x340
[  209.001265]  ? lockdep_hardirqs_on+0x78/0x100
[  209.001268]  __x64_sys_recvfrom+0x24/0x30
[  209.001269]  do_syscall_64+0x33/0x40
[  209.001271]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  209.001274] RIP: 0033:0x7ff2421a230a
[  209.001276] Code: 7c 24 08 4c 89 14 24 e8 44 f8 ff ff 45 31 c9 89 c3 45 31 c0 4c 8b 14 24 4c 89 e2 48 89 ee 48 8b 7c 24 08 b8 2d 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 89 df 48 89 04 24 e8 73 f8 ff ff 48 8b 04
[  209.001278] RSP: 002b:00007ff24243a550 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[  209.001279] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff2421a230a
[  209.001280] RDX: 00000000000034da RSI: 00007ff21094fb37 RDI: 000000000000006b
[  209.001281] RBP: 00007ff21094fb37 R08: 0000000000000000 R09: 0000000000000000
[  209.001282] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000034da
[  209.001283] R13: 00007ff21094fb37 R14: 0000000000000000 R15: 00007ff20e8a4000



* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-05  4:47 ` v5.9-rc3-rt3 boot time networking lockdep splat Mike Galbraith
  2020-09-05  5:19   ` Mike Galbraith
@ 2020-09-08 12:19   ` Sebastian Andrzej Siewior
  2020-09-08 14:56     ` Mike Galbraith
  1 sibling, 1 reply; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-09-08 12:19 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On 2020-09-05 06:47:29 [+0200], Mike Galbraith wrote:
> [   22.024936] ======================================================
> [   22.024936] WARNING: possible circular locking dependency detected
> [   22.024937] 5.9.0.gc70672d-rt3-rt #8 Tainted: G            E
> [   22.024938] ------------------------------------------------------
> [   22.024939] ksoftirqd/0/10 is trying to acquire lock:
> [   22.024941] ffff983475521278 (&sch->q.lock){+...}-{0:0}, at: sch_direct_xmit+0x81/0x2f0
> [   22.024947]
>                but task is already holding lock:
> [   22.024947] ffff9834755212b8 (&s->seqcount#9){+...}-{0:0}, at: br_dev_queue_push_xmit+0x7d/0x180 [bridge]
> [   22.024959]
>                which lock already depends on the new lock.
> 
> [   22.024960]
>                the existing dependency chain (in reverse order) is:
> [   22.024961]
>                -> #1 (&s->seqcount#9){+...}-{0:0}:
> [   22.024963]        lock_acquire+0x92/0x3f0
> [   22.024967]        __dev_queue_xmit+0xce7/0xe30
>                -> #0 (&sch->q.lock){+...}-{0:0}:
> [   22.025015]        validate_chain+0xa81/0x1230
> [   22.025016]        __lock_acquire+0x880/0xbf0
> [   22.025017]        lock_acquire+0x92/0x3f0
> [   22.025018]        rt_spin_lock+0x78/0xd0
> [   22.025020]        sch_direct_xmit+0x81/0x2f0
>                other info that might help us debug this:
> 
> [   22.025061]  Possible unsafe locking scenario:
> 
> [   22.025061]        CPU0                    CPU1
> [   22.025061]        ----                    ----
> [   22.025062]   lock(&s->seqcount#9);
> [   22.025064]                                lock(&sch->q.lock);
> [   22.025065]                                lock(&s->seqcount#9);
> [   22.025065]   lock(&sch->q.lock);
> [   22.025066]
>                 *** DEADLOCK ***

This has nothing to do with the bridge but with the fact that you use a
non standard queue class (something else than pfifo_fast).

The flow in CPU1 is the default flow but the second lock is a trylock.
CPU0 is from sch_direct_xmit() where it drops the
root_lock/qdisc.lock and re-acquires it. This shouldn't fail because
CPU1 does a try-lock of the seqlock first, and if that fails the seqcount
is "not acquired". So if we annotate the seqcount as a try_acquire then
lockdep should not report this anymore.

Sebastian


* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-08 12:19   ` Sebastian Andrzej Siewior
@ 2020-09-08 14:56     ` Mike Galbraith
  2020-09-08 15:06       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2020-09-08 14:56 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On Tue, 2020-09-08 at 14:19 +0200, Sebastian Andrzej Siewior wrote:
>
> This has nothing to do with the bridge but with the fact that you use a
> non standard queue class (something else than pfifo_fast).

That must be SUSE, I don't muck about in network land.  I downloaded a
whole library of RFCs decades ago, but turns out that one of those is
all the bedtime story you'll ever need.  Huge waste of bandwidth :)

	-Mike



* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-08 14:56     ` Mike Galbraith
@ 2020-09-08 15:06       ` Sebastian Andrzej Siewior
  2020-09-08 16:19         ` Mike Galbraith
  2020-09-09  2:39         ` Mike Galbraith
  0 siblings, 2 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-09-08 15:06 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On 2020-09-08 16:56:20 [+0200], Mike Galbraith wrote:
> On Tue, 2020-09-08 at 14:19 +0200, Sebastian Andrzej Siewior wrote:
> >
> > This has nothing to do with the bridge but with the fact that you use a
> > non standard queue class (something else than pfifo_fast).
> 
> That must be SUSE, I don't muck about in network land.  I downloaded a
> whole library of RFCs decades ago, but turns out that one of those is
> all the bedtime story you'll ever need.  Huge waste of bandwidth :)

I see.
This should cure it:

Subject: [PATCH] net: Properly annotate the try-lock for the seqlock

In patch
   ("net/Qdisc: use a seqlock instead seqcount")

the seqcount has been replaced with a seqlock to allow the reader to
boost the preempted writer.
The try_write_seqlock() acquired the lock with a try-lock but the
seqcount annotation was "lock".

Opencode write_seqcount_t_begin() and use the try-lock annotation for
lockdep.

Reported-by: Mike Galbraith <efault@gmx.de>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/net/net_seq_lock.h |  9 ---------
 include/net/sch_generic.h  | 10 +++++++++-
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/include/net/net_seq_lock.h b/include/net/net_seq_lock.h
index 95a497a72e511..67710bace7418 100644
--- a/include/net/net_seq_lock.h
+++ b/include/net/net_seq_lock.h
@@ -6,15 +6,6 @@
 # define net_seq_begin(__r)		read_seqbegin(__r)
 # define net_seq_retry(__r, __s)	read_seqretry(__r, __s)
 
-static inline int try_write_seqlock(seqlock_t *sl)
-{
-	if (spin_trylock(&sl->lock)) {
-		write_seqcount_begin(&sl->seqcount);
-		return 1;
-	}
-	return 0;
-}
-
 #else
 # define net_seqlock_t			seqcount_t
 # define net_seq_begin(__r)		read_seqcount_begin(__r)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 796ac453d9519..40be4443b6bdb 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -168,8 +168,16 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 		return false;
 	}
 #ifdef CONFIG_PREEMPT_RT
-	if (try_write_seqlock(&qdisc->running))
+	if (spin_trylock(&qdisc->running.lock)) {
+		seqcount_t *s = &qdisc->running.seqcount.seqcount;
+		/*
+		 * Variant of write_seqcount_t_begin() telling lockdep that a
+		 * trylock was attempted.
+		 */
+		raw_write_seqcount_t_begin(s);
+		seqcount_acquire(&s->dep_map, 0, 1, _RET_IP_);
 		return true;
+	}
 	return false;
 #else
 	/* Variant of write_seqcount_begin() telling lockdep a trylock
-- 
2.28.0


> 	-Mike

Sebastian


* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-05  5:19   ` Mike Galbraith
@ 2020-09-08 15:12     ` Sebastian Andrzej Siewior
  2020-09-08 15:59       ` Mike Galbraith
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-09-08 15:12 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On 2020-09-05 07:19:10 [+0200], Mike Galbraith wrote:
> Lappy, which does not use bridge, boots clean... but lock leakage
> pretty darn quickly inspires lockdep to crap its drawers.
> 
> [  209.001111] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> [  209.001113] turning off the locking correctness validator.
> [  209.001114] CPU: 2 PID: 3773 Comm: Socket Thread Tainted: G S        I E     5.9.0.gc70672d-rt3-rt #8
> [  209.001117] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
> [  209.001118] Call Trace:
> [  209.001123]  dump_stack+0x77/0x9b
> [  209.001129]  validate_chain+0xf60/0x1230

I have no idea how to debug this based on this report. Can you narrow
it down to something?

Is Lappy new, got a new something or has a new config switch? I'm just
curious if this is something new or something that was always there but
remained undetected.
(Your other report was about something that was previously always "broken".)

Sebastian


* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-08 15:12     ` Sebastian Andrzej Siewior
@ 2020-09-08 15:59       ` Mike Galbraith
  2020-09-08 16:02         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2020-09-08 15:59 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On Tue, 2020-09-08 at 17:12 +0200, Sebastian Andrzej Siewior wrote:
> On 2020-09-05 07:19:10 [+0200], Mike Galbraith wrote:
> > Lappy, which does not use bridge, boots clean... but lock leakage
> > pretty darn quickly inspires lockdep to crap its drawers.
> >
> > [  209.001111] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > [  209.001113] turning off the locking correctness validator.
> > [  209.001114] CPU: 2 PID: 3773 Comm: Socket Thread Tainted: G S        I E     5.9.0.gc70672d-rt3-rt #8
> > [  209.001117] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
> > [  209.001118] Call Trace:
> > [  209.001123]  dump_stack+0x77/0x9b
> > [  209.001129]  validate_chain+0xf60/0x1230
>
> I have no idea how to debug this based on this report. Can you narrow
> it down to something?

I instrumented what I presume is still this problem once upon a time:
structures containing locks are allocated/initialized/freed again and
again with no cleanup until we increment into the wall.

> Is Lappy new, got a new something or has a new config switch? I'm just
> curious if this is something new or something that was always there but
> remained undetected.

Nah, this is nothing new.  Turn lockdep on in RT, it's just a matter of
time before it turns itself off.  It's usually just not _that_ quick.

	-Mike



* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-08 15:59       ` Mike Galbraith
@ 2020-09-08 16:02         ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-09-08 16:02 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On 2020-09-08 17:59:31 [+0200], Mike Galbraith wrote:
> > I have no idea how to debug this based on this report. Can you narrow
> > it down to something?
> 
> I instrumented what I presume is still this problem once upon a time:
> structures containing locks are allocated/initialized/freed again and
> again with no cleanup until we increment into the wall.

Any idea what it is?

> > Is Lappy new, got a new something or has a new config switch? I'm just
> > curious if this is something new or something that was always there but
> > remained undetected.
> 
> Nah, this is nothing new.  Turn lockdep on in RT, it's just a matter of
> time before it turns itself off.  It's usually just not _that_ quick.

Okay. So I have a few boxes which ran over the weekend without this splat.

> 	-Mike

Sebastian


* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-08 15:06       ` Sebastian Andrzej Siewior
@ 2020-09-08 16:19         ` Mike Galbraith
  2020-09-09  2:39         ` Mike Galbraith
  1 sibling, 0 replies; 17+ messages in thread
From: Mike Galbraith @ 2020-09-08 16:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On Tue, 2020-09-08 at 17:06 +0200, Sebastian Andrzej Siewior wrote:
> On 2020-09-08 16:56:20 [+0200], Mike Galbraith wrote:
> > On Tue, 2020-09-08 at 14:19 +0200, Sebastian Andrzej Siewior wrote:
> > >
> > > This has nothing to do with the bridge but with the fact that you use a
> > > non standard queue class (something else than pfifo_fast).
> >
> > That must be SUSE, I don't muck about in network land.  I downloaded a
> > whole library of RFCs decades ago, but turns out that one of those is
> > all the bedtime story you'll ever need.  Huge waste of bandwidth :)
>
> I see.
> This should cure it:

I'll give that a go.

	-Mike



* Re: v5.9-rc3-rt3 boot time networking lockdep splat
  2020-09-08 15:06       ` Sebastian Andrzej Siewior
  2020-09-08 16:19         ` Mike Galbraith
@ 2020-09-09  2:39         ` Mike Galbraith
  1 sibling, 0 replies; 17+ messages in thread
From: Mike Galbraith @ 2020-09-09  2:39 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On Tue, 2020-09-08 at 17:06 +0200, Sebastian Andrzej Siewior wrote:
>
> This should cure it:

It did.

	-Mike


* Re: [ANNOUNCE] v5.9-rc3-rt3
  2020-09-02 15:55 [ANNOUNCE] v5.9-rc3-rt3 Sebastian Andrzej Siewior
  2020-09-05  4:47 ` v5.9-rc3-rt3 boot time networking lockdep splat Mike Galbraith
@ 2020-09-09  3:12 ` Mike Galbraith
  2020-09-09  5:07   ` Mike Galbraith
  2020-09-09  5:45   ` Mike Galbraith
  1 sibling, 2 replies; 17+ messages in thread
From: Mike Galbraith @ 2020-09-09  3:12 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: LKML, linux-rt-users, Steven Rostedt

On Wed, 2020-09-02 at 17:55 +0200, Sebastian Andrzej Siewior wrote:
>
> Known issues
>      - It has been pointed out that due to changes to the printk code the
>        internal buffer representation changed. This is only an issue if tools
>        like `crash' are used to extract the printk buffer from a kernel memory
>        image.

Ouch.  While installing -rt5 on lappy via nfs, -rt5 server box exploded
leaving nada in logs.  I have a nifty crash dump of the event, but...

	-Mike


* Re: [ANNOUNCE] v5.9-rc3-rt3
  2020-09-09  3:12 ` [ANNOUNCE] v5.9-rc3-rt3 Mike Galbraith
@ 2020-09-09  5:07   ` Mike Galbraith
  2020-09-09  5:45   ` Mike Galbraith
  1 sibling, 0 replies; 17+ messages in thread
From: Mike Galbraith @ 2020-09-09  5:07 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: LKML, linux-rt-users, Steven Rostedt

On Wed, 2020-09-09 at 05:12 +0200, Mike Galbraith wrote:
> On Wed, 2020-09-02 at 17:55 +0200, Sebastian Andrzej Siewior wrote:
> >
> > Known issues
> >      - It has been pointed out that due to changes to the printk code the
> >        internal buffer representation changed. This is only an issue if tools
> >        like `crash' are used to extract the printk buffer from a kernel memory
> >        image.
>
> Ouch.  While installing -rt5 on lappy via nfs, -rt5 server box exploded
> leaving nada in logs.  I have a nifty crash dump of the event, but...

I backed out 1ce98b8a0a1..463463c6fa3f so crash will work again, but
haven't as yet been able to convince box to explode.  Hohum, I'll give
it some time.

Lockdep did repeat dirtying of its diaper though, on both lappy and
desktop boxen at roughly the same uptime.

[  922.978106] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[  922.978112] turning off the locking correctness validator.
[  922.978116] CPU: 2 PID: 5837 Comm: kworker/u16:0 Kdump: loaded Tainted: G S          E     5.9.0.gf4d51df-rt5-rt #3
[  922.978120] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[  922.978127] Workqueue: writeback wb_workfn (flush-8:48)
[  922.978131] Call Trace:
[  922.978138]  dump_stack+0x77/0x9b
[  922.978143]  validate_chain+0xf60/0x1230
[  922.978147]  __lock_acquire+0x880/0xbf0
[  922.978151]  lock_acquire+0x92/0x3f0
[  922.978155]  ? rt_spin_lock_slowlock_locked+0x5d/0x2c0
[  922.978160]  _raw_spin_lock+0x2f/0x40
[  922.978163]  ? rt_spin_lock_slowlock_locked+0x5d/0x2c0
[  922.978169]  rt_spin_lock_slowlock_locked+0x5d/0x2c0
[  922.978173]  __read_rt_lock+0x97/0xc0
[  922.978194]  ext4_es_lookup_extent+0x4f/0x410 [ext4]
[  922.978205]  ext4_map_blocks+0x50/0x530 [ext4]
[  922.978209]  ? kmem_cache_alloc+0x636/0x8b0
[  922.978220]  ext4_writepages+0xa2c/0x1330 [ext4]
[  922.978228]  ? do_writepages+0x3c/0xe0
[  922.978231]  do_writepages+0x3c/0xe0
[  922.978236]  ? __writeback_single_inode+0x62/0x890
[  922.978240]  __writeback_single_inode+0x62/0x890
[  922.978244]  writeback_sb_inodes+0x217/0x580
[  922.978250]  __writeback_inodes_wb+0x5d/0xd0
[  922.978254]  wb_writeback+0x28c/0x620
[  922.978259]  ? wb_workfn+0x2bc/0x7f0
[  922.978262]  wb_workfn+0x2bc/0x7f0
[  922.978266]  ? lock_acquire+0x92/0x3f0
[  922.978270]  ? process_one_work+0x1fa/0x730
[  922.978274]  ? process_one_work+0x284/0x730
[  922.978278]  ? process_one_work+0x251/0x730
[  922.978281]  process_one_work+0x284/0x730
[  922.978285]  ? _raw_spin_lock_irq+0x16/0x50
[  922.978289]  ? process_one_work+0x730/0x730
[  922.978293]  worker_thread+0x39/0x3f0
[  922.978297]  ? process_one_work+0x730/0x730
[  922.978300]  kthread+0x171/0x190
[  922.978304]  ? kthread_park+0x90/0x90
[  922.978308]  ret_from_fork+0x1f/0x30
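
When reading a splat like the one above, it can help to separate the frames the unwinder trusts from the "?" entries, which are only return addresses found on the stack and may be stale. A minimal sketch, assuming the usual x86 dmesg frame format shown in this trace; the helper names parse_frames() and reliable_symbols() are made up for illustration and are not part of any kernel tool:

```python
import re

# One frame of an x86 kernel "Call Trace:", for example
#   [  922.978155]  ? rt_spin_lock_slowlock_locked+0x5d/0x2c0
#   [  922.978194]  ext4_es_lookup_extent+0x4f/0x410 [ext4]
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+"                           # dmesg timestamp
    r"(?P<guess>\?\s+)?"                             # "?" marks an unverified frame
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"   # symbol+offset/size
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"               # optional [module]
)


def parse_frames(text):
    """Return (symbol, module, reliable) for every frame line in text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append(
                (m.group("symbol"), m.group("module"), m.group("guess") is None)
            )
    return frames


def reliable_symbols(text):
    """Symbols of the frames that are not prefixed with '?'."""
    return [sym for sym, _module, ok in parse_frames(text) if ok]


# A few lines copied from the trace above (hypothetical selection).
SAMPLE = """\
[  922.978131] Call Trace:
[  922.978138]  dump_stack+0x77/0x9b
[  922.978155]  ? rt_spin_lock_slowlock_locked+0x5d/0x2c0
[  922.978173]  __read_rt_lock+0x97/0xc0
[  922.978194]  ext4_es_lookup_extent+0x4f/0x410 [ext4]
"""

if __name__ == "__main__":
    for symbol in reliable_symbols(SAMPLE):
        print(symbol)
```

Lines that are not frames ("Call Trace:", the BUG line, the CPU/hardware header) simply do not match and are skipped, so the whole dmesg excerpt can be passed in unfiltered.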


* Re: [ANNOUNCE] v5.9-rc3-rt3
  2020-09-09  3:12 ` [ANNOUNCE] v5.9-rc3-rt3 Mike Galbraith
  2020-09-09  5:07   ` Mike Galbraith
@ 2020-09-09  5:45   ` Mike Galbraith
  2020-09-09  8:20     ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2020-09-09  5:45 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: LKML, linux-rt-users, Steven Rostedt

On Wed, 2020-09-09 at 05:12 +0200, Mike Galbraith wrote:
> On Wed, 2020-09-02 at 17:55 +0200, Sebastian Andrzej Siewior wrote:
> >
> > Known issues
> >      - It has been pointed out that due to changes to the printk code the
> >        internal buffer representation changed. This is only an issue if tools
> >        like `crash' are used to extract the printk buffer from a kernel memory
> >        image.
>
> Ouch.  While installing -rt5 on lappy via nfs, -rt5 server box exploded
> leaving nada in logs.  I have a nifty crash dump of the event, but...

After convincing crash (with club) that it didn't _really_ need a
log_buf, nfs had nothing to do with the crash, it was nouveau.

      KERNEL: vmlinux-5.9.0.gf4d51df-rt5-rt.gz
    DUMPFILE: vmcore
        CPUS: 8
        DATE: Wed Sep  9 04:41:24 2020
      UPTIME: 00:08:10
LOAD AVERAGE: 3.17, 1.86, 0.99
       TASKS: 715
    NODENAME: homer
     RELEASE: 5.9.0.gf4d51df-rt5-rt
     VERSION: #1 SMP PREEMPT_RT Wed Sep 9 03:22:01 CEST 2020
     MACHINE: x86_64  (3591 Mhz)
      MEMORY: 16 GB
       PANIC: ""
         PID: 2146
     COMMAND: "X"
        TASK: ffff994c7fad0000  [THREAD_INFO: ffff994c7fad0000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt -l
PID: 2146   TASK: ffff994c7fad0000  CPU: 0   COMMAND: "X"
 #0 [ffffbfffc11a76c8] machine_kexec at ffffffffb7064879
    /backup/usr/local/src/kernel/linux-master-rt/./include/linux/ftrace.h: 792
 #1 [ffffbfffc11a7710] __crash_kexec at ffffffffb7173622
    /backup/usr/local/src/kernel/linux-master-rt/kernel/kexec_core.c: 963
 #2 [ffffbfffc11a77d0] crash_kexec at ffffffffb7174920
    /backup/usr/local/src/kernel/linux-master-rt/./arch/x86/include/asm/atomic.h: 41
 #3 [ffffbfffc11a77e0] oops_end at ffffffffb702716f
    /backup/usr/local/src/kernel/linux-master-rt/arch/x86/kernel/dumpstack.c: 342
 #4 [ffffbfffc11a7800] exc_general_protection at ffffffffb79a2fc6
    /backup/usr/local/src/kernel/linux-master-rt/arch/x86/kernel/traps.c: 82
 #5 [ffffbfffc11a7890] asm_exc_general_protection at ffffffffb7a00a1e
    /backup/usr/local/src/kernel/linux-master-rt/./arch/x86/include/asm/idtentry.h: 532
 #6 [ffffbfffc11a78a0] nvif_object_ctor at ffffffffc07ee6a7 [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nvif/object.c: 280
 #7 [ffffbfffc11a7918] __kmalloc at ffffffffb72eea12
    /backup/usr/local/src/kernel/linux-master-rt/mm/slub.c: 261
 #8 [ffffbfffc11a7980] nvif_object_ctor at ffffffffc07ee6a7 [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nvif/object.c: 280
 #9 [ffffbfffc11a79d0] nvif_mem_ctor_type at ffffffffc07eef48 [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nvif/mem.c: 74
#10 [ffffbfffc11a7aa8] nouveau_mem_vram at ffffffffc08b5291 [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nouveau_mem.c: 155
#11 [ffffbfffc11a7b10] nouveau_vram_manager_new at ffffffffc08b594d [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nouveau_ttm.c: 76
#12 [ffffbfffc11a7b30] ttm_bo_mem_space at ffffffffc05af2ac [ttm]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/ttm/ttm_bo.c: 1065
#13 [ffffbfffc11a7b88] ttm_bo_validate at ffffffffc05afaca [ttm]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/ttm/ttm_bo.c: 1137
#14 [ffffbfffc11a7c18] ttm_bo_init_reserved at ffffffffc05afe70 [ttm]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/ttm/ttm_bo.c: 1330
#15 [ffffbfffc11a7c60] ttm_bo_init at ffffffffc05afff7 [ttm]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/ttm/ttm_bo.c: 1364
#16 [ffffbfffc11a7cc8] nouveau_bo_init at ffffffffc08b0f7b [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nouveau_bo.c: 317
#17 [ffffbfffc11a7d38] nouveau_gem_new at ffffffffc08b2f7b [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nouveau_gem.c: 206
#18 [ffffbfffc11a7d70] nouveau_gem_ioctl_new at ffffffffc08b3001 [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nouveau_gem.c: 272
#19 [ffffbfffc11a7da0] drm_ioctl_kernel at ffffffffc066f564 [drm]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/drm_ioctl.c: 793
#20 [ffffbfffc11a7de0] drm_ioctl at ffffffffc066f88e [drm]
    /backup/usr/local/src/kernel/linux-master-rt/./include/linux/uaccess.h: 168
#21 [ffffbfffc11a7ed0] nouveau_drm_ioctl at ffffffffc08abf56 [nouveau]
    /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nouveau_drm.c: 1163
#22 [ffffbfffc11a7f08] __x64_sys_ioctl at ffffffffb733255e
    /backup/usr/local/src/kernel/linux-master-rt/fs/ioctl.c: 49
#23 [ffffbfffc11a7f40] do_syscall_64 at ffffffffb79a25c3
    /backup/usr/local/src/kernel/linux-master-rt/arch/x86/entry/common.c: 46
#24 [ffffbfffc11a7f50] entry_SYSCALL_64_after_hwframe at ffffffffb7a0008c
    /backup/usr/local/src/kernel/linux-master-rt/arch/x86/entry/entry_64.S: 125
    RIP: 00007f96707a6ac7  RSP: 00007ffc1cbc2998  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000055743cf152e0  RCX: 00007f96707a6ac7
    RDX: 00007ffc1cbc29f0  RSI: 00000000c0306480  RDI: 000000000000000e
    RBP: 00007ffc1cbc29f0   R8: 0000000000000000   R9: 0000000000000003
    R10: fffffffffffffd98  R11: 0000000000000246  R12: 00000000c0306480
    R13: 000000000000000e  R14: 000055743ce99040  R15: 000055743c60cfd0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
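
The user-space registers in the dump above can be cross-checked against the backtrace. ORIG_RAX 0x10 is syscall 16, ioctl on x86_64, and RSI carries the ioctl request number 0xc0306480. A minimal sketch that splits that number into its fields, assuming the asm-generic ioctl bit layout used on x86 (8 bits nr, 8 bits type, 14 bits size, 2 bits direction); decode_ioctl() is a made-up helper name:

```python
# Field layout of an ioctl request number (asm-generic, as used on x86):
#   bits  0-7   command number (nr)
#   bits  8-15  type ("magic") character
#   bits 16-29  size of the argument structure in bytes
#   bits 30-31  direction, seen from user space: 1 = write, 2 = read
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read|write"}


def decode_ioctl(cmd):
    """Split an ioctl request number into nr, type, size and direction."""
    return {
        "nr": cmd & 0xFF,
        "type": chr((cmd >> 8) & 0xFF),
        "size": (cmd >> 16) & 0x3FFF,
        "dir": DIRECTIONS[(cmd >> 30) & 0x3],
    }


if __name__ == "__main__":
    # RSI value taken from the register dump above.
    fields = decode_ioctl(0xC0306480)
    print(f"nr=0x{fields['nr']:02x} type={fields['type']!r} "
          f"size={fields['size']} dir={fields['dir']}")
```

For 0xc0306480 this gives type 'd' (the DRM ioctl magic), nr 0x80, a 48 byte argument and direction read|write. Driver-private DRM ioctls start at 0x40, so nr 0x80 is private command 0x40, which is consistent with the nouveau_gem_ioctl_new frame in the backtrace, though that mapping is my reading of the numbers and not something crash printed.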


* Re: [ANNOUNCE] v5.9-rc3-rt3
  2020-09-09  5:45   ` Mike Galbraith
@ 2020-09-09  8:20     ` Sebastian Andrzej Siewior
  2020-09-09  8:56       ` Mike Galbraith
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-09-09  8:20 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On 2020-09-09 07:45:22 [+0200], Mike Galbraith wrote:
> On Wed, 2020-09-09 at 05:12 +0200, Mike Galbraith wrote:
> > On Wed, 2020-09-02 at 17:55 +0200, Sebastian Andrzej Siewior wrote:
> > >
> > > Known issues
> > >      - It has been pointed out that due to changes to the printk code the
> > >        internal buffer representation changed. This is only an issue if tools
> > >        like `crash' are used to extract the printk buffer from a kernel memory
> > >        image.
> >
> > Ouch.  While installing -rt5 on lappy via nfs, -rt5 server box exploded
> > leaving nada in logs.  I have a nifty crash dump of the event, but...
> 
> After convincing crash (with club) that it didn't _really_ need a
> log_buf, nfs had nothing to do with the crash, it was nouveau.

Okay. Line 280 is hard to understand. My guess is that we got a pointer
and then the boom occurred but I can't tell why/how. A few lines later
there is args->x = y…
Do you see the lockdep splat without nouveau?

> crash> bt -l
> PID: 2146   TASK: ffff994c7fad0000  CPU: 0   COMMAND: "X"
>  #0 [ffffbfffc11a76c8] machine_kexec at ffffffffb7064879
>     /backup/usr/local/src/kernel/linux-master-rt/./include/linux/ftrace.h: 792
>  #1 [ffffbfffc11a7710] __crash_kexec at ffffffffb7173622
>     /backup/usr/local/src/kernel/linux-master-rt/kernel/kexec_core.c: 963
>  #2 [ffffbfffc11a77d0] crash_kexec at ffffffffb7174920
>     /backup/usr/local/src/kernel/linux-master-rt/./arch/x86/include/asm/atomic.h: 41
>  #3 [ffffbfffc11a77e0] oops_end at ffffffffb702716f
>     /backup/usr/local/src/kernel/linux-master-rt/arch/x86/kernel/dumpstack.c: 342
>  #4 [ffffbfffc11a7800] exc_general_protection at ffffffffb79a2fc6
>     /backup/usr/local/src/kernel/linux-master-rt/arch/x86/kernel/traps.c: 82
>  #5 [ffffbfffc11a7890] asm_exc_general_protection at ffffffffb7a00a1e
>     /backup/usr/local/src/kernel/linux-master-rt/./arch/x86/include/asm/idtentry.h: 532
>  #6 [ffffbfffc11a78a0] nvif_object_ctor at ffffffffc07ee6a7 [nouveau]
>     /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nvif/object.c: 280
>  #7 [ffffbfffc11a7918] __kmalloc at ffffffffb72eea12
>     /backup/usr/local/src/kernel/linux-master-rt/mm/slub.c: 261
>  #8 [ffffbfffc11a7980] nvif_object_ctor at ffffffffc07ee6a7 [nouveau]
>     /backup/usr/local/src/kernel/linux-master-rt/drivers/gpu/drm/nouveau/nvif/object.c: 280

Sebastian

* Re: [ANNOUNCE] v5.9-rc3-rt3
  2020-09-09  8:20     ` Sebastian Andrzej Siewior
@ 2020-09-09  8:56       ` Mike Galbraith
  2020-09-09  8:59         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2020-09-09  8:56 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On Wed, 2020-09-09 at 10:20 +0200, Sebastian Andrzej Siewior wrote:
>
> Do you see the lockdep splat without nouveau?

Yeah.  Lappy uses i915, but lockdep also shuts itself off.

BTW, methinks RT had nothing to do with the nouveau burp.

	-Mike


* Re: [ANNOUNCE] v5.9-rc3-rt3
  2020-09-09  8:56       ` Mike Galbraith
@ 2020-09-09  8:59         ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-09-09  8:59 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On 2020-09-09 10:56:41 [+0200], Mike Galbraith wrote:
> On Wed, 2020-09-09 at 10:20 +0200, Sebastian Andrzej Siewior wrote:
> >
> > Do you see the lockdep splat without nouveau?
> 
> Yeah.  Lappy uses i915, but lockdep also shuts itself off.

You sent the config; I will throw it at kvm and actual hardware later
and see what happens.

> BTW, methinks RT had nothing to do with the nouveau burp.

That is good to hear :)

> 	-Mike

Sebastian

Thread overview: 17+ messages
2020-09-02 15:55 [ANNOUNCE] v5.9-rc3-rt3 Sebastian Andrzej Siewior
2020-09-05  4:47 ` v5.9-rc3-rt3 boot time networking lockdep splat Mike Galbraith
2020-09-05  5:19   ` Mike Galbraith
2020-09-08 15:12     ` Sebastian Andrzej Siewior
2020-09-08 15:59       ` Mike Galbraith
2020-09-08 16:02         ` Sebastian Andrzej Siewior
2020-09-08 12:19   ` Sebastian Andrzej Siewior
2020-09-08 14:56     ` Mike Galbraith
2020-09-08 15:06       ` Sebastian Andrzej Siewior
2020-09-08 16:19         ` Mike Galbraith
2020-09-09  2:39         ` Mike Galbraith
2020-09-09  3:12 ` [ANNOUNCE] v5.9-rc3-rt3 Mike Galbraith
2020-09-09  5:07   ` Mike Galbraith
2020-09-09  5:45   ` Mike Galbraith
2020-09-09  8:20     ` Sebastian Andrzej Siewior
2020-09-09  8:56       ` Mike Galbraith
2020-09-09  8:59         ` Sebastian Andrzej Siewior
