* Re: [ANNOUNCE] v5.9.1-rt18
[not found] <20201021125324.ualpvrxvzyie6d7d@linutronix.de>
@ 2020-10-21 13:14 ` Sebastian Andrzej Siewior
2020-10-27 6:53 ` Fernando Lopez-Lezcano
2020-10-22 5:21 ` ltp or kvm triggerable lockdep alloc_pid() deadlock gripe Mike Galbraith
2020-10-22 5:28 ` kvm+nouveau induced lockdep gripe Mike Galbraith
2 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-21 13:14 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: LKML, linux-rt-users, Steven Rostedt
On 2020-10-21 14:53:27 [+0200], To Thomas Gleixner wrote:
> Dear RT folks!
>
> I'm pleased to announce the v5.9.1-rt18 patch set.
>
> Changes since v5.9.1-rt17:
>
> - Update the migrate-disable series by Peter Zijlstra to v3. Also
> include fixes discussed in the thread.
>
> - UP builds did not boot since the replacement of the migrate-disable
> code. Reported by Christian Egger. Fixed as a part of v3 by Peter
> Zijlstra.
>
> - Rebase the printk code on top of the ring buffer designed for
> printk which was merged in the v5.10 merge window. Patches by John
> Ogness.
>
> Known issues
> - It has been pointed out that due to changes to the printk code the
> internal buffer representation changed. This is only an issue if tools
> like `crash' are used to extract the printk buffer from a kernel memory
> image.
>
> The delta patch against v5.9.1-rt17 is appended below and can be found here:
>
> https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/incr/patch-5.9.1-rt17-rt18.patch.xz
>
> You can get this release via the git tree at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git v5.9.1-rt18
>
> The RT patch against v5.9.1 can be found here:
>
> https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/older/patch-5.9.1-rt18.patch.xz
>
> The split quilt queue is available at:
>
> https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/older/patches-5.9.1-rt18.tar.xz
>
The attached diff was too large and the mail was dropped. It is
available at
https://git.kernel.org/rt/linux-rt-devel/d/v5.9.1-rt18/v5.9.1-rt17
Sebastian
^ permalink raw reply [flat|nested] 22+ messages in thread
* ltp or kvm triggerable lockdep alloc_pid() deadlock gripe
[not found] <20201021125324.ualpvrxvzyie6d7d@linutronix.de>
2020-10-21 13:14 ` [ANNOUNCE] v5.9.1-rt18 Sebastian Andrzej Siewior
@ 2020-10-22 5:21 ` Mike Galbraith
2020-10-22 16:44 ` Sebastian Andrzej Siewior
2020-10-22 5:28 ` kvm+nouveau induced lockdep gripe Mike Galbraith
2 siblings, 1 reply; 22+ messages in thread
From: Mike Galbraith @ 2020-10-22 5:21 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Thomas Gleixner
Cc: LKML, linux-rt-users, Steven Rostedt
[-- Attachment #1: Type: text/plain, Size: 5886 bytes --]
Greetings,
The gripe below is repeatable in two ways here: boot with nomodeset (so
nouveau doesn't steal the lockdep show) and then fire up one of my
(oink) full distro VMs, or run ./runltp -f cpuset from an ltp directory
with the attached subset of the controllers file placed in the
./runtest dir.
Method 2 may lead to a real-deal deadlock; I've got a crashdump of one,
with stack traces of the uninterruptible sleepers attached.
[ 154.927302] ======================================================
[ 154.927303] WARNING: possible circular locking dependency detected
[ 154.927304] 5.9.1-rt18-rt #5 Tainted: G S E
[ 154.927305] ------------------------------------------------------
[ 154.927306] cpuset_inherit_/4992 is trying to acquire lock:
[ 154.927307] ffff9d334c5e64d8 (&s->seqcount){+.+.}-{0:0}, at: __slab_alloc.isra.87+0xad/0xc0
[ 154.927317]
but task is already holding lock:
[ 154.927317] ffffffffac4052d0 (pidmap_lock){+.+.}-{2:2}, at: alloc_pid+0x1fb/0x510
[ 154.927324]
which lock already depends on the new lock.
[ 154.927324]
the existing dependency chain (in reverse order) is:
[ 154.927325]
-> #1 (pidmap_lock){+.+.}-{2:2}:
[ 154.927328] lock_acquire+0x92/0x410
[ 154.927331] rt_spin_lock+0x2b/0xc0
[ 154.927335] free_pid+0x27/0xc0
[ 154.927338] release_task+0x34a/0x640
[ 154.927340] do_exit+0x6e9/0xcf0
[ 154.927342] kthread+0x11c/0x190
[ 154.927344] ret_from_fork+0x1f/0x30
[ 154.927347]
-> #0 (&s->seqcount){+.+.}-{0:0}:
[ 154.927350] validate_chain+0x981/0x1250
[ 154.927352] __lock_acquire+0x86f/0xbd0
[ 154.927354] lock_acquire+0x92/0x410
[ 154.927356] ___slab_alloc+0x71b/0x820
[ 154.927358] __slab_alloc.isra.87+0xad/0xc0
[ 154.927359] kmem_cache_alloc+0x700/0x8c0
[ 154.927361] radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927365] idr_get_free+0x207/0x2b0
[ 154.927367] idr_alloc_u32+0x54/0xa0
[ 154.927369] idr_alloc_cyclic+0x4f/0xa0
[ 154.927370] alloc_pid+0x22b/0x510
[ 154.927372] copy_process+0xeb5/0x1de0
[ 154.927375] _do_fork+0x52/0x750
[ 154.927377] __do_sys_clone+0x64/0x70
[ 154.927379] do_syscall_64+0x33/0x40
[ 154.927382] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 154.927384]
other info that might help us debug this:
[ 154.927384] Possible unsafe locking scenario:
[ 154.927385] CPU0 CPU1
[ 154.927386] ---- ----
[ 154.927386] lock(pidmap_lock);
[ 154.927388] lock(&s->seqcount);
[ 154.927389] lock(pidmap_lock);
[ 154.927391] lock(&s->seqcount);
[ 154.927392]
*** DEADLOCK ***
[ 154.927393] 4 locks held by cpuset_inherit_/4992:
[ 154.927394] #0: ffff9d33decea5b0 ((lock).lock){+.+.}-{2:2}, at: __radix_tree_preload+0x52/0x3b0
[ 154.927399] #1: ffffffffac598fa0 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[ 154.927405] #2: ffffffffac4052d0 (pidmap_lock){+.+.}-{2:2}, at: alloc_pid+0x1fb/0x510
[ 154.927409] #3: ffffffffac598fa0 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[ 154.927414]
stack backtrace:
[ 154.927416] CPU: 3 PID: 4992 Comm: cpuset_inherit_ Kdump: loaded Tainted: G S E 5.9.1-rt18-rt #5
[ 154.927418] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 154.927419] Call Trace:
[ 154.927422] dump_stack+0x77/0x9b
[ 154.927425] check_noncircular+0x148/0x160
[ 154.927432] ? validate_chain+0x981/0x1250
[ 154.927435] validate_chain+0x981/0x1250
[ 154.927441] __lock_acquire+0x86f/0xbd0
[ 154.927446] lock_acquire+0x92/0x410
[ 154.927449] ? __slab_alloc.isra.87+0xad/0xc0
[ 154.927452] ? kmem_cache_alloc+0x648/0x8c0
[ 154.927453] ? lock_acquire+0x92/0x410
[ 154.927458] ___slab_alloc+0x71b/0x820
[ 154.927460] ? __slab_alloc.isra.87+0xad/0xc0
[ 154.927463] ? radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927468] ? __slab_alloc.isra.87+0x83/0xc0
[ 154.927472] ? radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927474] ? __slab_alloc.isra.87+0xad/0xc0
[ 154.927476] __slab_alloc.isra.87+0xad/0xc0
[ 154.927480] ? radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927482] kmem_cache_alloc+0x700/0x8c0
[ 154.927487] radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927491] idr_get_free+0x207/0x2b0
[ 154.927495] idr_alloc_u32+0x54/0xa0
[ 154.927500] idr_alloc_cyclic+0x4f/0xa0
[ 154.927503] alloc_pid+0x22b/0x510
[ 154.927506] ? copy_thread+0x88/0x200
[ 154.927512] copy_process+0xeb5/0x1de0
[ 154.927520] _do_fork+0x52/0x750
[ 154.927523] ? lock_acquire+0x92/0x410
[ 154.927525] ? __might_fault+0x3e/0x90
[ 154.927530] ? find_held_lock+0x2d/0x90
[ 154.927535] __do_sys_clone+0x64/0x70
[ 154.927541] do_syscall_64+0x33/0x40
[ 154.927544] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 154.927546] RIP: 0033:0x7f0b357356e3
[ 154.927548] Code: db 45 85 ed 0f 85 ad 01 00 00 64 4c 8b 04 25 10 00 00 00 31 d2 4d 8d 90 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 f1 00 00 00 85 c0 41 89 c4 0f 85 fe 00 00
[ 154.927550] RSP: 002b:00007ffdfd6d15f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 154.927552] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0b357356e3
[ 154.927554] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 154.927555] RBP: 00007ffdfd6d1620 R08: 00007f0b36052b80 R09: 0000000000000072
[ 154.927556] R10: 00007f0b36052e50 R11: 0000000000000246 R12: 0000000000000000
[ 154.927557] R13: 0000000000000000 R14: 0000000000000000 R15: 00005614ef57ecf0
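The "Possible unsafe locking scenario" above is the classic AB-BA inversion: the free_pid() path established the order &s->seqcount then pidmap_lock, while alloc_pid() holds pidmap_lock when a slab allocation takes &s->seqcount. Lockdep catches this by recording "A was held while acquiring B" edges between lock classes and refusing any acquisition that would close a cycle. A toy model of just that check (illustrative Python, not kernel code; the method and class names are assumptions for the sketch):

```python
# Toy model of lockdep's dependency graph: record "A was held while
# acquiring B" edges, and flag any acquisition that would close a cycle.

class LockdepError(Exception):
    pass

class Lockdep:
    def __init__(self):
        # lock class -> set of classes acquired while it was held
        self.deps = {}

    def acquire(self, held, new):
        # If 'new' already reaches some currently held lock in the graph,
        # the reverse ordering was recorded earlier: AB-BA inversion.
        for h in held:
            if self._reaches(new, h):
                raise LockdepError(
                    f"possible circular locking dependency: {new} -> {h}")
            self.deps.setdefault(h, set()).add(new)

    def _reaches(self, src, dst):
        # Depth-first search along recorded ordering edges.
        seen, stack = set(), [src]
        while stack:
            n = stack.pop()
            if n == dst:
                return True
            if n not in seen:
                seen.add(n)
                stack.extend(self.deps.get(n, ()))
        return False

ld = Lockdep()
# Earlier: free_pid() takes pidmap_lock under the slab's seqcount.
ld.acquire({"s->seqcount"}, "pidmap_lock")
try:
    # Now: alloc_pid() holds pidmap_lock and the slab takes the seqcount.
    ld.acquire({"pidmap_lock"}, "s->seqcount")
except LockdepError as e:
    print(e)
```

Real lockdep additionally tracks IRQ contexts, read/write variants, and per-class state; this sketch only shows the cycle check that produces the "circular locking dependency" warning.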
[-- Attachment #2: cpuset --]
[-- Type: text/plain, Size: 587 bytes --]
#DESCRIPTION:Resource Management testing
cpuset_base_ops cpuset_base_ops_testset.sh
cpuset_inherit cpuset_inherit_testset.sh
cpuset_exclusive cpuset_exclusive_test.sh
cpuset_hierarchy cpuset_hierarchy_test.sh
cpuset_syscall cpuset_syscall_testset.sh
cpuset_sched_domains cpuset_sched_domains_test.sh
cpuset_load_balance cpuset_load_balance_test.sh
cpuset_hotplug cpuset_hotplug_test.sh
cpuset_memory cpuset_memory_testset.sh
cpuset_memory_pressure cpuset_memory_pressure_testset.sh
cpuset_memory_spread cpuset_memory_spread_testset.sh
cpuset_regression_test cpuset_regression_test.sh
[-- Attachment #3: deadlock-log --]
[-- Type: text/plain, Size: 14019 bytes --]
1 0 2 ffffa09c87fd0000 UN 0.1 221032 9368 systemd
627 1 0 ffffa09f733c0000 UN 0.0 68392 7004 systemd-udevd
3322 3247 7 ffffa09f512051c0 UN 0.0 167624 2732 gpg-agent
3841 3468 2 ffffa09f32250000 UN 0.1 19912 9828 bash
4209 3468 2 ffffa09dd070d1c0 UN 0.1 19912 9796 bash
4845 3335 3 ffffa09f2ffe1b40 UN 0.1 268172 24880 file.so
4846 3335 5 ffffa09f2d418000 UN 0.1 268172 24880 file.so
5657 1 3 ffffa09f222e3680 UN 1.5 2884604 260248 Thread (pooled)
6716 5797 3 ffffa09f30da1b40 UN 0.0 14128 4168 cpuset_hotplug_
6743 1 3 ffffa09f54151b40 UN 0.1 574864 18532 pool-/usr/lib/x
6744 1 4 ffffa09f357f9b40 UN 0.0 489516 6096 pool-/usr/lib/x
PID: 1 TASK: ffffa09c87fd0000 CPU: 2 COMMAND: "systemd"
#0 [ffffbfbb00033c50] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb00033cd8] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb00033ce8] __rt_mutex_slowlock+56 at ffffffffa59b3868
#3 [ffffbfbb00033d30] rt_mutex_slowlock_locked+207 at ffffffffa59b3abf
#4 [ffffbfbb00033d88] rt_mutex_slowlock.constprop.30+90 at ffffffffa59b3d3a
#5 [ffffbfbb00033e00] proc_cgroup_show+74 at ffffffffa5184a7a
#6 [ffffbfbb00033e48] proc_single_show+84 at ffffffffa53cd524
#7 [ffffbfbb00033e80] seq_read+206 at ffffffffa534c30e
#8 [ffffbfbb00033ed8] vfs_read+209 at ffffffffa531d281
#9 [ffffbfbb00033f08] ksys_read+135 at ffffffffa531d637
#10 [ffffbfbb00033f40] do_syscall_64+51 at ffffffffa59a35c3
#11 [ffffbfbb00033f50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f1bb97dd1d8 RSP: 00007ffd5f424ac0 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 000055e3b1224cc0 RCX: 00007f1bb97dd1d8
RDX: 0000000000000400 RSI: 000055e3b1224cc0 RDI: 0000000000000055
RBP: 0000000000000400 R8: 0000000000000000 R9: 0000000000000000
R10: 00007f1bbb1f9940 R11: 0000000000000246 R12: 00007f1bb9aa57a0
R13: 00007f1bb9aa62e0 R14: 0000000000000000 R15: 000055e3b1377f20
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
PID: 627 TASK: ffffa09f733c0000 CPU: 0 COMMAND: "systemd-udevd"
#0 [ffffbfbb00b6fc38] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb00b6fcc0] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb00b6fcc8] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb00b6fd28] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb00b6fd40] cgroup_can_fork+1321 at ffffffffa5185e69
#5 [ffffbfbb00b6fd88] copy_process+4457 at ffffffffa508aab9
#6 [ffffbfbb00b6fe30] _do_fork+82 at ffffffffa508b882
#7 [ffffbfbb00b6fed0] __do_sys_clone+100 at ffffffffa508c054
#8 [ffffbfbb00b6ff40] do_syscall_64+51 at ffffffffa59a35c3
#9 [ffffbfbb00b6ff50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007fa6261286e3 RSP: 00007ffc0a16daf0 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007ffc0a16daf0 RCX: 00007fa6261286e3
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffc0a16db40 R8: 00007fa6272cfd40 R9: 0000000000000001
R10: 00007fa6272d0010 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000557a3d4a9a10
ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
PID: 3322 TASK: ffffa09f512051c0 CPU: 7 COMMAND: "gpg-agent"
#0 [ffffbfbb00ef7c38] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb00ef7cc0] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb00ef7cc8] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb00ef7d28] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb00ef7d40] cgroup_can_fork+1321 at ffffffffa5185e69
#5 [ffffbfbb00ef7d88] copy_process+4457 at ffffffffa508aab9
#6 [ffffbfbb00ef7e30] _do_fork+82 at ffffffffa508b882
#7 [ffffbfbb00ef7ed0] __do_sys_clone+100 at ffffffffa508c054
#8 [ffffbfbb00ef7f40] do_syscall_64+51 at ffffffffa59a35c3
#9 [ffffbfbb00ef7f50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f4355ee7fb1 RSP: 00007ffdf3130358 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: 00007f4355be7700 RCX: 00007f4355ee7fb1
RDX: 00007f4355be79d0 RSI: 00007f4355be6fb0 RDI: 00000000003d0f00
RBP: 00007ffdf3130760 R8: 00007f4355be7700 R9: 00007f4355be7700
R10: 00007f4355be79d0 R11: 0000000000000202 R12: 00007ffdf31303fe
R13: 00007ffdf31303ff R14: 000055667da0f9e0 R15: 00007ffdf3130760
ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
PID: 3841 TASK: ffffa09f32250000 CPU: 2 COMMAND: "bash"
#0 [ffffbfbb02d0fc38] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb02d0fcc0] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb02d0fcc8] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb02d0fd28] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb02d0fd40] cgroup_can_fork+1321 at ffffffffa5185e69
#5 [ffffbfbb02d0fd88] copy_process+4457 at ffffffffa508aab9
#6 [ffffbfbb02d0fe30] _do_fork+82 at ffffffffa508b882
#7 [ffffbfbb02d0fed0] __do_sys_clone+100 at ffffffffa508c054
#8 [ffffbfbb02d0ff40] do_syscall_64+51 at ffffffffa59a35c3
#9 [ffffbfbb02d0ff50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f023eaf36e3 RSP: 00007ffe80698a30 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f023eaf36e3
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffe80698a60 R8: 00007f023f410b80 R9: 0000000000000000
R10: 00007f023f410e50 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000562cd4ee6c90 R15: 0000562cd4ee6c90
ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
PID: 4209 TASK: ffffa09dd070d1c0 CPU: 2 COMMAND: "bash"
#0 [ffffbfbb0376fc38] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb0376fcc0] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb0376fcc8] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb0376fd28] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb0376fd40] cgroup_can_fork+1321 at ffffffffa5185e69
#5 [ffffbfbb0376fd88] copy_process+4457 at ffffffffa508aab9
#6 [ffffbfbb0376fe30] _do_fork+82 at ffffffffa508b882
#7 [ffffbfbb0376fed0] __do_sys_clone+100 at ffffffffa508c054
#8 [ffffbfbb0376ff40] do_syscall_64+51 at ffffffffa59a35c3
#9 [ffffbfbb0376ff50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f81a96426e3 RSP: 00007ffd97e5ebc0 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f81a96426e3
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffd97e5ebf0 R8: 00007f81a9f5fb80 R9: 0000000000000000
R10: 00007f81a9f5fe50 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 000055a56fcb9c90 R15: 000055a56fcb9c90
ORIG_RAX: 0000000000000038 CS: 0033 SS: 002b
PID: 4845 TASK: ffffa09f2ffe1b40 CPU: 3 COMMAND: "file.so"
#0 [ffffbfbb03223d88] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb03223e10] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb03223e18] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb03223e78] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb03223e90] exit_signals+711 at ffffffffa50a2f27
#5 [ffffbfbb03223ea8] do_exit+216 at ffffffffa5093ef8
#6 [ffffbfbb03223f10] do_group_exit+71 at ffffffffa5094bb7
#7 [ffffbfbb03223f38] __x64_sys_exit_group+20 at ffffffffa5094c34
#8 [ffffbfbb03223f40] do_syscall_64+51 at ffffffffa59a35c3
#9 [ffffbfbb03223f50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f0ca6b20998 RSP: 00007ffc5e1eef48 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0ca6b20998
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00007f0ca6e0d510 R8: 00000000000000e7 R9: ffffffffffffff60
R10: 00007f0ca5fd10f8 R11: 0000000000000246 R12: 00007f0ca6e0d510
R13: 00007f0ca6e0d8c0 R14: 00007ffc5e1eefe0 R15: 0000000000000020
ORIG_RAX: 00000000000000e7 CS: 0033 SS: 002b
PID: 4846 TASK: ffffa09f2d418000 CPU: 5 COMMAND: "file.so"
#0 [ffffbfbb0396fd88] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb0396fe10] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb0396fe18] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb0396fe78] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb0396fe90] exit_signals+711 at ffffffffa50a2f27
#5 [ffffbfbb0396fea8] do_exit+216 at ffffffffa5093ef8
#6 [ffffbfbb0396ff10] do_group_exit+71 at ffffffffa5094bb7
#7 [ffffbfbb0396ff38] __x64_sys_exit_group+20 at ffffffffa5094c34
#8 [ffffbfbb0396ff40] do_syscall_64+51 at ffffffffa59a35c3
#9 [ffffbfbb0396ff50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f0ca6b20998 RSP: 00007ffc5e1eef48 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0ca6b20998
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00007f0ca6e0d510 R8: 00000000000000e7 R9: ffffffffffffff60
R10: 00007f0ca5fd10f8 R11: 0000000000000246 R12: 00007f0ca6e0d510
R13: 00007f0ca6e0d8c0 R14: 00007ffc5e1eefe0 R15: 0000000000000020
ORIG_RAX: 00000000000000e7 CS: 0033 SS: 002b
PID: 5657 TASK: ffffa09f222e3680 CPU: 3 COMMAND: "Thread (pooled)"
#0 [ffffbfbb03a07db0] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb03a07e38] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb03a07e40] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb03a07ea0] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb03a07eb8] exit_signals+711 at ffffffffa50a2f27
#5 [ffffbfbb03a07ed0] do_exit+216 at ffffffffa5093ef8
#6 [ffffbfbb03a07f38] __x64_sys_exit+23 at ffffffffa5094b67
#7 [ffffbfbb03a07f40] do_syscall_64+51 at ffffffffa59a35c3
#8 [ffffbfbb03a07f50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f1f2f9265b6 RSP: 00007f1e95057d50 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007f1e95058700 RCX: 00007f1f2f9265b6
RDX: 000000000000003c RSI: 00007f1f2fb38010 RDI: 0000000000000000
RBP: 0000000000000000 R8: 00007f1e880029c0 R9: 0000000000000000
R10: 0000000000000020 R11: 0000000000000246 R12: 00007ffd7582f27e
R13: 00007ffd7582f27f R14: 000055fce0818180 R15: 00007ffd7582f350
ORIG_RAX: 000000000000003c CS: 0033 SS: 002b
PID: 6716 TASK: ffffa09f30da1b40 CPU: 3 COMMAND: "cpuset_hotplug_"
#0 [ffffbfbb03ac79d8] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb03ac7a60] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb03ac7a70] schedule_timeout+495 at ffffffffa59b4cbf
#3 [ffffbfbb03ac7b08] wait_for_completion+165 at ffffffffa59b2f25
#4 [ffffbfbb03ac7b48] affine_move_task+705 at ffffffffa50cc231
#5 [ffffbfbb03ac7c88] __set_cpus_allowed_ptr+274 at ffffffffa50cc562
#6 [ffffbfbb03ac7cc8] cpuset_attach+195 at ffffffffa518df73
#7 [ffffbfbb03ac7d00] cgroup_migrate_execute+1133 at ffffffffa518075d
#8 [ffffbfbb03ac7d58] cgroup_attach_task+524 at ffffffffa5180abc
#9 [ffffbfbb03ac7e28] __cgroup1_procs_write.constprop.21+243 at ffffffffa5187843
#10 [ffffbfbb03ac7e68] cgroup_file_write+126 at ffffffffa517b07e
#11 [ffffbfbb03ac7ea0] kernfs_fop_write+275 at ffffffffa53e1c13
#12 [ffffbfbb03ac7ed8] vfs_write+240 at ffffffffa531d470
#13 [ffffbfbb03ac7f08] ksys_write+135 at ffffffffa531d737
#14 [ffffbfbb03ac7f40] do_syscall_64+51 at ffffffffa59a35c3
#15 [ffffbfbb03ac7f50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f0176b84244 RSP: 00007ffffcf7e2b8 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f0176b84244
RDX: 0000000000000005 RSI: 000055b15203d890 RDI: 0000000000000001
RBP: 000055b15203d890 R8: 000000000000000a R9: 0000000000000000
R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000005
R13: 0000000000000001 R14: 00007f0176e4c5a0 R15: 0000000000000005
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
PID: 6743 TASK: ffffa09f54151b40 CPU: 3 COMMAND: "pool-/usr/lib/x"
#0 [ffffbfbb08657db0] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb08657e38] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb08657e40] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb08657ea0] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb08657eb8] exit_signals+711 at ffffffffa50a2f27
#5 [ffffbfbb08657ed0] do_exit+216 at ffffffffa5093ef8
#6 [ffffbfbb08657f38] __x64_sys_exit+23 at ffffffffa5094b67
#7 [ffffbfbb08657f40] do_syscall_64+51 at ffffffffa59a35c3
#8 [ffffbfbb08657f50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007f0176d6b5b6 RSP: 00007f015d540dd0 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007f015d541700 RCX: 00007f0176d6b5b6
RDX: 000000000000003c RSI: 00007f0176f7d010 RDI: 0000000000000000
RBP: 0000000000000000 R8: 00007f01500008c0 R9: 0000000000000004
R10: 000055ee65f185d0 R11: 0000000000000246 R12: 00007ffc71eef56e
R13: 00007ffc71eef56f R14: 00007f01640038f0 R15: 00007ffc71eef600
ORIG_RAX: 000000000000003c CS: 0033 SS: 002b
PID: 6744 TASK: ffffa09f357f9b40 CPU: 4 COMMAND: "pool-/usr/lib/x"
#0 [ffffbfbb0865fdb0] __schedule+837 at ffffffffa59b15f5
#1 [ffffbfbb0865fe38] schedule+86 at ffffffffa59b1d96
#2 [ffffbfbb0865fe40] percpu_rwsem_wait+181 at ffffffffa5101b75
#3 [ffffbfbb0865fea0] __percpu_down_read+114 at ffffffffa5101ec2
#4 [ffffbfbb0865feb8] exit_signals+711 at ffffffffa50a2f27
#5 [ffffbfbb0865fed0] do_exit+216 at ffffffffa5093ef8
#6 [ffffbfbb0865ff38] __x64_sys_exit+23 at ffffffffa5094b67
#7 [ffffbfbb0865ff40] do_syscall_64+51 at ffffffffa59a35c3
#8 [ffffbfbb0865ff50] entry_SYSCALL_64_after_hwframe+68 at ffffffffa5a0008c
RIP: 00007ff49b0c85b6 RSP: 00007ff493ffedd0 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007ff493fff700 RCX: 00007ff49b0c85b6
RDX: 000000000000003c RSI: 00007ff49b2da010 RDI: 0000000000000000
RBP: 0000000000000000 R8: 00007ff488001600 R9: 0000000000000000
R10: 0000000000000050 R11: 0000000000000246 R12: 00007ffdc88c817e
R13: 00007ffdc88c817f R14: 00007ff48c0069e0 R15: 00007ffdc88c8210
ORIG_RAX: 000000000000003c CS: 0033 SS: 002b
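The sleepers above share one choke point: nearly all are parked in percpu_rwsem_wait() on the cgroup fork/exit paths (cgroup_can_fork(), exit_signals()), while cpuset_hotplug_ sits in cgroup_attach_task() with the write side pending. A writer-priority rwsem strands every later reader behind one stuck writer. A minimal single-threaded model of that pile-up (illustrative only; task names are taken loosely from the log, and the real percpu_rwsem is far more involved):

```python
# Toy model of a writer-priority rwsem: once a writer is queued, new
# readers wait behind it, so a stuck writer strands every later reader.

class RWSem:
    def __init__(self):
        self.active_readers = 0
        self.writer_waiting = False
        self.blocked = []  # tasks that could not get the lock

    def down_read(self, task):
        if self.writer_waiting:
            self.blocked.append(task)   # reader queues behind the writer
            return False
        self.active_readers += 1
        return True

    def down_write(self, task):
        self.writer_waiting = True      # blocks all future readers
        if self.active_readers:
            self.blocked.append(task)   # writer waits for active readers
            return False
        return True

sem = RWSem()
sem.down_read("in-flight fork")          # an existing reader
sem.down_write("cgroup_attach_task")     # writer queues, cannot proceed
for t in ("systemd", "systemd-udevd", "bash"):
    sem.down_read(t)                     # every later fork/exit piles up
```

If the writer can never finish (as in the crashdump, where it is itself stuck in affine_move_task()), the blocked list only grows, which matches the uninterruptible-sleep pile-up above.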
* kvm+nouveau induced lockdep gripe
[not found] <20201021125324.ualpvrxvzyie6d7d@linutronix.de>
2020-10-21 13:14 ` [ANNOUNCE] v5.9.1-rt18 Sebastian Andrzej Siewior
2020-10-22 5:21 ` ltp or kvm triggerable lockdep alloc_pid() deadlock gripe Mike Galbraith
@ 2020-10-22 5:28 ` Mike Galbraith
2020-10-23 9:01 ` Sebastian Andrzej Siewior
2 siblings, 1 reply; 22+ messages in thread
From: Mike Galbraith @ 2020-10-22 5:28 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Thomas Gleixner
Cc: LKML, linux-rt-users, Steven Rostedt
So far I've only seen nouveau lockdep gripage when firing up one of my
full distro KVMs.
[ 91.655613] ======================================================
[ 91.655614] WARNING: possible circular locking dependency detected
[ 91.655614] 5.9.1-rt18-rt #5 Tainted: G S E
[ 91.655615] ------------------------------------------------------
[ 91.655615] libvirtd/1868 is trying to acquire lock:
[ 91.655616] ffff918554b801c0 (&mm->mmap_lock#2){++++}-{0:0}, at: mpol_rebind_mm+0x1e/0x50
[ 91.655622]
but task is already holding lock:
[ 91.655622] ffffffff995b6c80 (&cpuset_rwsem){++++}-{0:0}, at: cpuset_attach+0x38/0x390
[ 91.655625]
which lock already depends on the new lock.
[ 91.655625]
the existing dependency chain (in reverse order) is:
[ 91.655625]
-> #3 (&cpuset_rwsem){++++}-{0:0}:
[ 91.655626] lock_acquire+0x92/0x410
[ 91.655629] cpuset_read_lock+0x39/0xf0
[ 91.655630] __sched_setscheduler+0x4be/0xaf0
[ 91.655632] _sched_setscheduler+0x69/0x70
[ 91.655633] __kthread_create_on_node+0x114/0x170
[ 91.655634] kthread_create_on_node+0x37/0x40
[ 91.655635] setup_irq_thread+0x37/0x90
[ 91.655637] __setup_irq+0x4de/0x7b0
[ 91.655637] request_threaded_irq+0xf8/0x160
[ 91.655638] nvkm_pci_oneinit+0x4c/0x70 [nouveau]
[ 91.655674] nvkm_subdev_init+0x60/0x1e0 [nouveau]
[ 91.655689] nvkm_device_init+0x10b/0x240 [nouveau]
[ 91.655716] nvkm_udevice_init+0x47/0x70 [nouveau]
[ 91.655742] nvkm_object_init+0x3d/0x180 [nouveau]
[ 91.655755] nvkm_ioctl_new+0x1a1/0x260 [nouveau]
[ 91.655768] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 91.655779] nvif_object_ctor+0xeb/0x150 [nouveau]
[ 91.655790] nvif_device_ctor+0x1f/0x60 [nouveau]
[ 91.655801] nouveau_cli_init+0x1dc/0x5c0 [nouveau]
[ 91.655826] nouveau_drm_device_init+0x66/0x810 [nouveau]
[ 91.655850] nouveau_drm_probe+0xfb/0x200 [nouveau]
[ 91.655873] local_pci_probe+0x42/0x90
[ 91.655875] pci_device_probe+0xe7/0x1a0
[ 91.655876] really_probe+0xf7/0x4d0
[ 91.655877] driver_probe_device+0x5d/0x140
[ 91.655878] device_driver_attach+0x4f/0x60
[ 91.655879] __driver_attach+0xa2/0x140
[ 91.655880] bus_for_each_dev+0x67/0x90
[ 91.655881] bus_add_driver+0x192/0x230
[ 91.655882] driver_register+0x5b/0xf0
[ 91.655883] do_one_initcall+0x56/0x3c4
[ 91.655884] do_init_module+0x5b/0x21c
[ 91.655886] load_module+0x1cc7/0x2430
[ 91.655887] __do_sys_finit_module+0xa7/0xe0
[ 91.655888] do_syscall_64+0x33/0x40
[ 91.655889] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 91.655890]
-> #2 (&device->mutex){+.+.}-{0:0}:
[ 91.655891] lock_acquire+0x92/0x410
[ 91.655893] _mutex_lock+0x28/0x40
[ 91.655895] nvkm_udevice_fini+0x21/0x70 [nouveau]
[ 91.655919] nvkm_object_fini+0xb8/0x210 [nouveau]
[ 91.655931] nvkm_object_fini+0x73/0x210 [nouveau]
[ 91.655943] nvkm_ioctl_del+0x7e/0xa0 [nouveau]
[ 91.655954] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 91.655966] nvif_object_dtor+0x4a/0x60 [nouveau]
[ 91.655976] nvif_client_dtor+0xe/0x40 [nouveau]
[ 91.655986] nouveau_cli_fini+0x78/0x90 [nouveau]
[ 91.656010] nouveau_drm_postclose+0xa6/0xe0 [nouveau]
[ 91.656033] drm_file_free.part.9+0x27e/0x2d0 [drm]
[ 91.656045] drm_release+0x6f/0xf0 [drm]
[ 91.656052] __fput+0xb2/0x260
[ 91.656053] task_work_run+0x73/0xc0
[ 91.656055] exit_to_user_mode_prepare+0x204/0x230
[ 91.656056] syscall_exit_to_user_mode+0x4a/0x330
[ 91.656057] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 91.656058]
-> #1 (&cli->lock){+.+.}-{0:0}:
[ 91.656059] lock_acquire+0x92/0x410
[ 91.656060] _mutex_lock+0x28/0x40
[ 91.656061] nouveau_mem_fini+0x4a/0x70 [nouveau]
[ 91.656086] ttm_tt_destroy+0x22/0x70 [ttm]
[ 91.656089] ttm_bo_cleanup_memtype_use+0x32/0xa0 [ttm]
[ 91.656091] ttm_bo_put+0xe7/0x670 [ttm]
[ 91.656093] ttm_bo_vm_close+0x15/0x30 [ttm]
[ 91.656096] remove_vma+0x3e/0x70
[ 91.656097] __do_munmap+0x2b7/0x4f0
[ 91.656098] __vm_munmap+0x5b/0xa0
[ 91.656098] __x64_sys_munmap+0x27/0x30
[ 91.656099] do_syscall_64+0x33/0x40
[ 91.656100] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 91.656100]
-> #0 (&mm->mmap_lock#2){++++}-{0:0}:
[ 91.656102] validate_chain+0x981/0x1250
[ 91.656103] __lock_acquire+0x86f/0xbd0
[ 91.656104] lock_acquire+0x92/0x410
[ 91.656105] down_write+0x3b/0x50
[ 91.656106] mpol_rebind_mm+0x1e/0x50
[ 91.656108] cpuset_attach+0x229/0x390
[ 91.656109] cgroup_migrate_execute+0x46d/0x490
[ 91.656111] cgroup_attach_task+0x20c/0x430
[ 91.656112] __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 91.656113] cgroup_file_write+0x7e/0x1a0
[ 91.656114] kernfs_fop_write+0x113/0x1b0
[ 91.656116] vfs_write+0xf0/0x230
[ 91.656116] ksys_write+0x87/0xc0
[ 91.656117] do_syscall_64+0x33/0x40
[ 91.656118] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 91.656118]
other info that might help us debug this:
[ 91.656119] Chain exists of:
&mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem
[ 91.656120] Possible unsafe locking scenario:
[ 91.656120] CPU0 CPU1
[ 91.656120] ---- ----
[ 91.656121] lock(&cpuset_rwsem);
[ 91.656121] lock(&device->mutex);
[ 91.656122] lock(&cpuset_rwsem);
[ 91.656122] lock(&mm->mmap_lock#2);
[ 91.656123]
*** DEADLOCK ***
[ 91.656123] 6 locks held by libvirtd/1868:
[ 91.656124] #0: ffff9186df6f5f88 (&f->f_pos_lock){+.+.}-{0:0}, at: __fdget_pos+0x46/0x50
[ 91.656127] #1: ffff918553b695f8 (sb_writers#7){.+.+}-{0:0}, at: vfs_write+0x1c1/0x230
[ 91.656128] #2: ffff91873fb950a8 (&of->mutex){+.+.}-{0:0}, at: kernfs_fop_write+0xde/0x1b0
[ 91.656130] #3: ffffffff995b2cc8 (cgroup_mutex){+.+.}-{3:3}, at: cgroup_kn_lock_live+0xe8/0x1d0
[ 91.656133] #4: ffffffff995b2900 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: cgroup_procs_write_start+0x6e/0x200
[ 91.656135] #5: ffffffff995b6c80 (&cpuset_rwsem){++++}-{0:0}, at: cpuset_attach+0x38/0x390
[ 91.656137]
stack backtrace:
[ 91.656137] CPU: 6 PID: 1868 Comm: libvirtd Kdump: loaded Tainted: G S E 5.9.1-rt18-rt #5
[ 91.656138] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 91.656139] Call Trace:
[ 91.656141] dump_stack+0x77/0x9b
[ 91.656143] check_noncircular+0x148/0x160
[ 91.656144] ? validate_chain+0x9d6/0x1250
[ 91.656146] ? validate_chain+0x981/0x1250
[ 91.656147] validate_chain+0x981/0x1250
[ 91.656150] __lock_acquire+0x86f/0xbd0
[ 91.656151] lock_acquire+0x92/0x410
[ 91.656152] ? mpol_rebind_mm+0x1e/0x50
[ 91.656155] down_write+0x3b/0x50
[ 91.656156] ? mpol_rebind_mm+0x1e/0x50
[ 91.656157] mpol_rebind_mm+0x1e/0x50
[ 91.656158] cpuset_attach+0x229/0x390
[ 91.656160] cgroup_migrate_execute+0x46d/0x490
[ 91.656162] cgroup_attach_task+0x20c/0x430
[ 91.656165] ? __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 91.656166] __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 91.656168] cgroup_file_write+0x7e/0x1a0
[ 91.656169] kernfs_fop_write+0x113/0x1b0
[ 91.656171] vfs_write+0xf0/0x230
[ 91.656172] ksys_write+0x87/0xc0
[ 91.656173] ? lockdep_hardirqs_on+0x78/0x100
[ 91.656174] do_syscall_64+0x33/0x40
[ 91.656175] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 91.656176] RIP: 0033:0x7fbfe3bc0deb
[ 91.656178] Code: 53 48 89 d5 48 89 f3 48 83 ec 18 48 89 7c 24 08 e8 5a fd ff ff 48 89 ea 41 89 c0 48 89 de 48 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 90 fd ff ff 48
[ 91.656179] RSP: 002b:00007fbfd94f72f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 91.656179] RAX: ffffffffffffffda RBX: 00007fbfc8048b20 RCX: 00007fbfe3bc0deb
[ 91.656180] RDX: 0000000000000004 RSI: 00007fbfc8048b20 RDI: 000000000000001f
[ 91.656180] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[ 91.656181] R10: 0000000000000000 R11: 0000000000000293 R12: 00007fbfc8048b20
[ 91.656181] R13: 0000000000000000 R14: 000000000000001f R15: 0000000000000214
* Re: ltp or kvm triggerable lockdep alloc_pid() deadlock gripe
2020-10-22 5:21 ` ltp or kvm triggerable lockdep alloc_pid() deadlock gripe Mike Galbraith
@ 2020-10-22 16:44 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-22 16:44 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt
On 2020-10-22 07:21:13 [+0200], Mike Galbraith wrote:
> Greetings,
Hi,
> The gripe below is repeatable in two ways here: boot with nomodeset (so
> nouveau doesn't steal the lockdep show) and then fire up one of my
> (oink) full distro VMs, or run ./runltp -f cpuset from an ltp directory
> with the attached subset of the controllers file placed in the
> ./runtest dir.
>
> Method 2 may lead to a real-deal deadlock; I've got a crashdump of one,
> with stack traces of the uninterruptible sleepers attached.
Just added commit
267580db047ef ("seqlock: Unbreak lockdep")
and it is gone.
Sebastian
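The lock class in the original gripe, &s->seqcount, is the write side of a seqcount; it shows up in lockdep chains because the write side is annotated as a lock, and commit 267580db047ef ("seqlock: Unbreak lockdep") adjusted exactly that annotation. For reference, a minimal single-threaded sketch of the seqcount protocol itself (illustrative Python; the real kernel primitive adds memory barriers and the lockdep hooks discussed here):

```python
# Minimal sketch of the seqlock protocol: the writer makes the sequence
# counter odd while updating and even when done; readers retry if the
# counter changed (or was odd) across their read.

class SeqData:
    def __init__(self):
        self.seq = 0
        self.a = self.b = 0

    def write(self, a, b):
        self.seq += 1          # odd: write in progress
        self.a, self.b = a, b
        self.seq += 1          # even: write complete

    def read(self):
        while True:
            start = self.seq
            if start & 1:      # writer active, retry
                continue
            a, b = self.a, self.b
            if self.seq == start:   # no writer raced us
                return a, b
```

Readers never block, but because writers can spin readers indefinitely, the write side has lock-like ordering requirements, which is why lockdep tracks it and why it can appear in dependency chains like the one in the report.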
* Re: kvm+nouveau induced lockdep gripe
2020-10-22 5:28 ` kvm+nouveau induced lockdep gripe Mike Galbraith
@ 2020-10-23 9:01 ` Sebastian Andrzej Siewior
2020-10-23 12:07 ` Mike Galbraith
[not found] ` <20201024022236.19608-1-hdanton@sina.com>
0 siblings, 2 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-23 9:01 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt
On 2020-10-22 07:28:20 [+0200], Mike Galbraith wrote:
> So far I've only seen nouveau lockdep gripage when firing up one of my
> full distro KVMs.
Could you please check !RT with the `threadirqs' command line option? I
don't think RT is doing anything different here (except for having
threaded interrupts enabled by default).
Sebastian
* Re: kvm+nouveau induced lockdep gripe
2020-10-23 9:01 ` Sebastian Andrzej Siewior
@ 2020-10-23 12:07 ` Mike Galbraith
[not found] ` <20201024022236.19608-1-hdanton@sina.com>
1 sibling, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2020-10-23 12:07 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt,
Ben Skeggs, nouveau
On Fri, 2020-10-23 at 11:01 +0200, Sebastian Andrzej Siewior wrote:
> On 2020-10-22 07:28:20 [+0200], Mike Galbraith wrote:
> > I've only as yet seen nouveau lockdep gripage when firing up one of my
> > full distro KVM's.
>
> Could you please check !RT with the `threadirqs' command-line option? I
> don't think RT is doing anything different here (except for having
> threaded interrupts enabled by default).
Yup, you are correct, RT is innocent.
[ 70.135201] ======================================================
[ 70.135206] WARNING: possible circular locking dependency detected
[ 70.135211] 5.9.0.gf989335-master #1 Tainted: G E
[ 70.135216] ------------------------------------------------------
[ 70.135220] libvirtd/1838 is trying to acquire lock:
[ 70.135225] ffff983590c2d5a8 (&mm->mmap_lock#2){++++}-{3:3}, at: mpol_rebind_mm+0x1e/0x50
[ 70.135239]
but task is already holding lock:
[ 70.135244] ffffffff8a585410 (&cpuset_rwsem){++++}-{0:0}, at: cpuset_attach+0x38/0x390
[ 70.135256]
which lock already depends on the new lock.
[ 70.135261]
the existing dependency chain (in reverse order) is:
[ 70.135266]
-> #3 (&cpuset_rwsem){++++}-{0:0}:
[ 70.135275] cpuset_read_lock+0x39/0xd0
[ 70.135282] __sched_setscheduler+0x456/0xa90
[ 70.135287] _sched_setscheduler+0x69/0x70
[ 70.135292] __kthread_create_on_node+0x114/0x170
[ 70.135297] kthread_create_on_node+0x37/0x40
[ 70.135306] setup_irq_thread+0x37/0x90
[ 70.135312] __setup_irq+0x4e0/0x7c0
[ 70.135318] request_threaded_irq+0xf8/0x160
[ 70.135371] nvkm_pci_oneinit+0x4c/0x70 [nouveau]
[ 70.135399] nvkm_subdev_init+0x60/0x1e0 [nouveau]
[ 70.135449] nvkm_device_init+0x10b/0x240 [nouveau]
[ 70.135506] nvkm_udevice_init+0x49/0x70 [nouveau]
[ 70.135531] nvkm_object_init+0x3d/0x180 [nouveau]
[ 70.135555] nvkm_ioctl_new+0x1a1/0x260 [nouveau]
[ 70.135578] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 70.135600] nvif_object_ctor+0xeb/0x150 [nouveau]
[ 70.135622] nvif_device_ctor+0x1f/0x60 [nouveau]
[ 70.135668] nouveau_cli_init+0x1ac/0x590 [nouveau]
[ 70.135711] nouveau_drm_device_init+0x68/0x800 [nouveau]
[ 70.135753] nouveau_drm_probe+0xfb/0x200 [nouveau]
[ 70.135761] local_pci_probe+0x42/0x90
[ 70.135767] pci_device_probe+0xe7/0x1a0
[ 70.135773] really_probe+0xf7/0x4d0
[ 70.135779] driver_probe_device+0x5d/0x140
[ 70.135785] device_driver_attach+0x4f/0x60
[ 70.135790] __driver_attach+0xa4/0x140
[ 70.135796] bus_for_each_dev+0x67/0x90
[ 70.135801] bus_add_driver+0x18c/0x230
[ 70.135807] driver_register+0x5b/0xf0
[ 70.135813] do_one_initcall+0x54/0x2f0
[ 70.135819] do_init_module+0x5b/0x21b
[ 70.135825] load_module+0x1e40/0x2370
[ 70.135830] __do_sys_finit_module+0x98/0xe0
[ 70.135836] do_syscall_64+0x33/0x40
[ 70.135842] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 70.135847]
-> #2 (&device->mutex){+.+.}-{3:3}:
[ 70.135857] __mutex_lock+0x90/0x9c0
[ 70.135902] nvkm_udevice_fini+0x23/0x70 [nouveau]
[ 70.135927] nvkm_object_fini+0xb8/0x210 [nouveau]
[ 70.135951] nvkm_object_fini+0x73/0x210 [nouveau]
[ 70.135974] nvkm_ioctl_del+0x7e/0xa0 [nouveau]
[ 70.135997] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 70.136019] nvif_object_dtor+0x4a/0x60 [nouveau]
[ 70.136040] nvif_client_dtor+0xe/0x40 [nouveau]
[ 70.136085] nouveau_cli_fini+0x7a/0x90 [nouveau]
[ 70.136128] nouveau_drm_postclose+0xaa/0xe0 [nouveau]
[ 70.136150] drm_file_free.part.7+0x273/0x2c0 [drm]
[ 70.136165] drm_release+0x6e/0xf0 [drm]
[ 70.136171] __fput+0xb2/0x260
[ 70.136177] task_work_run+0x73/0xc0
[ 70.136183] exit_to_user_mode_prepare+0x1a5/0x1d0
[ 70.136189] syscall_exit_to_user_mode+0x46/0x2a0
[ 70.136195] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 70.136200]
-> #1 (&cli->lock){+.+.}-{3:3}:
[ 70.136209] __mutex_lock+0x90/0x9c0
[ 70.136252] nouveau_mem_fini+0x4c/0x70 [nouveau]
[ 70.136294] nouveau_sgdma_destroy+0x20/0x50 [nouveau]
[ 70.136302] ttm_bo_cleanup_memtype_use+0x3e/0x60 [ttm]
[ 70.136310] ttm_bo_release+0x29c/0x600 [ttm]
[ 70.136317] ttm_bo_vm_close+0x15/0x30 [ttm]
[ 70.136324] remove_vma+0x3e/0x70
[ 70.136329] __do_munmap+0x2b7/0x4f0
[ 70.136333] __vm_munmap+0x5b/0xa0
[ 70.136338] __x64_sys_munmap+0x27/0x30
[ 70.136343] do_syscall_64+0x33/0x40
[ 70.136349] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 70.136354]
-> #0 (&mm->mmap_lock#2){++++}-{3:3}:
[ 70.136365] __lock_acquire+0x149d/0x1a70
[ 70.136371] lock_acquire+0x1a7/0x3b0
[ 70.136376] down_write+0x38/0x70
[ 70.136382] mpol_rebind_mm+0x1e/0x50
[ 70.136387] cpuset_attach+0x229/0x390
[ 70.136393] cgroup_migrate_execute+0x46d/0x490
[ 70.136398] cgroup_attach_task+0x20c/0x3b0
[ 70.136404] __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 70.136411] cgroup_file_write+0x64/0x210
[ 70.136416] kernfs_fop_write+0x117/0x1b0
[ 70.136422] vfs_write+0xe8/0x240
[ 70.136427] ksys_write+0x87/0xc0
[ 70.136432] do_syscall_64+0x33/0x40
[ 70.136438] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 70.136443]
other info that might help us debug this:
[ 70.136450] Chain exists of:
&mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem
[ 70.136463] Possible unsafe locking scenario:
[ 70.136469]        CPU0                    CPU1
[ 70.136473]        ----                    ----
[ 70.136477]   lock(&cpuset_rwsem);
[ 70.136483]                               lock(&device->mutex);
[ 70.136489]                               lock(&cpuset_rwsem);
[ 70.136495]   lock(&mm->mmap_lock#2);
[ 70.136501]
*** DEADLOCK ***
[ 70.136508] 6 locks held by libvirtd/1838:
[ 70.136512] #0: ffff98359eb27af0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x45/0x50
[ 70.136524] #1: ffff983591a58460 (sb_writers#7){.+.+}-{0:0}, at: vfs_write+0x1aa/0x240
[ 70.136535] #2: ffff9835bbf50488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write+0xe2/0x1b0
[ 70.136545] #3: ffffffff8a581848 (cgroup_mutex){+.+.}-{3:3}, at: cgroup_kn_lock_live+0xea/0x1d0
[ 70.136556] #4: ffffffff8a5816b0 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: cgroup_procs_write_start+0x6e/0x200
[ 70.136567] #5: ffffffff8a585410 (&cpuset_rwsem){++++}-{0:0}, at: cpuset_attach+0x38/0x390
[ 70.136579]
stack backtrace:
[ 70.136585] CPU: 2 PID: 1838 Comm: libvirtd Kdump: loaded Tainted: G E 5.9.0.gf989335-master #1
[ 70.136592] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 70.136598] Call Trace:
[ 70.136605] dump_stack+0x77/0x97
[ 70.136611] check_noncircular+0xe7/0x100
[ 70.136618] ? stack_trace_save+0x3b/0x50
[ 70.136626] ? __lock_acquire+0x149d/0x1a70
[ 70.136631] __lock_acquire+0x149d/0x1a70
[ 70.136640] lock_acquire+0x1a7/0x3b0
[ 70.136645] ? mpol_rebind_mm+0x1e/0x50
[ 70.136652] down_write+0x38/0x70
[ 70.136657] ? mpol_rebind_mm+0x1e/0x50
[ 70.136663] mpol_rebind_mm+0x1e/0x50
[ 70.136669] cpuset_attach+0x229/0x390
[ 70.136675] cgroup_migrate_execute+0x46d/0x490
[ 70.136681] ? _raw_spin_unlock_irq+0x2f/0x50
[ 70.136688] cgroup_attach_task+0x20c/0x3b0
[ 70.136702] ? __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 70.136712] __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 70.136722] cgroup_file_write+0x64/0x210
[ 70.136728] kernfs_fop_write+0x117/0x1b0
[ 70.136735] vfs_write+0xe8/0x240
[ 70.136741] ksys_write+0x87/0xc0
[ 70.136746] ? lockdep_hardirqs_on+0x85/0x110
[ 70.136752] do_syscall_64+0x33/0x40
[ 70.136758] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 70.136764] RIP: 0033:0x7efc17533deb
[ 70.136770] Code: 53 48 89 d5 48 89 f3 48 83 ec 18 48 89 7c 24 08 e8 5a fd ff ff 48 89 ea 41 89 c0 48 89 de 48 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 90 fd ff ff 48
[ 70.136781] RSP: 002b:00007efc0d66b2f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 70.136788] RAX: ffffffffffffffda RBX: 00007efbf80500f0 RCX: 00007efc17533deb
[ 70.136794] RDX: 0000000000000004 RSI: 00007efbf80500f0 RDI: 000000000000001f
[ 70.136799] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[ 70.136805] R10: 0000000000000000 R11: 0000000000000293 R12: 00007efbf80500f0
[ 70.136811] R13: 0000000000000000 R14: 000000000000001f R15: 0000000000000214
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: kvm+nouveau induced lockdep gripe
[not found] ` <20201024022236.19608-1-hdanton@sina.com>
@ 2020-10-24 3:38 ` Mike Galbraith
[not found] ` <20201024050000.8104-1-hdanton@sina.com>
1 sibling, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2020-10-24 3:38 UTC (permalink / raw)
To: Hillf Danton, Sebastian Andrzej Siewior
Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt,
Ben Skeggs, nouveau
On Sat, 2020-10-24 at 10:22 +0800, Hillf Danton wrote:
>
> Looks like we can break the lock chain by moving ttm bo's release
> method out of mmap_lock, see diff below.
Ah, the perfect complement to morning java, a patchlet to wedge in and
see what happens.
wedge/build/boot <schlurp... ahhh>
Mmm, box says no banana... a lot.
[ 30.456921] ================================
[ 30.456924] WARNING: inconsistent lock state
[ 30.456928] 5.9.0.gf11901e-master #2 Tainted: G S E
[ 30.456932] --------------------------------
[ 30.456935] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 30.456940] ksoftirqd/4/36 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 30.456944] ffff8e2c8bde9e40 (&mgr->vm_lock){++?+}-{2:2}, at: drm_vma_offset_remove+0x14/0x70 [drm]
[ 30.456976] {SOFTIRQ-ON-W} state was registered at:
[ 30.456982] lock_acquire+0x1a7/0x3b0
[ 30.456987] _raw_write_lock+0x2f/0x40
[ 30.457006] drm_vma_offset_add+0x1c/0x60 [drm]
[ 30.457013] ttm_bo_init_reserved+0x28b/0x460 [ttm]
[ 30.457020] ttm_bo_init+0x57/0x110 [ttm]
[ 30.457066] nouveau_bo_init+0xb0/0xc0 [nouveau]
[ 30.457108] nouveau_bo_new+0x4d/0x60 [nouveau]
[ 30.457145] nv84_fence_create+0xb9/0x130 [nouveau]
[ 30.457180] nvc0_fence_create+0xe/0x47 [nouveau]
[ 30.457221] nouveau_drm_device_init+0x3d9/0x800 [nouveau]
[ 30.457262] nouveau_drm_probe+0xfb/0x200 [nouveau]
[ 30.457268] local_pci_probe+0x42/0x90
[ 30.457272] pci_device_probe+0xe7/0x1a0
[ 30.457276] really_probe+0xf7/0x4d0
[ 30.457280] driver_probe_device+0x5d/0x140
[ 30.457284] device_driver_attach+0x4f/0x60
[ 30.457288] __driver_attach+0xa4/0x140
[ 30.457292] bus_for_each_dev+0x67/0x90
[ 30.457296] bus_add_driver+0x18c/0x230
[ 30.457299] driver_register+0x5b/0xf0
[ 30.457304] do_one_initcall+0x54/0x2f0
[ 30.457309] do_init_module+0x5b/0x21b
[ 30.457314] load_module+0x1e40/0x2370
[ 30.457317] __do_sys_finit_module+0x98/0xe0
[ 30.457321] do_syscall_64+0x33/0x40
[ 30.457326] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 30.457329] irq event stamp: 366850
[ 30.457335] hardirqs last enabled at (366850): [<ffffffffa11312ff>] rcu_nocb_unlock_irqrestore+0x4f/0x60
[ 30.457342] hardirqs last disabled at (366849): [<ffffffffa11384ef>] rcu_do_batch+0x59f/0x990
[ 30.457347] softirqs last enabled at (366834): [<ffffffffa1c002d7>] __do_softirq+0x2d7/0x4a4
[ 30.457357] softirqs last disabled at (366839): [<ffffffffa10928c2>] run_ksoftirqd+0x32/0x60
[ 30.457363]
other info that might help us debug this:
[ 30.457369] Possible unsafe locking scenario:
[ 30.457375]        CPU0
[ 30.457378]        ----
[ 30.457381]   lock(&mgr->vm_lock);
[ 30.457386]   <Interrupt>
[ 30.457389]     lock(&mgr->vm_lock);
[ 30.457394]
*** DEADLOCK ***
<snips 999 lockdep lines and zillion ATOMIC_SLEEP gripes>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: kvm+nouveau induced lockdep gripe
[not found] ` <20201024050000.8104-1-hdanton@sina.com>
@ 2020-10-24 5:25 ` Mike Galbraith
[not found] ` <20201024094224.2804-1-hdanton@sina.com>
2020-10-26 17:31 ` Sebastian Andrzej Siewior
2 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2020-10-24 5:25 UTC (permalink / raw)
To: Hillf Danton
Cc: Sebastian Andrzej Siewior, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On Sat, 2020-10-24 at 13:00 +0800, Hillf Danton wrote:
> On Sat, 24 Oct 2020 05:38:23 +0200 Mike Galbraith wrote:
> > On Sat, 2020-10-24 at 10:22 +0800, Hillf Danton wrote:
> > >
> > > Looks like we can break the lock chain by moving ttm bo's release
> > > method out of mmap_lock, see diff below.
> >
> > Ah, the perfect complement to morning java, a patchlet to wedge in and
> > see what happens.
> >
> > wedge/build/boot <schlurp... ahhh>
> >
> > Mmm, box says no banana... a lot.
>
> Hmm...curious how that word went into your mind. And when?
There's a colloquial expression "close, but no cigar", and a variant
"close, but no banana". The intended communication being "not quite".
-Mike
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: kvm+nouveau induced lockdep gripe
[not found] ` <20201024094224.2804-1-hdanton@sina.com>
@ 2020-10-26 17:26 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-26 17:26 UTC (permalink / raw)
To: Hillf Danton
Cc: Mike Galbraith, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On 2020-10-24 17:42:24 [+0800], Hillf Danton wrote:
> Hmm...sounds like you learned English neither at high school nor
> college. When did you start learning it?
He learned enough to submit a valid bug report, to which you could reply:
Thank you, Mike!
Sebastian
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: kvm+nouveau induced lockdep gripe
[not found] ` <20201024050000.8104-1-hdanton@sina.com>
2020-10-24 5:25 ` Mike Galbraith
[not found] ` <20201024094224.2804-1-hdanton@sina.com>
@ 2020-10-26 17:31 ` Sebastian Andrzej Siewior
2020-10-26 19:15 ` Mike Galbraith
2 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-26 17:31 UTC (permalink / raw)
To: Hillf Danton
Cc: Mike Galbraith, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On 2020-10-24 13:00:00 [+0800], Hillf Danton wrote:
>
> Hmm...curious how that word went into your mind. And when?
> > [ 30.457363]
> > other info that might help us debug this:
> > [ 30.457369] Possible unsafe locking scenario:
> >
> > [ 30.457375] CPU0
> > [ 30.457378] ----
> > [ 30.457381] lock(&mgr->vm_lock);
> > [ 30.457386] <Interrupt>
> > [ 30.457389] lock(&mgr->vm_lock);
> > [ 30.457394]
> > *** DEADLOCK ***
> >
> > <snips 999 lockdep lines and zillion ATOMIC_SLEEP gripes>
The backtrace contained the "normal" vm_lock. What should follow is the
backtrace of the in-softirq usage.
>
> Dunno if blocking softint is a right cure.
>
> --- a/drivers/gpu/drm/drm_vma_manager.c
> +++ b/drivers/gpu/drm/drm_vma_manager.c
> @@ -229,6 +229,7 @@ EXPORT_SYMBOL(drm_vma_offset_add);
> void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr,
> struct drm_vma_offset_node *node)
> {
> + local_bh_disable();
There is write_lock_bh(). However, changing only one user will produce
the same backtrace somewhere else unless all other users already run in
a BH-disabled region.
> write_lock(&mgr->vm_lock);
>
> if (drm_mm_node_allocated(&node->vm_node)) {
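For reference, the _bh variant would look roughly like this (an illustrative, non-compilable sketch of drm_vma_offset_remove(), not a complete patch; per the point above, every other vm_lock user would need the same treatment):

```c
/* Sketch only: write_lock_bh() combines local_bh_disable() with
 * write_lock(), so the writer can no longer be interrupted by the
 * softirq user that lockdep flagged.  All other readers/writers of
 * vm_lock would need the matching _bh variants as well, otherwise
 * the same inconsistent-lock-state splat just moves elsewhere. */
void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr,
			   struct drm_vma_offset_node *node)
{
	write_lock_bh(&mgr->vm_lock);
	if (drm_mm_node_allocated(&node->vm_node)) {
		drm_mm_remove_node(&node->vm_node);
		memset(&node->vm_node, 0, sizeof(node->vm_node));
	}
	write_unlock_bh(&mgr->vm_lock);
}
```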
Sebastian
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: kvm+nouveau induced lockdep gripe
2020-10-26 17:31 ` Sebastian Andrzej Siewior
@ 2020-10-26 19:15 ` Mike Galbraith
2020-10-26 19:53 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 22+ messages in thread
From: Mike Galbraith @ 2020-10-26 19:15 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Hillf Danton
Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt, Ben Skeggs
On Mon, 2020-10-26 at 18:31 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-10-24 13:00:00 [+0800], Hillf Danton wrote:
> >
> > Hmm...curious how that word went into your mind. And when?
> > > [ 30.457363]
> > > other info that might help us debug this:
> > > [ 30.457369] Possible unsafe locking scenario:
> > >
> > > [ 30.457375] CPU0
> > > [ 30.457378] ----
> > > [ 30.457381] lock(&mgr->vm_lock);
> > > [ 30.457386] <Interrupt>
> > > [ 30.457389] lock(&mgr->vm_lock);
> > > [ 30.457394]
> > > *** DEADLOCK ***
> > >
> > > <snips 999 lockdep lines and zillion ATOMIC_SLEEP gripes>
>
> The backtrace contained the "normal" vm_lock. What should follow is the
> backtrace of the in-softirq usage.
>
> >
> > Dunno if blocking softint is a right cure.
> >
> > --- a/drivers/gpu/drm/drm_vma_manager.c
> > +++ b/drivers/gpu/drm/drm_vma_manager.c
> > @@ -229,6 +229,7 @@ EXPORT_SYMBOL(drm_vma_offset_add);
> > void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr,
> > struct drm_vma_offset_node *node)
> > {
> > + local_bh_disable();
>
> There is write_lock_bh(). However, changing only one user will produce
> the same backtrace somewhere else unless all other users already run in
> a BH-disabled region.
Since there doesn't _seem_ to be a genuine deadlock lurking, I just
asked lockdep to please not log the annoying initialization-time
chain.
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -116,7 +116,17 @@ nvkm_pci_oneinit(struct nvkm_subdev *sub
return ret;
}
+ /*
+ * Scheduler code taking cpuset_rwsem during irq thread initialization sets
+ * up a cpuset_rwsem vs mm->mmap_lock circular dependency gripe upon later
+ * cpuset usage. It's harmless, tell lockdep there's nothing to see here.
+ */
+ if (force_irqthreads)
+ lockdep_off();
ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
+ if (force_irqthreads)
+ lockdep_on();
+
if (ret)
return ret;
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: kvm+nouveau induced lockdep gripe
2020-10-26 19:15 ` Mike Galbraith
@ 2020-10-26 19:53 ` Sebastian Andrzej Siewior
2020-10-27 6:03 ` Mike Galbraith
0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-26 19:53 UTC (permalink / raw)
To: Mike Galbraith
Cc: Hillf Danton, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On 2020-10-26 20:15:23 [+0100], Mike Galbraith wrote:
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
> @@ -116,7 +116,17 @@ nvkm_pci_oneinit(struct nvkm_subdev *sub
> return ret;
> }
>
> + /*
> + * Scheduler code taking cpuset_rwsem during irq thread initialization sets
> + * up a cpuset_rwsem vs mm->mmap_lock circular dependency gripe upon later
> + * cpuset usage. It's harmless, tell lockdep there's nothing to see here.
> + */
> + if (force_irqthreads)
> + lockdep_off();
> ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
> + if (force_irqthreads)
> + lockdep_on();
> +
> if (ret)
> return ret;
>
Could you try this, please?
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1155,6 +1155,8 @@ static int irq_thread(void *data)
irqreturn_t (*handler_fn)(struct irq_desc *desc,
struct irqaction *action);
+ sched_set_fifo(current);
+
if (force_irqthreads && test_bit(IRQTF_FORCED_THREAD,
&action->thread_flags))
handler_fn = irq_forced_thread_fn;
@@ -1320,8 +1322,6 @@ setup_irq_thread(struct irqaction *new,
if (IS_ERR(t))
return PTR_ERR(t);
- sched_set_fifo(t);
-
/*
* We keep the reference to the task struct even if
* the thread dies to avoid that the interrupt code
Sebastian
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: kvm+nouveau induced lockdep gripe
2020-10-26 19:53 ` Sebastian Andrzej Siewior
@ 2020-10-27 6:03 ` Mike Galbraith
2020-10-27 9:00 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 22+ messages in thread
From: Mike Galbraith @ 2020-10-27 6:03 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Hillf Danton, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On Mon, 2020-10-26 at 20:53 +0100, Sebastian Andrzej Siewior wrote:
>
> Could you try this, please?
No go: the first call of sched_setscheduler() is via kthread_create(). I
confirmed that nuking that (gratuitous user-foot-saving override) call
on top of moving sched_set_fifo() does shut it up, but that won't fly.
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1155,6 +1155,8 @@ static int irq_thread(void *data)
> irqreturn_t (*handler_fn)(struct irq_desc *desc,
> struct irqaction *action);
>
> + sched_set_fifo(current);
> +
> if (force_irqthreads && test_bit(IRQTF_FORCED_THREAD,
> &action->thread_flags))
> handler_fn = irq_forced_thread_fn;
> @@ -1320,8 +1322,6 @@ setup_irq_thread(struct irqaction *new,
> if (IS_ERR(t))
> return PTR_ERR(t);
>
> - sched_set_fifo(t);
> -
> /*
> * We keep the reference to the task struct even if
> * the thread dies to avoid that the interrupt code
>
[ 150.926954] ======================================================
[ 150.926967] WARNING: possible circular locking dependency detected
[ 150.926981] 5.10.0.g3650b22-master #13 Tainted: G S E
[ 150.926993] ------------------------------------------------------
[ 150.927005] libvirtd/1833 is trying to acquire lock:
[ 150.927016] ffff921a45ed55a8 (&mm->mmap_lock#2){++++}-{3:3}, at: mpol_rebind_mm+0x1e/0x50
[ 150.927052]
but task is already holding lock:
[ 150.927064] ffffffffad585410 (&cpuset_rwsem){++++}-{0:0}, at: cpuset_attach+0x38/0x390
[ 150.927094]
which lock already depends on the new lock.
[ 150.927108]
the existing dependency chain (in reverse order) is:
[ 150.927122]
-> #3 (&cpuset_rwsem){++++}-{0:0}:
[ 150.927145] cpuset_read_lock+0x39/0xd0
[ 150.927160] __sched_setscheduler+0x456/0xa90
[ 150.927173] _sched_setscheduler+0x69/0x70
[ 150.927187] __kthread_create_on_node+0x114/0x170
[ 150.927205] kthread_create_on_node+0x37/0x40
[ 150.927223] setup_irq_thread+0x33/0xb0
[ 150.927238] __setup_irq+0x4e0/0x7c0
[ 150.927254] request_threaded_irq+0xf8/0x160
[ 150.927393] nvkm_pci_oneinit+0x4c/0x70 [nouveau]
[ 150.927469] nvkm_subdev_init+0x60/0x1e0 [nouveau]
[ 150.927603] nvkm_device_init+0x10b/0x240 [nouveau]
[ 150.927733] nvkm_udevice_init+0x49/0x70 [nouveau]
[ 150.927804] nvkm_object_init+0x3d/0x180 [nouveau]
[ 150.927871] nvkm_ioctl_new+0x1a1/0x260 [nouveau]
[ 150.927938] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 150.928001] nvif_object_ctor+0xeb/0x150 [nouveau]
[ 150.928069] nvif_device_ctor+0x1f/0x60 [nouveau]
[ 150.928201] nouveau_cli_init+0x1ac/0x590 [nouveau]
[ 150.928332] nouveau_drm_device_init+0x68/0x800 [nouveau]
[ 150.928462] nouveau_drm_probe+0xfb/0x200 [nouveau]
[ 150.928483] local_pci_probe+0x42/0x90
[ 150.928501] pci_device_probe+0xe7/0x1a0
[ 150.928519] really_probe+0xf7/0x4d0
[ 150.928536] driver_probe_device+0x5d/0x140
[ 150.928552] device_driver_attach+0x4f/0x60
[ 150.928569] __driver_attach+0xa4/0x140
[ 150.928584] bus_for_each_dev+0x67/0x90
[ 150.928600] bus_add_driver+0x18c/0x230
[ 150.928616] driver_register+0x5b/0xf0
[ 150.928633] do_one_initcall+0x54/0x2f0
[ 150.928650] do_init_module+0x5b/0x21b
[ 150.928667] load_module+0x1e40/0x2370
[ 150.928682] __do_sys_finit_module+0x98/0xe0
[ 150.928698] do_syscall_64+0x33/0x40
[ 150.928716] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 150.928731]
-> #2 (&device->mutex){+.+.}-{3:3}:
[ 150.928761] __mutex_lock+0x90/0x9c0
[ 150.928895] nvkm_udevice_fini+0x23/0x70 [nouveau]
[ 150.928975] nvkm_object_fini+0xb8/0x210 [nouveau]
[ 150.929048] nvkm_object_fini+0x73/0x210 [nouveau]
[ 150.929119] nvkm_ioctl_del+0x7e/0xa0 [nouveau]
[ 150.929191] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 150.929258] nvif_object_dtor+0x4a/0x60 [nouveau]
[ 150.929326] nvif_client_dtor+0xe/0x40 [nouveau]
[ 150.929455] nouveau_cli_fini+0x7a/0x90 [nouveau]
[ 150.929586] nouveau_drm_postclose+0xaa/0xe0 [nouveau]
[ 150.929638] drm_file_free.part.7+0x273/0x2c0 [drm]
[ 150.929680] drm_release+0x6e/0xf0 [drm]
[ 150.929697] __fput+0xb2/0x260
[ 150.929714] task_work_run+0x73/0xc0
[ 150.929732] exit_to_user_mode_prepare+0x1a5/0x1d0
[ 150.929749] syscall_exit_to_user_mode+0x46/0x2a0
[ 150.929767] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 150.929782]
-> #1 (&cli->lock){+.+.}-{3:3}:
[ 150.929811] __mutex_lock+0x90/0x9c0
[ 150.929936] nouveau_mem_fini+0x4c/0x70 [nouveau]
[ 150.930062] nouveau_sgdma_destroy+0x20/0x50 [nouveau]
[ 150.930086] ttm_bo_cleanup_memtype_use+0x3e/0x60 [ttm]
[ 150.930109] ttm_bo_release+0x29c/0x600 [ttm]
[ 150.930130] ttm_bo_vm_close+0x15/0x30 [ttm]
[ 150.930150] remove_vma+0x3e/0x70
[ 150.930166] __do_munmap+0x2b7/0x4f0
[ 150.930181] __vm_munmap+0x5b/0xa0
[ 150.930195] __x64_sys_munmap+0x27/0x30
[ 150.930210] do_syscall_64+0x33/0x40
[ 150.930227] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 150.930241]
-> #0 (&mm->mmap_lock#2){++++}-{3:3}:
[ 150.930275] __lock_acquire+0x149d/0x1a70
[ 150.930292] lock_acquire+0x1a7/0x3b0
[ 150.930307] down_write+0x38/0x70
[ 150.930323] mpol_rebind_mm+0x1e/0x50
[ 150.930340] cpuset_attach+0x229/0x390
[ 150.930355] cgroup_migrate_execute+0x46d/0x490
[ 150.930371] cgroup_attach_task+0x20c/0x3b0
[ 150.930388] __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 150.930407] cgroup_file_write+0x64/0x210
[ 150.930423] kernfs_fop_write+0x117/0x1b0
[ 150.930440] vfs_write+0xe8/0x240
[ 150.930456] ksys_write+0x87/0xc0
[ 150.930471] do_syscall_64+0x33/0x40
[ 150.930487] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 150.930501]
other info that might help us debug this:
[ 150.930522] Chain exists of:
&mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem
[ 150.930561] Possible unsafe locking scenario:
[ 150.930577]        CPU0                    CPU1
[ 150.930589]        ----                    ----
[ 150.930602]   lock(&cpuset_rwsem);
[ 150.930617]                               lock(&device->mutex);
[ 150.930635]                               lock(&cpuset_rwsem);
[ 150.930653]   lock(&mm->mmap_lock#2);
[ 150.930671]
*** DEADLOCK ***
[ 150.930690] 6 locks held by libvirtd/1833:
[ 150.930703] #0: ffff921a6f1690f0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x45/0x50
[ 150.930736] #1: ffff921c88c1d460 (sb_writers#7){.+.+}-{0:0}, at: vfs_write+0x1aa/0x240
[ 150.930771] #2: ffff921a48734488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write+0xe2/0x1b0
[ 150.930801] #3: ffffffffad581848 (cgroup_mutex){+.+.}-{3:3}, at: cgroup_kn_lock_live+0xea/0x1d0
[ 150.930833] #4: ffffffffad5816b0 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: cgroup_procs_write_start+0x6e/0x200
[ 150.930866] #5: ffffffffad585410 (&cpuset_rwsem){++++}-{0:0}, at: cpuset_attach+0x38/0x390
[ 150.930899]
stack backtrace:
[ 150.930918] CPU: 3 PID: 1833 Comm: libvirtd Kdump: loaded Tainted: G S E 5.10.0.g3650b22-master #13
[ 150.930938] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 150.930955] Call Trace:
[ 150.930974] dump_stack+0x77/0x97
[ 150.930993] check_noncircular+0xe7/0x100
[ 150.931012] ? stack_trace_save+0x3b/0x50
[ 150.931036] ? __lock_acquire+0x149d/0x1a70
[ 150.931053] __lock_acquire+0x149d/0x1a70
[ 150.931077] lock_acquire+0x1a7/0x3b0
[ 150.931093] ? mpol_rebind_mm+0x1e/0x50
[ 150.931114] down_write+0x38/0x70
[ 150.931129] ? mpol_rebind_mm+0x1e/0x50
[ 150.931144] mpol_rebind_mm+0x1e/0x50
[ 150.931162] cpuset_attach+0x229/0x390
[ 150.931180] cgroup_migrate_execute+0x46d/0x490
[ 150.931199] ? _raw_spin_unlock_irq+0x2f/0x50
[ 150.931217] cgroup_attach_task+0x20c/0x3b0
[ 150.931245] ? __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 150.931263] __cgroup1_procs_write.constprop.21+0xf3/0x150
[ 150.931286] cgroup_file_write+0x64/0x210
[ 150.931304] kernfs_fop_write+0x117/0x1b0
[ 150.931323] vfs_write+0xe8/0x240
[ 150.931341] ksys_write+0x87/0xc0
[ 150.931357] ? lockdep_hardirqs_on+0x85/0x110
[ 150.931374] do_syscall_64+0x33/0x40
[ 150.931391] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 150.931409] RIP: 0033:0x7fcdc56a6deb
[ 150.931425] Code: 53 48 89 d5 48 89 f3 48 83 ec 18 48 89 7c 24 08 e8 5a fd ff ff 48 89 ea 41 89 c0 48 89 de 48 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 90 fd ff ff 48
[ 150.931456] RSP: 002b:00007fcdbcfdc2f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 150.931476] RAX: ffffffffffffffda RBX: 00007fcdb403ca00 RCX: 00007fcdc56a6deb
[ 150.931493] RDX: 0000000000000004 RSI: 00007fcdb403ca00 RDI: 000000000000001f
[ 150.931510] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[ 150.931526] R10: 0000000000000000 R11: 0000000000000293 R12: 00007fcdb403ca00
[ 150.931543] R13: 0000000000000000 R14: 000000000000001f R15: 0000000000000214
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [ANNOUNCE] v5.9.1-rt18
2020-10-21 13:14 ` [ANNOUNCE] v5.9.1-rt18 Sebastian Andrzej Siewior
@ 2020-10-27 6:53 ` Fernando Lopez-Lezcano
2020-10-27 8:22 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 22+ messages in thread
From: Fernando Lopez-Lezcano @ 2020-10-27 6:53 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Thomas Gleixner
Cc: nando, LKML, linux-rt-users, Steven Rostedt
On 10/21/20 6:14 AM, Sebastian Andrzej Siewior wrote:
> On 2020-10-21 14:53:27 [+0200], To Thomas Gleixner wrote:
>> Dear RT folks!
>>
>> I'm pleased to announce the v5.9.1-rt18 patch set.
Maybe I'm doing something wrong but I get a compilation error (see
below) when trying to do a debug build (building rpm packages for
Fedora). 5.9.1 + rt19...
Builds fine otherwise...
Thanks,
-- Fernando
+ make -s 'HOSTCFLAGS=-O2 -g -pipe -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions
-fstack-protector-strong -grecord-gcc-switches
-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -fcommon -m64
-mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection
-fcf-protection' 'HOSTLDFLAGS=-Wl,-z,relro -Wl,--as-needed -Wl,-z,now
-specs=/usr/lib/rpm/redhat/redhat-hardened-ld' ARCH=x86_64 KCFLAGS=
WITH_GCOV=0 -j4 modules
BUILDSTDERR: In file included from <command-line>:
BUILDSTDERR: lib/test_lockup.c: In function 'test_lockup_init':
BUILDSTDERR: lib/test_lockup.c:484:31: error: 'spinlock_t' {aka 'struct spinlock'} has no member named 'rlock'; did you mean 'lock'?
BUILDSTDERR:   484 |    offsetof(spinlock_t, rlock.magic),
BUILDSTDERR:       |                         ^~~~~
BUILDSTDERR: ././include/linux/compiler_types.h:135:57: note: in definition of macro '__compiler_offsetof'
BUILDSTDERR:   135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR:       |                                                         ^
BUILDSTDERR: lib/test_lockup.c:484:10: note: in expansion of macro 'offsetof'
BUILDSTDERR:   484 |    offsetof(spinlock_t, rlock.magic),
BUILDSTDERR:       |    ^~~~~~~~
BUILDSTDERR: ././include/linux/compiler_types.h:135:35: error: 'rwlock_t' {aka 'struct rt_rw_lock'} has no member named 'magic'
BUILDSTDERR:   135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR:       |                                   ^~~~~~~~~~~~~~~~~~
BUILDSTDERR: ./include/linux/stddef.h:17:32: note: in expansion of macro '__compiler_offsetof'
BUILDSTDERR:    17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
BUILDSTDERR:       |                                ^~~~~~~~~~~~~~~~~~~
BUILDSTDERR: lib/test_lockup.c:487:10: note: in expansion of macro 'offsetof'
BUILDSTDERR:   487 |    offsetof(rwlock_t, magic),
BUILDSTDERR:       |    ^~~~~~~~
BUILDSTDERR: lib/test_lockup.c:488:10: error: 'RWLOCK_MAGIC' undeclared (first use in this function); did you mean 'STACK_MAGIC'?
BUILDSTDERR:   488 |    RWLOCK_MAGIC) ||
BUILDSTDERR:       |    ^~~~~~~~~~~~
BUILDSTDERR:       |    STACK_MAGIC
BUILDSTDERR: lib/test_lockup.c:488:10: note: each undeclared identifier is reported only once for each function it appears in
BUILDSTDERR: In file included from <command-line>:
BUILDSTDERR: ././include/linux/compiler_types.h:135:35: error: 'struct mutex' has no member named 'wait_lock'
BUILDSTDERR:   135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR:       |                                   ^~~~~~~~~~~~~~~~~~
BUILDSTDERR: ./include/linux/stddef.h:17:32: note: in expansion of macro '__compiler_offsetof'
BUILDSTDERR:    17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
BUILDSTDERR:       |                                ^~~~~~~~~~~~~~~~~~~
BUILDSTDERR: lib/test_lockup.c:490:10: note: in expansion of macro 'offsetof'
BUILDSTDERR:   490 |    offsetof(struct mutex, wait_lock.rlock.magic),
BUILDSTDERR:       |    ^~~~~~~~
BUILDSTDERR: ././include/linux/compiler_types.h:135:35: error: 'struct rw_semaphore' has no member named 'wait_lock'
BUILDSTDERR:   135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR:       |                                   ^~~~~~~~~~~~~~~~~~
BUILDSTDERR: ./include/linux/stddef.h:17:32: note: in expansion of macro '__compiler_offsetof'
BUILDSTDERR:    17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
BUILDSTDERR:       |                                ^~~~~~~~~~~~~~~~~~~
BUILDSTDERR: lib/test_lockup.c:493:10: note: in expansion of macro 'offsetof'
BUILDSTDERR:   493 |    offsetof(struct rw_semaphore, wait_lock.magic),
BUILDSTDERR:       |    ^~~~~~~~
BUILDSTDERR: make[1]: *** [scripts/Makefile.build:283: lib/test_lockup.o] Error 1
BUILDSTDERR: make: *** [Makefile:1784: lib] Error 2
BUILDSTDERR: make: *** Waiting for unfinished jobs....
>>
>> Changes since v5.9.1-rt17:
>>
>> - Update the migrate-disable series by Peter Zijlstra to v3. Also
>> include fixes discussed in the thread.
>>
>> - UP builds did not boot since the replacement of the migrate-disable
>> code. Reported by Christian Egger. Fixed as a part of v3 by Peter
>> Zijlstra.
>>
>> - Rebase the printk code on top of the ring buffer designed for
>> printk which was merged in the v5.10 merge window. Patches by John
>> Ogness.
>>
>> Known issues
>> - It has been pointed out that due to changes to the printk code the
>> internal buffer representation changed. This is only an issue if tools
>> like `crash' are used to extract the printk buffer from a kernel memory
>> image.
>>
>> The delta patch against v5.9.1-rt17 is appended below and can be found here:
>>
>> https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/incr/patch-5.9.1-rt17-rt18.patch.xz
>>
>> You can get this release via the git tree at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git v5.9.1-rt18
>>
>> The RT patch against v5.9.1 can be found here:
>>
>> https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/older/patch-5.9.1-rt18.patch.xz
>>
>> The split quilt queue is available at:
>>
>> https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.9/older/patches-5.9.1-rt18.tar.xz
>>
>
> The attached diff was too large and the mail was dropped. It is
> available at
> https://git.kernel.org/rt/linux-rt-devel/d/v5.9.1-rt18/v5.9.1-rt17
>
> Sebastian
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [ANNOUNCE] v5.9.1-rt18
2020-10-27 6:53 ` Fernando Lopez-Lezcano
@ 2020-10-27 8:22 ` Sebastian Andrzej Siewior
2020-10-27 17:07 ` Fernando Lopez-Lezcano
0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-27 8:22 UTC (permalink / raw)
To: Fernando Lopez-Lezcano
Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt
On 2020-10-26 23:53:20 [-0700], Fernando Lopez-Lezcano wrote:
> Maybe I'm doing something wrong but I get a compilation error (see below)
> when trying to do a debug build (building rpm packages for Fedora). 5.9.1 +
> rt19...
>
> Builds fine otherwise...
If you could remove CONFIG_TEST_LOCKUP then it should work. I will think
of something.
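As a sketch, the suggested workaround amounts to a one-line config change; the relevant .config fragment (option name taken from the build log above) would be:

```
# Temporary workaround for the lib/test_lockup.c build failure on -rt:
# CONFIG_TEST_LOCKUP is not set
```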
> Thanks,
> -- Fernando
Sebastian
* Re: kvm+nouveau induced lockdep gripe
2020-10-27 6:03 ` Mike Galbraith
@ 2020-10-27 9:00 ` Sebastian Andrzej Siewior
2020-10-27 9:49 ` Mike Galbraith
2020-10-27 10:14 ` Mike Galbraith
0 siblings, 2 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-27 9:00 UTC (permalink / raw)
To: Mike Galbraith
Cc: Hillf Danton, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On 2020-10-27 07:03:38 [+0100], Mike Galbraith wrote:
> On Mon, 2020-10-26 at 20:53 +0100, Sebastian Andrzej Siewior wrote:
> >
> > Could you try this, please?
>
> Nogo, first call of sched_setscheduler() is via kthread_create(). I
> confirmed that nuking that (gratuitous user foot saving override) call
> on top of moving sched_set_fifo() does shut it up, but that won't fly.
mkay. but this then, too. Let me try to figure out when this broke.
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 3edaa380dc7b4..64d6afb127239 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -244,6 +244,7 @@ EXPORT_SYMBOL_GPL(kthread_parkme);
static int kthread(void *_create)
{
/* Copy data: it's on kthread's stack */
+ static const struct sched_param param = { .sched_priority = 0 };
struct kthread_create_info *create = _create;
int (*threadfn)(void *data) = create->threadfn;
void *data = create->data;
@@ -273,6 +274,13 @@ static int kthread(void *_create)
init_completion(&self->parked);
current->vfork_done = &self->exited;
+ /*
+ * root may have changed our (kthreadd's) priority or CPU mask.
+ * The kernel thread should not inherit these properties.
+ */
+ sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
+ set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_KTHREAD));
+
/* OK, tell user we're spawned, wait for stop or wakeup */
__set_current_state(TASK_UNINTERRUPTIBLE);
create->result = current;
@@ -370,7 +378,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
}
task = create->result;
if (!IS_ERR(task)) {
- static const struct sched_param param = { .sched_priority = 0 };
char name[TASK_COMM_LEN];
/*
@@ -379,13 +386,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
*/
vsnprintf(name, sizeof(name), namefmt, args);
set_task_comm(task, name);
- /*
- * root may have changed our (kthreadd's) priority or CPU mask.
- * The kernel thread should not inherit these properties.
- */
- sched_setscheduler_nocheck(task, SCHED_NORMAL, &param);
- set_cpus_allowed_ptr(task,
- housekeeping_cpumask(HK_FLAG_KTHREAD));
}
kfree(create);
return task;
Sebastian
* Re: kvm+nouveau induced lockdep gripe
2020-10-27 9:00 ` Sebastian Andrzej Siewior
@ 2020-10-27 9:49 ` Mike Galbraith
2020-10-27 10:14 ` Mike Galbraith
1 sibling, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2020-10-27 9:49 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Hillf Danton, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On Tue, 2020-10-27 at 10:00 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-10-27 07:03:38 [+0100], Mike Galbraith wrote:
> > On Mon, 2020-10-26 at 20:53 +0100, Sebastian Andrzej Siewior wrote:
> > >
> > > Could you try this, please?
> >
> > Nogo, first call of sched_setscheduler() is via kthread_create(). I
> > confirmed that nuking that (gratuitous user foot saving override) call
> > on top of moving sched_set_fifo() does shut it up, but that won't fly.
>
> mkay. but this then, too.
Yup, might even fly.
> Let me try to figure out when this broke.
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 3edaa380dc7b4..64d6afb127239 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -244,6 +244,7 @@ EXPORT_SYMBOL_GPL(kthread_parkme);
> static int kthread(void *_create)
> {
> /* Copy data: it's on kthread's stack */
> + static const struct sched_param param = { .sched_priority = 0 };
> struct kthread_create_info *create = _create;
> int (*threadfn)(void *data) = create->threadfn;
> void *data = create->data;
> @@ -273,6 +274,13 @@ static int kthread(void *_create)
> init_completion(&self->parked);
> current->vfork_done = &self->exited;
>
> + /*
> + * root may have changed our (kthreadd's) priority or CPU mask.
> + * The kernel thread should not inherit these properties.
> + */
> + sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
> + set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_KTHREAD));
> +
> /* OK, tell user we're spawned, wait for stop or wakeup */
> __set_current_state(TASK_UNINTERRUPTIBLE);
> create->result = current;
> @@ -370,7 +378,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
> }
> task = create->result;
> if (!IS_ERR(task)) {
> - static const struct sched_param param = { .sched_priority = 0 };
> char name[TASK_COMM_LEN];
>
> /*
> @@ -379,13 +386,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
> */
> vsnprintf(name, sizeof(name), namefmt, args);
> set_task_comm(task, name);
> - /*
> - * root may have changed our (kthreadd's) priority or CPU mask.
> - * The kernel thread should not inherit these properties.
> - */
> - sched_setscheduler_nocheck(task, SCHED_NORMAL, &param);
> - set_cpus_allowed_ptr(task,
> - housekeeping_cpumask(HK_FLAG_KTHREAD));
> }
> kfree(create);
> return task;
>
> Sebastian
* Re: kvm+nouveau induced lockdep gripe
2020-10-27 9:00 ` Sebastian Andrzej Siewior
2020-10-27 9:49 ` Mike Galbraith
@ 2020-10-27 10:14 ` Mike Galbraith
2020-10-27 10:18 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 22+ messages in thread
From: Mike Galbraith @ 2020-10-27 10:14 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Hillf Danton, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On Tue, 2020-10-27 at 10:00 +0100, Sebastian Andrzej Siewior wrote:
> Let me try to figure out when this broke.
My money is on...
710da3c8ea7df (Juri Lelli 2019-07-19 16:00:00 +0200 5317) if (pi)
710da3c8ea7df (Juri Lelli 2019-07-19 16:00:00 +0200 5318) cpuset_read_lock();
...having just had an unnoticed consequence for nouveau.
-Mike
* Re: kvm+nouveau induced lockdep gripe
2020-10-27 10:14 ` Mike Galbraith
@ 2020-10-27 10:18 ` Sebastian Andrzej Siewior
2020-10-27 11:13 ` Mike Galbraith
0 siblings, 1 reply; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-27 10:18 UTC (permalink / raw)
To: Mike Galbraith
Cc: Hillf Danton, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On 2020-10-27 11:14:34 [+0100], Mike Galbraith wrote:
> On Tue, 2020-10-27 at 10:00 +0100, Sebastian Andrzej Siewior wrote:
> > Let me try to figure out when this broke.
>
> My money is on...
> 710da3c8ea7df (Juri Lelli 2019-07-19 16:00:00 +0200 5317) if (pi)
> 710da3c8ea7df (Juri Lelli 2019-07-19 16:00:00 +0200 5318) cpuset_read_lock();
> ...having just had an unnoticed consequence for nouveau.
but that is over a year old and should have been noticed in v5.4-RT.
> -Mike
Sebastian
* Re: kvm+nouveau induced lockdep gripe
2020-10-27 10:18 ` Sebastian Andrzej Siewior
@ 2020-10-27 11:13 ` Mike Galbraith
0 siblings, 0 replies; 22+ messages in thread
From: Mike Galbraith @ 2020-10-27 11:13 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Hillf Danton, Thomas Gleixner, LKML, linux-rt-users,
Steven Rostedt, Ben Skeggs
On Tue, 2020-10-27 at 11:18 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-10-27 11:14:34 [+0100], Mike Galbraith wrote:
> > On Tue, 2020-10-27 at 10:00 +0100, Sebastian Andrzej Siewior wrote:
> > > Let me try to figure out when this broke.
> >
> > My money is on...
> > 710da3c8ea7df (Juri Lelli 2019-07-19 16:00:00 +0200 5317) if (pi)
> > 710da3c8ea7df (Juri Lelli 2019-07-19 16:00:00 +0200 5318) cpuset_read_lock();
> > ...having just had an unnoticed consequence for nouveau.
>
> but that is over a year old and should have been noticed in v5.4-RT.
Yup. Dang lazy sod nouveau users haven't been doing their broad
spectrum RT or threadirqs testing. This one hasn't anyway ;-)
[ 73.087508] ======================================================
[ 73.087508] WARNING: possible circular locking dependency detected
[ 73.087509] 5.4.72-rt40-rt #1 Tainted: G E
[ 73.087510] ------------------------------------------------------
[ 73.087510] libvirtd/1902 is trying to acquire lock:
[ 73.087511] ffff8b0f54b2f8d8 (&mm->mmap_sem#2){++++}, at: mpol_rebind_mm+0x1e/0x50
[ 73.087517] but task is already holding lock:
[ 73.087518] ffffffff9d2a0430 (&cpuset_rwsem){++++}, at: cpuset_attach+0x38/0x390
[ 73.087520] which lock already depends on the new lock.
[ 73.087521] the existing dependency chain (in reverse order) is:
[ 73.087521] -> #3 (&cpuset_rwsem){++++}:
[ 73.087523] cpuset_read_lock+0x39/0x100
[ 73.087524] __sched_setscheduler+0x476/0xb00
[ 73.087526] _sched_setscheduler+0x69/0x70
[ 73.087527] __kthread_create_on_node+0x122/0x180
[ 73.087531] kthread_create_on_node+0x37/0x40
[ 73.087532] setup_irq_thread+0x3c/0xa0
[ 73.087533] __setup_irq+0x4c3/0x760
[ 73.087535] request_threaded_irq+0xf8/0x160
[ 73.087535] nvkm_pci_oneinit+0x4c/0x70 [nouveau]
[ 73.087569] nvkm_subdev_init+0x60/0x1e0 [nouveau]
[ 73.087586] nvkm_device_init+0x10b/0x240 [nouveau]
[ 73.087611] nvkm_udevice_init+0x47/0x70 [nouveau]
[ 73.087636] nvkm_object_init+0x3d/0x180 [nouveau]
[ 73.087652] nvkm_ioctl_new+0x1a1/0x260 [nouveau]
[ 73.087667] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 73.087682] nvif_object_init+0xbf/0x110 [nouveau]
[ 73.087697] nvif_device_init+0xe/0x50 [nouveau]
[ 73.087713] nouveau_cli_init+0x1ce/0x5a0 [nouveau]
[ 73.087739] nouveau_drm_device_init+0x54/0x7e0 [nouveau]
[ 73.087765] nouveau_drm_probe+0x1da/0x330 [nouveau]
[ 73.087791] local_pci_probe+0x42/0x90
[ 73.087793] pci_device_probe+0xe7/0x1a0
[ 73.087794] really_probe+0xf7/0x460
[ 73.087796] driver_probe_device+0x5d/0x130
[ 73.087797] device_driver_attach+0x4f/0x60
[ 73.087798] __driver_attach+0xa2/0x140
[ 73.087798] bus_for_each_dev+0x67/0x90
[ 73.087800] bus_add_driver+0x192/0x230
[ 73.087801] driver_register+0x5b/0xf0
[ 73.087801] do_one_initcall+0x56/0x3c4
[ 73.087803] do_init_module+0x5b/0x21d
[ 73.087805] load_module+0x1cd4/0x2340
[ 73.087806] __do_sys_finit_module+0xa7/0xe0
[ 73.087807] do_syscall_64+0x6c/0x270
[ 73.087808] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 73.087810] -> #2 (&device->mutex){+.+.}:
[ 73.087811] _mutex_lock+0x28/0x40
[ 73.087813] nvkm_udevice_fini+0x21/0x70 [nouveau]
[ 73.087839] nvkm_object_fini+0xb8/0x210 [nouveau]
[ 73.087854] nvkm_object_fini+0x73/0x210 [nouveau]
[ 73.087869] nvkm_ioctl_del+0x7e/0xa0 [nouveau]
[ 73.087884] nvkm_ioctl+0x10a/0x240 [nouveau]
[ 73.087898] nvif_object_fini+0x49/0x60 [nouveau]
[ 73.087914] nvif_client_fini+0xe/0x40 [nouveau]
[ 73.087930] nouveau_cli_fini+0x78/0x90 [nouveau]
[ 73.087955] nouveau_drm_postclose+0xa3/0xd0 [nouveau]
[ 73.087981] drm_file_free.part.5+0x20c/0x2c0 [drm]
[ 73.087991] drm_release+0x4b/0x80 [drm]
[ 73.087997] __fput+0xd5/0x280
[ 73.087999] task_work_run+0x87/0xb0
[ 73.088001] exit_to_usermode_loop+0x13b/0x160
[ 73.088002] do_syscall_64+0x1be/0x270
[ 73.088003] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 73.088004] -> #1 (&cli->lock){+.+.}:
[ 73.088006] _mutex_lock+0x28/0x40
[ 73.088007] nouveau_mem_fini+0x4a/0x70 [nouveau]
[ 73.088033] nv04_sgdma_unbind+0xe/0x20 [nouveau]
[ 73.088058] ttm_tt_unbind+0x1d/0x30 [ttm]
[ 73.088061] ttm_tt_destroy+0x13/0x60 [ttm]
[ 73.088063] ttm_bo_cleanup_memtype_use+0x32/0x80 [ttm]
[ 73.088066] ttm_bo_release+0x264/0x460 [ttm]
[ 73.088068] ttm_bo_vm_close+0x15/0x30 [ttm]
[ 73.088070] remove_vma+0x3e/0x70
[ 73.088072] __do_munmap+0x2d7/0x510
[ 73.088073] __vm_munmap+0x5b/0xa0
[ 73.088074] __x64_sys_munmap+0x27/0x30
[ 73.088075] do_syscall_64+0x6c/0x270
[ 73.088076] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 73.088077] -> #0 (&mm->mmap_sem#2){++++}:
[ 73.088078] __lock_acquire+0x113f/0x1410
[ 73.088080] lock_acquire+0x93/0x230
[ 73.088081] down_write+0x3b/0x50
[ 73.088082] mpol_rebind_mm+0x1e/0x50
[ 73.088083] cpuset_attach+0x229/0x390
[ 73.088084] cgroup_migrate_execute+0x42c/0x450
[ 73.088086] cgroup_attach_task+0x267/0x3f0
[ 73.088086] __cgroup1_procs_write.constprop.20+0xe8/0x140
[ 73.088088] cgroup_file_write+0x7e/0x1a0
[ 73.088089] kernfs_fop_write+0x113/0x1b0
[ 73.088091] vfs_write+0xc1/0x1d0
[ 73.088092] ksys_write+0x87/0xc0
[ 73.088093] do_syscall_64+0x6c/0x270
[ 73.088094] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 73.088095] other info that might help us debug this:
[ 73.088095] Chain exists of:
  &mm->mmap_sem#2 --> &device->mutex --> &cpuset_rwsem
[ 73.088097] Possible unsafe locking scenario:
[ 73.088097] CPU0 CPU1
[ 73.088098] ---- ----
[ 73.088098] lock(&cpuset_rwsem);
[ 73.088099] lock(&device->mutex);
[ 73.088099] lock(&cpuset_rwsem);
[ 73.088100] lock(&mm->mmap_sem#2);
[ 73.088101] *** DEADLOCK ***
[ 73.088101] 6 locks held by libvirtd/1902:
[ 73.088102] #0: ffff8b0f771f1ba0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x46/0x50
[ 73.088105] #1: ffff8b0cc7790658 (sb_writers#7){.+.+}, at: vfs_write+0x1af/0x1d0
[ 73.088107] #2: ffff8b0fa42762b8 (&of->mutex){+.+.}, at: kernfs_fop_write+0xde/0x1b0
[ 73.088110] #3: ffffffff9d29c578 (cgroup_mutex){+.+.}, at: cgroup_kn_lock_live+0xed/0x1d0
[ 73.088112] #4: ffffffff9d29c1b0 (cgroup_threadgroup_rwsem){++++}, at: cgroup_procs_write_start+0x4c/0x190
[ 73.088114] #5: ffffffff9d2a0430 (&cpuset_rwsem){++++}, at: cpuset_attach+0x38/0x390
[ 73.088115] stack backtrace:
[ 73.088116] CPU: 6 PID: 1902 Comm: libvirtd Kdump: loaded Tainted: G E 5.4.72-rt40-rt #1
[ 73.088117] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 73.088118] Call Trace:
[ 73.088120] dump_stack+0x71/0x9b
[ 73.088122] check_noncircular+0x155/0x170
[ 73.088125] ? __lock_acquire+0x113f/0x1410
[ 73.088127] __lock_acquire+0x113f/0x1410
[ 73.088129] lock_acquire+0x93/0x230
[ 73.088130] ? mpol_rebind_mm+0x1e/0x50
[ 73.088133] down_write+0x3b/0x50
[ 73.088134] ? mpol_rebind_mm+0x1e/0x50
[ 73.088135] mpol_rebind_mm+0x1e/0x50
[ 73.088137] cpuset_attach+0x229/0x390
[ 73.088138] cgroup_migrate_execute+0x42c/0x450
[ 73.088140] ? rt_spin_unlock+0x5b/0xa0
[ 73.088142] cgroup_attach_task+0x267/0x3f0
[ 73.088145] ? __cgroup1_procs_write.constprop.20+0xe8/0x140
[ 73.088146] __cgroup1_procs_write.constprop.20+0xe8/0x140
[ 73.088148] cgroup_file_write+0x7e/0x1a0
[ 73.088150] kernfs_fop_write+0x113/0x1b0
[ 73.088152] vfs_write+0xc1/0x1d0
[ 73.088153] ksys_write+0x87/0xc0
[ 73.088155] do_syscall_64+0x6c/0x270
[ 73.088156] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 73.088157] RIP: 0033:0x7f6be8550deb
[ 73.088159] Code: 53 48 89 d5 48 89 f3 48 83 ec 18 48 89 7c 24 08 e8 5a fd ff ff 48 89 ea 41 89 c0 48 89 de 48 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 90 fd ff ff 48
[ 73.088160] RSP: 002b:00007f6bde6832f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 73.088161] RAX: ffffffffffffffda RBX: 00007f6bc80388b0 RCX: 00007f6be8550deb
[ 73.088161] RDX: 0000000000000004 RSI: 00007f6bc80388b0 RDI: 000000000000001f
[ 73.088162] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[ 73.088162] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f6bc80388b0
[ 73.088163] R13: 0000000000000000 R14: 000000000000001f R15: 0000000000000214
* Re: [ANNOUNCE] v5.9.1-rt18
2020-10-27 8:22 ` Sebastian Andrzej Siewior
@ 2020-10-27 17:07 ` Fernando Lopez-Lezcano
2020-10-28 20:24 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 22+ messages in thread
From: Fernando Lopez-Lezcano @ 2020-10-27 17:07 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Fernando Lopez-Lezcano
Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt
On 10/27/20 1:22 AM, Sebastian Andrzej Siewior wrote:
> On 2020-10-26 23:53:20 [-0700], Fernando Lopez-Lezcano wrote:
>> Maybe I'm doing something wrong but I get a compilation error (see below)
>> when trying to do a debug build (building rpm packages for Fedora). 5.9.1 +
>> rt19...
>>
>> Builds fine otherwise...
>
> If you could remove CONFIG_TEST_LOCKUP then it should work. I will think
> of something.
Thanks much, I should have figured this out for myself :-( Just toooo
busy. The compilation process went ahead (not finished yet), let me know
if there is a proper patch. No hurry...
Thanks!
-- Fernando
* Re: [ANNOUNCE] v5.9.1-rt18
2020-10-27 17:07 ` Fernando Lopez-Lezcano
@ 2020-10-28 20:24 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 22+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-10-28 20:24 UTC (permalink / raw)
To: Fernando Lopez-Lezcano
Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt
On 2020-10-27 10:07:35 [-0700], Fernando Lopez-Lezcano wrote:
> The compilation process went ahead (not finished yet), let me know if there
> is a proper patch. No hurry...
I just released -rt20 and it compiles now. I looked at the code and I
wouldn't recommend using it unless you know exactly what you are doing.
> Thanks!
> -- Fernando
Sebastian
end of thread, other threads:[~2020-10-29 0:35 UTC | newest]
Thread overview: 22+ messages
[not found] <20201021125324.ualpvrxvzyie6d7d@linutronix.de>
2020-10-21 13:14 ` [ANNOUNCE] v5.9.1-rt18 Sebastian Andrzej Siewior
2020-10-27 6:53 ` Fernando Lopez-Lezcano
2020-10-27 8:22 ` Sebastian Andrzej Siewior
2020-10-27 17:07 ` Fernando Lopez-Lezcano
2020-10-28 20:24 ` Sebastian Andrzej Siewior
2020-10-22 5:21 ` ltp or kvm triggerable lockdep alloc_pid() deadlock gripe Mike Galbraith
2020-10-22 16:44 ` Sebastian Andrzej Siewior
2020-10-22 5:28 ` kvm+nouveau induced lockdep gripe Mike Galbraith
2020-10-23 9:01 ` Sebastian Andrzej Siewior
2020-10-23 12:07 ` Mike Galbraith
[not found] ` <20201024022236.19608-1-hdanton@sina.com>
2020-10-24 3:38 ` Mike Galbraith
[not found] ` <20201024050000.8104-1-hdanton@sina.com>
2020-10-24 5:25 ` Mike Galbraith
[not found] ` <20201024094224.2804-1-hdanton@sina.com>
2020-10-26 17:26 ` Sebastian Andrzej Siewior
2020-10-26 17:31 ` Sebastian Andrzej Siewior
2020-10-26 19:15 ` Mike Galbraith
2020-10-26 19:53 ` Sebastian Andrzej Siewior
2020-10-27 6:03 ` Mike Galbraith
2020-10-27 9:00 ` Sebastian Andrzej Siewior
2020-10-27 9:49 ` Mike Galbraith
2020-10-27 10:14 ` Mike Galbraith
2020-10-27 10:18 ` Sebastian Andrzej Siewior
2020-10-27 11:13 ` Mike Galbraith