linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* INFO: task hung in locks_remove_posix
@ 2020-05-20 20:53 syzbot
  2020-05-21 14:09 ` Tetsuo Handa
  2020-05-25  0:14 ` syzbot
  0 siblings, 2 replies; 9+ messages in thread
From: syzbot @ 2020-05-20 20:53 UTC (permalink / raw)
  To: andreyknvl, bfields, jlayton, linux-fsdevel, linux-kernel,
	linux-usb, syzkaller-bugs, viro

Hello,

syzbot found the following crash on:

HEAD commit:    806d8acc USB: dummy-hcd: use configurable endpoint naming ..
git tree:       https://github.com/google/kasan.git usb-fuzzer
console output: https://syzkaller.appspot.com/x/log.txt?x=16c9ece2100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d800e9bad158025f
dashboard link: https://syzkaller.appspot.com/bug?extid=f5bc30abd8916982419c
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+f5bc30abd8916982419c@syzkaller.appspotmail.com

INFO: task syz-executor.2:3145 blocked for more than 143 seconds.
      Not tainted 5.7.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2  D28552  3145    370 0x80004006
Call Trace:
 context_switch kernel/sched/core.c:3367 [inline]
 __schedule+0x892/0x1d80 kernel/sched/core.c:4083
 locks_remove_posix+0x277/0x4e0 fs/locks.c:2706
 __sched_text_start+0x8/0x8
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 prepare_to_wait_event+0x129/0x650 kernel/sched/wait.c:305
 schedule+0xcd/0x2b0 kernel/sched/core.c:4158
 wdm_flush+0x2ea/0x3c0 drivers/usb/class/cdc-wdm.c:590
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 wait_for_completion+0x280/0x280
 finish_wait+0x260/0x260 include/linux/list.h:301
 task_work_add+0x97/0x120 kernel/task_work.c:35
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 filp_close+0xb4/0x170 fs/open.c:1251
 close_files fs/file.c:388 [inline]
 put_files_struct fs/file.c:416 [inline]
 put_files_struct+0x1d8/0x2e0 fs/file.c:413
 exit_files+0x7e/0xa0 fs/file.c:445
 do_exit+0xb36/0x2c80 kernel/exit.c:791
 find_held_lock+0x2d/0x110 kernel/locking/lockdep.c:4458
 mm_update_next_owner+0x7a0/0x7a0 kernel/exit.c:375
 lock_downgrade+0x720/0x720 kernel/locking/lockdep.c:4599
 do_group_exit+0x125/0x340 kernel/exit.c:894
 get_signal+0x480/0x2480 kernel/signal.c:2739
 schedule_timeout_idle+0x80/0x80 kernel/time/timer.c:1942
 do_signal+0x88/0x1ae0 arch/x86/kernel/signal.c:784
 free_object+0x5/0x70 lib/debugobjects.c:429
 destroy_hrtimer_on_stack kernel/time/hrtimer.c:453 [inline]
 hrtimer_nanosleep+0x211/0x3a0 kernel/time/hrtimer.c:1947
 nanosleep_copyout+0x100/0x100 kernel/time/hrtimer.c:1861
 force_valid_ss arch/x86/kernel/signal.c:73 [inline]
 restore_sigcontext+0x620/0x620 arch/x86/kernel/signal.c:134
 hrtimer_init_sleeper_on_stack+0x90/0x90 kernel/time/hrtimer.c:1833
 put_old_itimerspec32+0x1d0/0x1d0 kernel/time/time.c:908
 __do_sys_clock_gettime kernel/time/posix-timers.c:1094 [inline]
 __se_sys_clock_gettime kernel/time/posix-timers.c:1082 [inline]
 __x64_sys_clock_gettime+0x154/0x240 kernel/time/posix-timers.c:1082
 __do_sys_nanosleep kernel/time/hrtimer.c:1962 [inline]
 __se_sys_nanosleep kernel/time/hrtimer.c:1953 [inline]
 __x64_sys_nanosleep+0x1ed/0x260 kernel/time/hrtimer.c:1953
 hrtimer_nanosleep+0x3a0/0x3a0 kernel/time/hrtimer.c:1943
 exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x4e0/0x5a0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
 trace_hardirqs_off_caller+0x2b/0x200 kernel/trace/trace_preemptirq.c:67

Showing all locks held in the system:
1 lock held by khungtaskd/23:
 #0: ffffffff87111260 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x264 kernel/locking/lockdep.c:5754
1 lock held by in:imklog/267:
 #0: ffff8881d976d9f0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 23 Comm: khungtaskd Not tainted 5.7.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xef/0x16e lib/dump_stack.c:118
 irq_force_complete_move.cold+0x13/0x47 arch/x86/kernel/apic/vector.c:1023
 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
 lapic_can_unplug_cpu.cold+0x3b/0x3b
 nmi_trigger_cpumask_backtrace+0x1db/0x207 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
 watchdog+0xa99/0xfd0 kernel/hung_task.c:289
 reset_hung_task_detector+0x30/0x30 kernel/hung_task.c:243
 kthread+0x326/0x430 kernel/kthread.c:268
 kthread_create_on_node+0xf0/0xf0 kernel/kthread.c:405
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:351
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline]
NMI backtrace for cpu 1 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline]
NMI backtrace for cpu 1 skipped: idling at default_idle+0x28/0x300 arch/x86/kernel/process.c:697


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: task hung in locks_remove_posix
  2020-05-20 20:53 INFO: task hung in locks_remove_posix syzbot
@ 2020-05-21 14:09 ` Tetsuo Handa
  2020-05-21 14:21   ` Dmitry Vyukov
  2020-05-21 14:48   ` INFO: task hung in locks_remove_posix Jeff Layton
  2020-05-25  0:14 ` syzbot
  1 sibling, 2 replies; 9+ messages in thread
From: Tetsuo Handa @ 2020-05-21 14:09 UTC (permalink / raw)
  To: syzbot, syzkaller-bugs; +Cc: linux-kernel

On 2020/05/21 5:53, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    806d8acc USB: dummy-hcd: use configurable endpoint naming ..
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> console output: https://syzkaller.appspot.com/x/log.txt?x=16c9ece2100000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=d800e9bad158025f
> dashboard link: https://syzkaller.appspot.com/bug?extid=f5bc30abd8916982419c
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Unfortunately, I don't have any reproducer for this crash yet.

This seems to be a mislabeling due to '?' in all lines in a trace.

#syz dup: INFO: task hung in wdm_flush

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: task hung in locks_remove_posix
  2020-05-21 14:09 ` Tetsuo Handa
@ 2020-05-21 14:21   ` Dmitry Vyukov
  2020-05-21 23:34     ` [5.7-rc5 x86 regression] ORC unwinder generates unreliable traces Tetsuo Handa
  2020-05-21 14:48   ` INFO: task hung in locks_remove_posix Jeff Layton
  1 sibling, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2020-05-21 14:21 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: syzbot, syzkaller-bugs, LKML

On Thu, May 21, 2020 at 4:09 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/05/21 5:53, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    806d8acc USB: dummy-hcd: use configurable endpoint naming ..
> > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16c9ece2100000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=d800e9bad158025f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=f5bc30abd8916982419c
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
>
> This seems to be a mislabeling due to '?' in all lines in a trace.
>
> #syz dup: INFO: task hung in wdm_flush

It seems that unwind from __schedule is broken. Fixing the unwinder
may be the action item here, otherwise we will get an infinite number
of such reports.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: task hung in locks_remove_posix
  2020-05-21 14:09 ` Tetsuo Handa
  2020-05-21 14:21   ` Dmitry Vyukov
@ 2020-05-21 14:48   ` Jeff Layton
  2020-05-21 16:27     ` Andrey Konovalov
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2020-05-21 14:48 UTC (permalink / raw)
  To: Tetsuo Handa, syzbot, syzkaller-bugs; +Cc: linux-kernel

On Thu, 2020-05-21 at 23:09 +0900, Tetsuo Handa wrote:
> On 2020/05/21 5:53, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    806d8acc USB: dummy-hcd: use configurable endpoint naming ..
> > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16c9ece2100000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=d800e9bad158025f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=f5bc30abd8916982419c
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > 
> > Unfortunately, I don't have any reproducer for this crash yet.
> 
> This seems to be a mislabeling due to '?' in all lines in a trace.
> 
> #syz dup: INFO: task hung in wdm_flush


I'm not sure I trust those stack traces. The console output shows '?'
characters in front of every frame. Doesn't that mean that that address
it found on the stack is unreliable?

In principle, unless you're overriding the filp->lock operation (and the
wdm fs doesn't do that, afaict), locks_remove_posix should not block.
I'll also note that there is some of this in the logs before the hung
task warnings:

[  182.020388][   T12] usb 5-1: too many endpoints for config 0 interface 107 altsetting 116: 116, using maximum allowed: 30
[  182.031661][   T12] usb 5-1: config 0 interface 107 altsetting 116 has 0 endpoint descriptors, different from the interface descriptor's value: 116
[  182.045145][   T12] usb 5-1: config 0 interface 107 has no altsetting 0
[  182.052028][   T12] usb 5-1: New USB device found, idVendor=0926, idProduct=3333, bcdDevice= 0.40
[  182.060120][ T3525] usb 6-1: USB disconnect, device number 20
[  182.061148][    C0] xpad 6-1:0.65: xpad_irq_out - usb_submit_urb failed with result -19
[  182.075465][ T3525] xpad 6-1:0.65: xpad_try_sending_next_out_packet - usb_submit_urb failed with result -19
[  182.075565][   T12] usb 5-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[  182.109020][   T12] usb 5-1: config 0 descriptor??
[  182.136857][  T163] usb usb2-port1: attempt power cycle
[  182.410396][ T4447] udc-core: couldn't find an available UDC or it's busy
[  182.417562][ T4447] misc raw-gadget: fail, usb_gadget_probe_driver returned -16
[  182.856513][  T163] usb 2-1: new high-speed USB device number 18 using dummy_hcd
[  183.026601][  T163] usb 2-1: device descriptor read/8, error -61
[  183.236577][  T163] usb 2-1: device descriptor read/8, error -71
[  184.068991][ T3525] usb 5-1: USB disconnect, device number 21
[  206.185571][   T23] INFO: task syz-executor.2:3145 blocked for more than 143 seconds.
[  206.193630][   T23]       Not tainted 5.7.0-rc5-syzkaller #0
[  206.199512][   T23] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  206.208242][   T23] syz-executor.2  D28552  3145    370 0x80004006

...which leads me to believe that this might have more to do with the
USB subsystem than anything in the posix locking code.

In any case, I doubt there's much we can do here without a more reliable
stack trace to work from. That call stack doesn't seem to make much
sense.

-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: INFO: task hung in locks_remove_posix
  2020-05-21 14:48   ` INFO: task hung in locks_remove_posix Jeff Layton
@ 2020-05-21 16:27     ` Andrey Konovalov
  0 siblings, 0 replies; 9+ messages in thread
From: Andrey Konovalov @ 2020-05-21 16:27 UTC (permalink / raw)
  To: Jeff Layton, Tetsuo Handa; +Cc: syzbot, syzkaller-bugs, LKML

On Thu, May 21, 2020 at 4:48 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Thu, 2020-05-21 at 23:09 +0900, Tetsuo Handa wrote:
> > On 2020/05/21 5:53, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    806d8acc USB: dummy-hcd: use configurable endpoint naming ..
> > > git tree:       https://github.com/google/kasan.git usb-fuzzer
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=16c9ece2100000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=d800e9bad158025f
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=f5bc30abd8916982419c
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > This seems to be a mislabeling due to '?' in all lines in a trace.
> >
> > #syz dup: INFO: task hung in wdm_flush
>
>
> I'm not sure I trust those stack traces. The console output shows '?'
> characters in front of every frame. Doesn't that mean that that address
> it found on the stack is unreliable?
>
> In principle, unless you're overriding the filp->lock operation (and the
> wdm fs doesn't do that, afaict), locks_remove_posix should not block.
> I'll also note that there is some of this in the logs before the hung
> task warnings:
>
> [  182.020388][   T12] usb 5-1: too many endpoints for config 0 interface 107 altsetting 116: 116, using maximum allowed: 30
> [  182.031661][   T12] usb 5-1: config 0 interface 107 altsetting 116 has 0 endpoint descriptors, different from the interface descriptor's value: 116
> [  182.045145][   T12] usb 5-1: config 0 interface 107 has no altsetting 0
> [  182.052028][   T12] usb 5-1: New USB device found, idVendor=0926, idProduct=3333, bcdDevice= 0.40
> [  182.060120][ T3525] usb 6-1: USB disconnect, device number 20
> [  182.061148][    C0] xpad 6-1:0.65: xpad_irq_out - usb_submit_urb failed with result -19
> [  182.075465][ T3525] xpad 6-1:0.65: xpad_try_sending_next_out_packet - usb_submit_urb failed with result -19
> [  182.075565][   T12] usb 5-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
> [  182.109020][   T12] usb 5-1: config 0 descriptor??
> [  182.136857][  T163] usb usb2-port1: attempt power cycle
> [  182.410396][ T4447] udc-core: couldn't find an available UDC or it's busy
> [  182.417562][ T4447] misc raw-gadget: fail, usb_gadget_probe_driver returned -16
> [  182.856513][  T163] usb 2-1: new high-speed USB device number 18 using dummy_hcd
> [  183.026601][  T163] usb 2-1: device descriptor read/8, error -61
> [  183.236577][  T163] usb 2-1: device descriptor read/8, error -71
> [  184.068991][ T3525] usb 5-1: USB disconnect, device number 21
> [  206.185571][   T23] INFO: task syz-executor.2:3145 blocked for more than 143 seconds.
> [  206.193630][   T23]       Not tainted 5.7.0-rc5-syzkaller #0
> [  206.199512][   T23] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  206.208242][   T23] syz-executor.2  D28552  3145    370 0x80004006
>
> ...which leads me to believe that this might have more to do with the
> USB subsystem than anything in the posix locking code.
>
> In any case, I doubt there's much we can do here without a more reliable
> stack trace to work from. That call stack doesn't seem to make much
> sense.

Most likely this is a duplicate of:

https://syzkaller.appspot.com/bug?id=e7b761593b23eb50855b9ea31e3be5472b711186

(with more than 30000 crashes now :)

#syz dup: INFO: task hung in wdm_flush

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [5.7-rc5 x86 regression] ORC unwinder generates unreliable traces
  2020-05-21 14:21   ` Dmitry Vyukov
@ 2020-05-21 23:34     ` Tetsuo Handa
       [not found]       ` <54652cf1-ca04-c3ec-a2fe-d0f47484fb5f@i-love.sakura.ne.jp>
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2020-05-21 23:34 UTC (permalink / raw)
  To: Thomas Gleixner, x86; +Cc: Dmitry Vyukov, LKML, Linus Torvalds

Hello.

syzbot is troubled by CONFIG_UNWINDER_ORC=y .

As of commit d2f8825ab78e4c18 ("Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost"),
sched_show_task() is defunctional (all threads' traces are marked as 'unreliable').

[   43.975229] sysrq: Show State
[   43.978609]   task                        PC stack   pid father
[   43.981856] systemd         S11664     1      0 0x00000000
[   43.984118] Call Trace:
[   43.985626]  ? __schedule+0x27d/0x770
[   43.987738]  ? schedule+0x5a/0xd0
[   43.989509]  ? schedule_hrtimeout_range_clock+0x11c/0x130
[   43.991702]  ? ep_poll+0x461/0x4f0
[   43.993384]  ? _raw_write_unlock_irq+0x1f/0x30
[   43.995343]  ? ep_poll+0x472/0x4f0
[   43.997053]  ? wait_woken+0xa0/0xa0
[   43.998763]  ? do_epoll_wait+0xcd/0xf0
[   44.000494]  ? __x64_sys_epoll_wait+0x15/0x20
[   44.002385]  ? do_syscall_64+0x45/0x1d0
[   44.004158]  ? entry_SYSCALL_64_after_hwframe+0x49/0xb3

Since v5.7-rc5 is bad and v5.7-rc4 is good, the culprit is commit c14cab2688d09b85
("Merge tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
which contains

    Pull x86 fixes from Thomas Gleixner:
     "A set of fixes for x86:
(...snipped...)
       - A series of fixes for objtool and the ORC unwinder addressing
         variety of small issues:

in the log.

----------
Kernel 5.7.0-rc6-00069-gd2f8825ab78e on an x86_64

localhost login: [   43.975229][    C6] sysrq: Show State
[   43.978609][    C6]   task                        PC stack   pid father
[   43.981856][    C6] systemd         S11664     1      0 0x00000000
[   43.984118][    C6] Call Trace:
[   43.985626][    C6]  ? __schedule+0x27d/0x770
[   43.987738][    C6]  ? schedule+0x5a/0xd0
[   43.989509][    C6]  ? schedule_hrtimeout_range_clock+0x11c/0x130
[   43.991702][    C6]  ? ep_poll+0x461/0x4f0
[   43.993384][    C6]  ? _raw_write_unlock_irq+0x1f/0x30
[   43.995343][    C6]  ? ep_poll+0x472/0x4f0
[   43.997053][    C6]  ? wait_woken+0xa0/0xa0
[   43.998763][    C6]  ? do_epoll_wait+0xcd/0xf0
[   44.000494][    C6]  ? __x64_sys_epoll_wait+0x15/0x20
[   44.002385][    C6]  ? do_syscall_64+0x45/0x1d0
[   44.004158][    C6]  ? entry_SYSCALL_64_after_hwframe+0x49/0xb3

----------
Kernel 5.7.0-rc5 on an x86_64

localhost login: [   33.662344][    C6] sysrq: Show State
[   33.663434][    C6]   task                        PC stack   pid father
[   33.665144][    C6] systemd         S11968     1      0 0x00000000
[   33.666724][    C6] Call Trace:
[   33.667603][    C6]  ? __schedule+0x27d/0x770
[   33.668692][    C6]  ? schedule+0x5a/0xd0
[   33.669834][    C6]  ? schedule_hrtimeout_range_clock+0x11c/0x130
[   33.671397][    C6]  ? ep_poll+0x29e/0x4d0
[   33.672494][    C6]  ? _raw_write_unlock_irq+0x1f/0x30
[   33.673811][    C6]  ? ep_poll+0x45c/0x4d0
[   33.674839][    C6]  ? wait_woken+0xa0/0xa0
[   33.676000][    C6]  ? do_epoll_wait+0xcd/0xf0
[   33.677184][    C6]  ? __x64_sys_epoll_wait+0x15/0x20
[   33.678433][    C6]  ? do_syscall_64+0x45/0x1d0
[   33.679637][    C6]  ? entry_SYSCALL_64_after_hwframe+0x49/0xb3

----------
Kernel 5.7.0-rc4 on an x86_64

localhost login: [   35.448925][    C6] sysrq: Show State
[   35.450252][    C6]   task                        PC stack   pid father
[   35.452227][    C6] systemd         S11944     1      0 0x00000000
[   35.454001][    C6] Call Trace:
[   35.454983][    C6]  ? __schedule+0x27d/0x770
[   35.456288][    C6]  schedule+0x5a/0xd0
[   35.457393][    C6]  schedule_hrtimeout_range_clock+0x11c/0x130
[   35.459106][    C6]  ? ep_poll+0x40b/0x4c0
[   35.460306][    C6]  ? _raw_write_unlock_irq+0x1f/0x30
[   35.461840][    C6]  ep_poll+0x38f/0x4c0
[   35.463017][    C6]  ? wake_up_q+0x90/0x90
[   35.464230][    C6]  do_epoll_wait+0xcd/0xf0
[   35.465464][    C6]  __x64_sys_epoll_wait+0x15/0x20
[   35.466878][    C6]  do_syscall_64+0x45/0x1d0
[   35.468073][    C6]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
[   35.469657][    C6] RIP: 0033:0x7f1f191b0e93
[   35.470910][    C6] Code: Bad RIP value.
[   35.472046][    C6] RSP: 002b:00007fffc4c25c88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e8
[   35.474303][    C6] RAX: ffffffffffffffda RBX: 00007fffc4c25c90 RCX: 00007f1f191b0e93
[   35.476852][    C6] RDX: 0000000000000035 RSI: 00007fffc4c25c90 RDI: 0000000000000004
[   35.479254][    C6] RBP: 00007fffc4c26000 R08: 9c23bff2734474f6 R09: 000055ec811556e0
[   35.481458][    C6] R10: 00000000ffffffff R11: 0000000000000246 R12: 0000000000000001
[   35.483643][    C6] R13: ffffffffffffffff R14: 0000000000000000 R15: 000055ec81fd23b0

----------


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks
       [not found]           ` <alpine.LSU.2.21.2005220940070.18061@pobox.suse.cz>
@ 2020-05-22 13:54             ` Josh Poimboeuf
  2020-05-22 18:04               ` [tip: x86/urgent] " tip-bot2 for Josh Poimboeuf
  0 siblings, 1 reply; 9+ messages in thread
From: Josh Poimboeuf @ 2020-05-22 13:54 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Tetsuo Handa, Ingo Molnar, x86, Dmitry Vyukov, Linus Torvalds,
	Andy Lutomirski, Dave Jones, Jann Horn, Peter Zijlstra,
	Thomas Gleixner, Vince Weaver

Normally, show_trace_log_lvl() scans the stack, looking for text
addresses to print.  In parallel, it unwinds the stack with
unwind_next_frame().  If the stack address matches the pointer returned
by unwind_get_return_address_ptr() for the current frame, the text
address is printed normally without a question mark.  Otherwise it's
considered a breadcrumb (potentially from a previous call path) and it's
printed with a question mark to indicate that the address is unreliable
and typically can be ignored.

Since the following commit:

  f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")

... for inactive tasks, show_trace_log_lvl() prints *only* unreliable
addresses (prepended with '?').

That happens because, for the first frame of an inactive task,
unwind_get_return_address_ptr() returns the wrong return address
pointer: one word *below* the task stack pointer.  show_trace_log_lvl()
starts scanning at the stack pointer itself, so it never finds the first
'reliable' address, causing only guesses to being printed.

The first frame of an inactive task isn't a normal stack frame.  It's
actually just an instance of 'struct inactive_task_frame' which is left
behind by __switch_to_asm().  Now that this inactive frame is actually
exposed to callers, fix unwind_get_return_address_ptr() to interpret it
properly.

Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/kernel/unwind_orc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index fa79e4227d3d..7f969b2d240f 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -320,12 +320,19 @@ EXPORT_SYMBOL_GPL(unwind_get_return_address);
 
 unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
 {
+	struct task_struct *task = state->task;
+
 	if (unwind_done(state))
 		return NULL;
 
 	if (state->regs)
 		return &state->regs->ip;
 
+	if (task != current && state->sp == task->thread.sp) {
+		struct inactive_task_frame *frame = (void *)task->thread.sp;
+		return &frame->ret_addr;
+	}
+
 	if (state->sp)
 		return (unsigned long *)state->sp - 1;
 
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [tip: x86/urgent] x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks
  2020-05-22 13:54             ` [PATCH] x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks Josh Poimboeuf
@ 2020-05-22 18:04               ` tip-bot2 for Josh Poimboeuf
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Josh Poimboeuf @ 2020-05-22 18:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Tetsuo Handa, Josh Poimboeuf, Peter Zijlstra (Intel), x86, LKML

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     187b96db5ca79423618dfa29a05c438c34f9e1f0
Gitweb:        https://git.kernel.org/tip/187b96db5ca79423618dfa29a05c438c34f9e1f0
Author:        Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate:    Fri, 22 May 2020 08:54:35 -05:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 22 May 2020 19:55:17 +02:00

x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks

Normally, show_trace_log_lvl() scans the stack, looking for text
addresses to print.  In parallel, it unwinds the stack with
unwind_next_frame().  If the stack address matches the pointer returned
by unwind_get_return_address_ptr() for the current frame, the text
address is printed normally without a question mark.  Otherwise it's
considered a breadcrumb (potentially from a previous call path) and it's
printed with a question mark to indicate that the address is unreliable
and typically can be ignored.

Since the following commit:

  f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")

... for inactive tasks, show_trace_log_lvl() prints *only* unreliable
addresses (prepended with '?').

That happens because, for the first frame of an inactive task,
unwind_get_return_address_ptr() returns the wrong return address
pointer: one word *below* the task stack pointer.  show_trace_log_lvl()
starts scanning at the stack pointer itself, so it never finds the first
'reliable' address, causing only guesses to being printed.

The first frame of an inactive task isn't a normal stack frame.  It's
actually just an instance of 'struct inactive_task_frame' which is left
behind by __switch_to_asm().  Now that this inactive frame is actually
exposed to callers, fix unwind_get_return_address_ptr() to interpret it
properly.

Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200522135435.vbxs7umku5pyrdbk@treble
---
 arch/x86/kernel/unwind_orc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index fa79e42..7f969b2 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -320,12 +320,19 @@ EXPORT_SYMBOL_GPL(unwind_get_return_address);
 
 unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
 {
+	struct task_struct *task = state->task;
+
 	if (unwind_done(state))
 		return NULL;
 
 	if (state->regs)
 		return &state->regs->ip;
 
+	if (task != current && state->sp == task->thread.sp) {
+		struct inactive_task_frame *frame = (void *)task->thread.sp;
+		return &frame->ret_addr;
+	}
+
 	if (state->sp)
 		return (unsigned long *)state->sp - 1;
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: INFO: task hung in locks_remove_posix
  2020-05-20 20:53 INFO: task hung in locks_remove_posix syzbot
  2020-05-21 14:09 ` Tetsuo Handa
@ 2020-05-25  0:14 ` syzbot
  1 sibling, 0 replies; 9+ messages in thread
From: syzbot @ 2020-05-25  0:14 UTC (permalink / raw)
  To: andreyknvl, bfields, dvyukov, hdanton, jlayton, linux-fsdevel,
	linux-kernel, linux-usb, penguin-kernel, penguin-kernel,
	syzkaller-bugs, viro

syzbot has found a reproducer for the following crash on:

HEAD commit:    806d8acc USB: dummy-hcd: use configurable endpoint naming ..
git tree:       https://github.com/google/kasan.git usb-fuzzer
console output: https://syzkaller.appspot.com/x/log.txt?x=1109c09a100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d800e9bad158025f
dashboard link: https://syzkaller.appspot.com/bug?extid=f5bc30abd8916982419c
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=171ea49a100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+f5bc30abd8916982419c@syzkaller.appspotmail.com

INFO: task syz-executor.1:3125 blocked for more than 143 seconds.
      Not tainted 5.7.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.1  D26504  3125    389 0x80004006
Call Trace:
 context_switch kernel/sched/core.c:3367 [inline]
 __schedule+0x892/0x1d80 kernel/sched/core.c:4083
 locks_remove_posix+0x277/0x4e0 fs/locks.c:2706
 __sched_text_start+0x8/0x8
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 prepare_to_wait_event+0x129/0x650 kernel/sched/wait.c:305
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 schedule+0xcd/0x2b0 kernel/sched/core.c:4158
 wdm_flush+0x2ea/0x3c0 drivers/usb/class/cdc-wdm.c:590
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 finish_wait+0x260/0x260 include/linux/list.h:301
 task_work_add+0x97/0x120 kernel/task_work.c:35
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 filp_close+0xb4/0x170 fs/open.c:1251
 close_files fs/file.c:388 [inline]
 put_files_struct fs/file.c:416 [inline]
 put_files_struct+0x1d8/0x2e0 fs/file.c:413
 exit_files+0x7e/0xa0 fs/file.c:445
 do_exit+0xb36/0x2c80 kernel/exit.c:791
 find_held_lock+0x2d/0x110 kernel/locking/lockdep.c:4458
 mm_update_next_owner+0x7a0/0x7a0 kernel/exit.c:375
 lock_downgrade+0x720/0x720 kernel/locking/lockdep.c:4599
 do_group_exit+0x125/0x340 kernel/exit.c:894
 get_signal+0x480/0x2480 kernel/signal.c:2739
 putname+0xe1/0x120 fs/namei.c:259
 lock_is_held include/linux/lockdep.h:406 [inline]
 rcu_read_lock_sched_held+0x9c/0xd0 kernel/rcu/update.c:121
 rcu_read_lock_bh_held+0xb0/0xb0 kernel/rcu/update.c:333
 do_signal+0x88/0x1ae0 arch/x86/kernel/signal.c:784
 putname+0xe1/0x120 fs/namei.c:259
 do_sys_openat2+0x46c/0x7d0 fs/open.c:1158
 force_valid_ss arch/x86/kernel/signal.c:73 [inline]
 restore_sigcontext+0x620/0x620 arch/x86/kernel/signal.c:134
 _copy_to_user+0x126/0x160 lib/usercopy.c:31
 put_timespec64+0xcb/0x120 kernel/time/time.c:812
 ns_to_kernel_old_timeval+0x100/0x100 kernel/time/time.c:521
 do_sys_open+0xc3/0x140 fs/open.c:1164
 filp_open+0x70/0x70 fs/open.c:1117
 exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x4e0/0x5a0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
INFO: task syz-executor.2:3132 blocked for more than 143 seconds.
      Not tainted 5.7.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2  D28552  3132    380 0x80004006
Call Trace:
 context_switch kernel/sched/core.c:3367 [inline]
 __schedule+0x892/0x1d80 kernel/sched/core.c:4083
 locks_remove_posix+0x277/0x4e0 fs/locks.c:2706
 __sched_text_start+0x8/0x8
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 prepare_to_wait_event+0x129/0x650 kernel/sched/wait.c:305
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 schedule+0xcd/0x2b0 kernel/sched/core.c:4158
 wdm_flush+0x2ea/0x3c0 drivers/usb/class/cdc-wdm.c:590
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 finish_wait+0x260/0x260 include/linux/list.h:301
 task_work_add+0x97/0x120 kernel/task_work.c:35
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 filp_close+0xb4/0x170 fs/open.c:1251
 close_files fs/file.c:388 [inline]
 put_files_struct fs/file.c:416 [inline]
 put_files_struct+0x1d8/0x2e0 fs/file.c:413
 exit_files+0x7e/0xa0 fs/file.c:445
 do_exit+0xb36/0x2c80 kernel/exit.c:791
 find_held_lock+0x2d/0x110 kernel/locking/lockdep.c:4458
 mm_update_next_owner+0x7a0/0x7a0 kernel/exit.c:375
 lock_downgrade+0x720/0x720 kernel/locking/lockdep.c:4599
 do_group_exit+0x125/0x340 kernel/exit.c:894
 get_signal+0x480/0x2480 kernel/signal.c:2739
 schedule_timeout_idle+0x80/0x80 kernel/time/timer.c:1942
 do_signal+0x88/0x1ae0 arch/x86/kernel/signal.c:784
 free_object+0x5/0x70 lib/debugobjects.c:429
 destroy_hrtimer_on_stack kernel/time/hrtimer.c:453 [inline]
 hrtimer_nanosleep+0x211/0x3a0 kernel/time/hrtimer.c:1947
 nanosleep_copyout+0x100/0x100 kernel/time/hrtimer.c:1861
 force_valid_ss arch/x86/kernel/signal.c:73 [inline]
 restore_sigcontext+0x620/0x620 arch/x86/kernel/signal.c:134
 hrtimer_init_sleeper_on_stack+0x90/0x90 kernel/time/hrtimer.c:1833
 put_old_itimerspec32+0x1d0/0x1d0 kernel/time/time.c:908
 __do_sys_nanosleep kernel/time/hrtimer.c:1962 [inline]
 __se_sys_nanosleep kernel/time/hrtimer.c:1953 [inline]
 __x64_sys_nanosleep+0x1ed/0x260 kernel/time/hrtimer.c:1953
 hrtimer_nanosleep+0x3a0/0x3a0 kernel/time/hrtimer.c:1943
 exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x4e0/0x5a0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
INFO: task syz-executor.3:3150 blocked for more than 143 seconds.
      Not tainted 5.7.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.3  D29048  3150    386 0x80004006
Call Trace:
 context_switch kernel/sched/core.c:3367 [inline]
 __schedule+0x892/0x1d80 kernel/sched/core.c:4083
 locks_remove_posix+0x277/0x4e0 fs/locks.c:2706
 __sched_text_start+0x8/0x8
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 prepare_to_wait_event+0x129/0x650 kernel/sched/wait.c:305
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 schedule+0xcd/0x2b0 kernel/sched/core.c:4158
 wdm_flush+0x2ea/0x3c0 drivers/usb/class/cdc-wdm.c:590
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 finish_wait+0x260/0x260 include/linux/list.h:301
 task_work_add+0x97/0x120 kernel/task_work.c:35
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 filp_close+0xb4/0x170 fs/open.c:1251
 close_files fs/file.c:388 [inline]
 put_files_struct fs/file.c:416 [inline]
 put_files_struct+0x1d8/0x2e0 fs/file.c:413
 exit_files+0x7e/0xa0 fs/file.c:445
 do_exit+0xb36/0x2c80 kernel/exit.c:791
 find_held_lock+0x2d/0x110 kernel/locking/lockdep.c:4458
 mm_update_next_owner+0x7a0/0x7a0 kernel/exit.c:375
 lock_downgrade+0x720/0x720 kernel/locking/lockdep.c:4599
 do_group_exit+0x125/0x340 kernel/exit.c:894
 get_signal+0x480/0x2480 kernel/signal.c:2739
 down_interruptible+0x4b/0x80 kernel/locking/semaphore.c:85
 do_signal+0x88/0x1ae0 arch/x86/kernel/signal.c:784
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 __trace_hardirqs_on_caller kernel/locking/lockdep.c:3657 [inline]
 lockdep_hardirqs_on+0x3c7/0x5d0 kernel/locking/lockdep.c:3702
 force_valid_ss arch/x86/kernel/signal.c:73 [inline]
 restore_sigcontext+0x620/0x620 arch/x86/kernel/signal.c:134
 down_interruptible+0x4b/0x80 kernel/locking/semaphore.c:85
 raw_ioctl+0x11f/0x2570 drivers/usb/gadget/legacy/raw_gadget.c:1261
 ksys_dup3+0x3c0/0x3c0 include/linux/compiler.h:199
 raw_open+0x4d0/0x4d0 include/linux/semaphore.h:34
 down_read_nested+0x430/0x430 include/linux/compiler.h:199
 exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x4e0/0x5a0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
INFO: task syz-executor.4:3151 blocked for more than 143 seconds.
      Not tainted 5.7.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.4  D29048  3151    384 0x80004006
Call Trace:
 context_switch kernel/sched/core.c:3367 [inline]
 __schedule+0x892/0x1d80 kernel/sched/core.c:4083
 locks_remove_posix+0x277/0x4e0 fs/locks.c:2706
 __sched_text_start+0x8/0x8
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 prepare_to_wait_event+0x129/0x650 kernel/sched/wait.c:305
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 schedule+0xcd/0x2b0 kernel/sched/core.c:4158
 wdm_flush+0x2ea/0x3c0 drivers/usb/class/cdc-wdm.c:590
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 finish_wait+0x260/0x260 include/linux/list.h:301
 task_work_add+0x97/0x120 kernel/task_work.c:35
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 filp_close+0xb4/0x170 fs/open.c:1251
 close_files fs/file.c:388 [inline]
 put_files_struct fs/file.c:416 [inline]
 put_files_struct+0x1d8/0x2e0 fs/file.c:413
 exit_files+0x7e/0xa0 fs/file.c:445
 do_exit+0xb36/0x2c80 kernel/exit.c:791
 find_held_lock+0x2d/0x110 kernel/locking/lockdep.c:4458
 mm_update_next_owner+0x7a0/0x7a0 kernel/exit.c:375
 lock_downgrade+0x720/0x720 kernel/locking/lockdep.c:4599
 do_group_exit+0x125/0x340 kernel/exit.c:894
 get_signal+0x480/0x2480 kernel/signal.c:2739
 down_interruptible+0x4b/0x80 kernel/locking/semaphore.c:85
 do_signal+0x88/0x1ae0 arch/x86/kernel/signal.c:784
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 __trace_hardirqs_on_caller kernel/locking/lockdep.c:3657 [inline]
 lockdep_hardirqs_on+0x3c7/0x5d0 kernel/locking/lockdep.c:3702
 force_valid_ss arch/x86/kernel/signal.c:73 [inline]
 restore_sigcontext+0x620/0x620 arch/x86/kernel/signal.c:134
 down_interruptible+0x4b/0x80 kernel/locking/semaphore.c:85
 raw_ioctl+0x11f/0x2570 drivers/usb/gadget/legacy/raw_gadget.c:1261
 ksys_dup3+0x3c0/0x3c0 include/linux/compiler.h:199
 raw_open+0x4d0/0x4d0 include/linux/semaphore.h:34
 down_read_nested+0x430/0x430 include/linux/compiler.h:199
 exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x4e0/0x5a0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
INFO: task syz-executor.0:3152 blocked for more than 144 seconds.
      Not tainted 5.7.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.0  D29048  3152    383 0x80004006
Call Trace:
 context_switch kernel/sched/core.c:3367 [inline]
 __schedule+0x892/0x1d80 kernel/sched/core.c:4083
 locks_remove_posix+0x277/0x4e0 fs/locks.c:2706
 __sched_text_start+0x8/0x8
 spin_unlock_irqrestore include/linux/spinlock.h:408 [inline]
 prepare_to_wait_event+0x129/0x650 kernel/sched/wait.c:305
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 schedule+0xcd/0x2b0 kernel/sched/core.c:4158
 wdm_flush+0x2ea/0x3c0 drivers/usb/class/cdc-wdm.c:590
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 finish_wait+0x260/0x260 include/linux/list.h:301
 task_work_add+0x97/0x120 kernel/task_work.c:35
 wdm_poll+0x280/0x280 include/linux/poll.h:50
 filp_close+0xb4/0x170 fs/open.c:1251
 close_files fs/file.c:388 [inline]
 put_files_struct fs/file.c:416 [inline]
 put_files_struct+0x1d8/0x2e0 fs/file.c:413
 exit_files+0x7e/0xa0 fs/file.c:445
 do_exit+0xb36/0x2c80 kernel/exit.c:791
 find_held_lock+0x2d/0x110 kernel/locking/lockdep.c:4458
 mm_update_next_owner+0x7a0/0x7a0 kernel/exit.c:375
 lock_downgrade+0x720/0x720 kernel/locking/lockdep.c:4599
 do_group_exit+0x125/0x340 kernel/exit.c:894
 get_signal+0x480/0x2480 kernel/signal.c:2739
 down_interruptible+0x4b/0x80 kernel/locking/semaphore.c:85
 do_signal+0x88/0x1ae0 arch/x86/kernel/signal.c:784
 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
 _raw_spin_unlock_irqrestore+0x39/0x40 kernel/locking/spinlock.c:191
 __trace_hardirqs_on_caller kernel/locking/lockdep.c:3657 [inline]
 lockdep_hardirqs_on+0x3c7/0x5d0 kernel/locking/lockdep.c:3702
 force_valid_ss arch/x86/kernel/signal.c:73 [inline]
 restore_sigcontext+0x620/0x620 arch/x86/kernel/signal.c:134
 down_interruptible+0x4b/0x80 kernel/locking/semaphore.c:85
 raw_ioctl+0x11f/0x2570 drivers/usb/gadget/legacy/raw_gadget.c:1261
 ksys_dup3+0x3c0/0x3c0 include/linux/compiler.h:199
 raw_open+0x4d0/0x4d0 include/linux/semaphore.h:34
 down_read_nested+0x430/0x430 include/linux/compiler.h:199
 exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x4e0/0x5a0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3

Showing all locks held in the system:
1 lock held by khungtaskd/23:
 #0: ffffffff87111260 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x264 kernel/locking/lockdep.c:5754
2 locks held by in:imklog/271:
 #0: ffff8881c8468370 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826
 #1: ffffffff8719fce0 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.0+0x0/0x30 include/asm-generic/bitops/instrumented-atomic.h:30

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 23 Comm: khungtaskd Not tainted 5.7.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xef/0x16e lib/dump_stack.c:118
 irq_force_complete_move.cold+0x13/0x47 arch/x86/kernel/apic/vector.c:1023
 nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
 lapic_can_unplug_cpu.cold+0x3b/0x3b
 nmi_trigger_cpumask_backtrace+0x1db/0x207 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
 watchdog+0xa99/0xfd0 kernel/hung_task.c:289
 reset_hung_task_detector+0x30/0x30 kernel/hung_task.c:243
 kthread+0x326/0x430 kernel/kthread.c:268
 kthread_create_on_node+0xf0/0xf0 kernel/kthread.c:405
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:351
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline]
NMI backtrace for cpu 0 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline]
NMI backtrace for cpu 0 skipped: idling at default_idle+0x28/0x300 arch/x86/kernel/process.c:697


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-05-25  0:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-20 20:53 INFO: task hung in locks_remove_posix syzbot
2020-05-21 14:09 ` Tetsuo Handa
2020-05-21 14:21   ` Dmitry Vyukov
2020-05-21 23:34     ` [5.7-rc5 x86 regression] ORC unwinder generates unreliable traces Tetsuo Handa
     [not found]       ` <54652cf1-ca04-c3ec-a2fe-d0f47484fb5f@i-love.sakura.ne.jp>
     [not found]         ` <20200522043037.fny37n7kjbfyrxo4@treble>
     [not found]           ` <alpine.LSU.2.21.2005220940070.18061@pobox.suse.cz>
2020-05-22 13:54             ` [PATCH] x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks Josh Poimboeuf
2020-05-22 18:04               ` [tip: x86/urgent] " tip-bot2 for Josh Poimboeuf
2020-05-21 14:48   ` INFO: task hung in locks_remove_posix Jeff Layton
2020-05-21 16:27     ` Andrey Konovalov
2020-05-25  0:14 ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).