All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Slaby <jirislaby@kernel.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: christian@brauner.io, "Eric W. Biederman" <ebiederm@xmission.com>,
	Linux kernel mailing list <linux-kernel@vger.kernel.org>
Subject: Re: 5.8-rc*: kernel BUG at kernel/signal.c:1917
Date: Mon, 20 Jul 2020 08:00:36 +0200	[thread overview]
Message-ID: <0deb1081-299b-1475-f368-7ca65b535a9c@kernel.org> (raw)
In-Reply-To: <20200718171406.GB16791@redhat.com>

On 18. 07. 20, 19:14, Oleg Nesterov wrote:
> On 07/18, Jiri Slaby wrote:
>>
>> On 17. 07. 20, 14:40, Oleg Nesterov wrote:
>>>
>>> please see the updated patch below, lets check ptrace_unfreeze() too.
>>
>> Sure, dmesg attached.
> 
> Thanks a lot!
> 
> But I am totally confused...
> 
>> [   94.513944] ------------[ cut here ]------------
>> [   94.513985] do not call blocking ops when !TASK_RUNNING; state=8 set at [<000000002fe279e9>] ptrace_check_attach+0xbf/0x110
> 
> OK, so the ptracer has already did the TASK_TRACED -> __TASK_TRACED change in
> ptrace_freeze_traced(),
> 
>> [   94.514019] WARNING: CPU: 16 PID: 34171 at kernel/sched/core.c:6881 __might_sleep+0x6c/0x70
>> [   94.514020] Modules linked in: ata_generic(E) pata_acpi(E) crc32_pclmul(E) qemu_fw_cfg(E) ata_piix(E) e1000(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) virtio_blk(E) virtio_mmio(E) xfs(E) btrfs(E) blake2b_generic(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_intel(E) reiserfs(E) squashfs(E) fuse(E) dm_snapshot(E) dm_bufio(E) dm_crypt(E) dm_mod(E) binfmt_misc(E) loop(E) sg(E) virtio_rng(E)
>> [   94.514082] CPU: 16 PID: 34171 Comm: strace Tainted: G            E     5.8.0-rc5-100.g55927f9-default #1 openSUSE Tumbleweed (unreleased)
>> [   94.514084] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
>> [   94.514087] RIP: 0010:__might_sleep+0x6c/0x70
>> [   94.514090] Code: 41 5c 41 5d e9 25 ff ff ff 48 8b 90 68 1a 00 00 48 8b 70 10 48 c7 c7 10 45 70 8f c6 05 4f a9 68 01 01 48 89 d1 e8 7a bb fc ff <0f> 0b eb c8 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 49 89 fd
>> [   94.514092] RSP: 0018:ffff9ffa4ba1be00 EFLAGS: 00010286
>> [   94.514093] RAX: 0000000000000000 RBX: ffff8dc82b503e00 RCX: 0000000000000489
>> [   94.514094] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000247
>> [   94.514095] RBP: ffffffff8f6ffa6b R08: 0000000000000004 R09: 0000000000000489
>> [   94.514095] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000039
>> [   94.514096] R13: 0000000000000000 R14: 0000000000000001 R15: ffff8dc82b5045e4
>> [   94.514098] FS:  00007fa00f1f9240(0000) GS:ffff8dcb0c000000(0000) knlGS:0000000000000000
>> [   94.514099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   94.514100] CR2: 0000557b53d25877 CR3: 00000004ca490005 CR4: 0000000000360ee0
>> [   94.514103] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   94.514104] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   94.514105] Call Trace:
>> [   94.514821]  ptrace_stop+0x1a9/0x300
> 
> This is already wrong. But
> 
> 	Where does this __might_sleep() come from ??? I ses no blocking calls
> 	in ptrace_stop(). Not to mention it is called with ->siglock held and
> 	right after this lock is dropped we take tasklist_lock.

Decoded stacktrace:

> ptrace_stop (include/linux/freezer.h:57 include/linux/freezer.h:67 include/linux/freezer.h:128 include/linux/freezer.h:173 kernel/signal.c:2217)  
> ptrace_do_notify (kernel/signal.c:2272)  
> ptrace_notify (arch/x86/include/asm/paravirt.h:656 arch/x86/include/asm/qspinlock.h:55 include/linux/spinlock.h:211 include/linux/spinlock_api_smp.h:167 include/linux/spinlock.h:403 kernel/signal.c:2282)  
> syscall_trace_enter (include/linux/tracehook.h:73 include/linux/tracehook.h:104 arch/x86/entry/common.c:159)
> do_syscall_64 (arch/x86/entry/common.c:380)  
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:124)  

It is try_to_freeze_unsafe in try_to_freeze in freezable_schedule in
ptrace_stop.

> 
> 	How this connects to the debugging patch I sent? Did you see this warning
> 	without that patch?

I suppose this made it appear:
+CONFIG_PREEMPT_COUNT=y
-# CONFIG_DEBUG_ATOMIC_SLEEP is not set
+CONFIG_DEBUG_ATOMIC_SLEEP=y

>> [   94.514888] WARNING: CPU: 16 PID: 34171 at kernel/signal.c:2219 ptrace_stop+0x1d8/0x300
> ...
>> [   94.514899] RIP: 0010:ptrace_stop+0x1d8/0x300
> 
> This is WARN_ON(current->state) added to ptrace_stop(), this can explain
> BUG_ON() in do_notify_parent() you reported.
> 
> So, the tracee returns from schedule() with ->state != TASK_RUNNING ???
> This must not be possible.
> 
> OK, perhaps task->state was changed by ptrace_unfreeze_traced()? this can
> only happen if it races with ttwu(__TASK_TRACED) without ->siglock held,
> nobody should do this.
> 
> Strange.

Please see my other e-mail, all this is with dbfb089d360b applied. Maybe
it makes more sense now?

thanks,
-- 
js

  parent reply	other threads:[~2020-07-20  6:00 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-17 10:45 5.8-rc*: kernel BUG at kernel/signal.c:1917 Jiri Slaby
2020-07-17 11:04 ` Jiri Slaby
2020-07-17 11:12   ` Christian Brauner
2020-07-18 13:05     ` Jiri Slaby
2020-07-17 12:26   ` Oleg Nesterov
2020-07-17 12:40     ` Oleg Nesterov
2020-07-18 12:28       ` Jiri Slaby
2020-07-18 17:14         ` Oleg Nesterov
2020-07-18 17:44           ` Christian Brauner
2020-07-20  5:44             ` Jiri Slaby
2020-07-20  6:43               ` Oleg Nesterov
2020-07-20  8:26                 ` Oleg Nesterov
2020-07-20  8:41                   ` Peter Zijlstra
2020-07-20 10:59                     ` peterz
2020-07-20 11:26                       ` peterz
2020-07-20 11:40                         ` Jiri Slaby
2020-07-20 12:20                         ` Valentin Schneider
2020-07-20 13:17                           ` peterz
2020-07-20 14:26                             ` Valentin Schneider
2020-07-20 12:57                         ` Christian Brauner
2020-07-20 14:05                         ` peterz
2020-07-20 14:02                       ` Oleg Nesterov
2020-07-20 14:21                         ` Peter Zijlstra
2020-07-20 14:39                           ` Oleg Nesterov
2020-07-20 15:35                             ` Oleg Nesterov
2020-07-20 15:38                               ` Peter Zijlstra
2020-07-21  4:52                           ` Paul Gortmaker
2020-07-21  8:37                             ` peterz
2020-07-21 12:13                               ` [PATCH] sched: Fix race against ptrace_freeze_trace() peterz
2020-07-21 14:29                                 ` Christian Brauner
2020-07-21 15:38                                 ` Oleg Nesterov
2020-07-21  9:14                           ` 5.8-rc*: kernel BUG at kernel/signal.c:1917 Valentin Schneider
     [not found]           ` <20200719072726.5892-1-hdanton@sina.com>
2020-07-19 18:23             ` Oleg Nesterov
2020-07-20  6:00           ` Jiri Slaby [this message]
2020-07-20  6:56             ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0deb1081-299b-1475-f368-7ca65b535a9c@kernel.org \
    --to=jirislaby@kernel.org \
    --cc=christian@brauner.io \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.