From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pg1-f194.google.com ([209.85.215.194]:44305 "EHLO
        mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1727077AbeH1Aq2 (ORCPT );
        Mon, 27 Aug 2018 20:46:28 -0400
Received: by mail-pg1-f194.google.com with SMTP id r1-v6so124667pgp.11
        for ; Mon, 27 Aug 2018 13:58:11 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To:
References: <001a113f711ae2110c055f45acb8@google.com>
        <20171212220647.GJ185376@gmail.com>
From: Dmitry Vyukov
Date: Mon, 27 Aug 2018 13:57:50 -0700
Message-ID:
Subject: Re: possible deadlock in seq_read
To: Kees Cook
Cc: Eric Biggers, syzbot, "linux-fsdevel@vger.kernel.org", LKML,
        syzkaller-bugs, Al Viro
Content-Type: text/plain; charset="UTF-8"
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Mon, Aug 27, 2018 at 11:20 AM, Kees Cook wrote:
> On Tue, Dec 12, 2017 at 2:06 PM, Eric Biggers wrote:
>> On Fri, Dec 01, 2017 at 03:29:01AM -0800, syzbot wrote:
>>> Hello,
>>>
>>> syzkaller hit the following crash on
>>> df8ba95c572a187ed2aa7403e97a7a7f58c01f00
>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>>
>>> Unfortunately, I don't have any reproducer for this bug yet.
>>>
>>>
>>>
>>> ======================================================
>>> WARNING: possible circular locking dependency detected
>>> 4.15.0-rc1+ #202 Not tainted
>>> ------------------------------------------------------
>>> syz-executor4/26476 is trying to acquire lock:
>>>  (&p->lock){+.+.}, at: [<0000000040185b66>] seq_read+0xd5/0x13d0
>>> fs/seq_file.c:165
>>>
>>> but task is already holding lock:
>>>  (&pipe->mutex/1){+.+.}, at: [<00000000c644bcdc>] pipe_lock_nested
>>> fs/pipe.c:67 [inline]
>>>  (&pipe->mutex/1){+.+.}, at: [<00000000c644bcdc>]
>>> pipe_lock+0x56/0x70 fs/pipe.c:75
>>>
>>> which lock already depends on the new lock.
>>>
>>>
>>> the existing dependency chain (in reverse order) is:
>>>
>>> -> #2 (&pipe->mutex/1){+.+.}:
>>>        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
>>>        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>>        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>>>        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>>        __pipe_lock fs/pipe.c:88 [inline]
>>>        fifo_open+0x15c/0xa40 fs/pipe.c:916
>>>        do_dentry_open+0x682/0xd70 fs/open.c:752
>>>        vfs_open+0x107/0x230 fs/open.c:866
>>>        do_last fs/namei.c:3379 [inline]
>>>        path_openat+0x1157/0x3530 fs/namei.c:3519
>>>        do_filp_open+0x25b/0x3b0 fs/namei.c:3554
>>>        do_open_execat+0x1b9/0x5c0 fs/exec.c:849
>>>        do_execveat_common.isra.30+0x90c/0x23c0 fs/exec.c:1741
>>>        do_execveat fs/exec.c:1859 [inline]
>>>        SYSC_execveat fs/exec.c:1940 [inline]
>>>        SyS_execveat+0x4f/0x60 fs/exec.c:1932
>>>        do_syscall_64+0x26c/0x920 arch/x86/entry/common.c:285
>>>        return_from_SYSCALL_64+0x0/0x75
>>>
>>> -> #1 (&sig->cred_guard_mutex){+.+.}:
>>>        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
>>>        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>>        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>>>        mutex_lock_killable_nested+0x16/0x20 kernel/locking/mutex.c:923
>>>        do_io_accounting+0x1c2/0xf50 fs/proc/base.c:2682
>>>        proc_tid_io_accounting+0x1f/0x30 fs/proc/base.c:2725
>>>        proc_single_show+0xf8/0x170 fs/proc/base.c:744
>>>        seq_read+0x385/0x13d0 fs/seq_file.c:234
>>>        __vfs_read+0xef/0xa00 fs/read_write.c:411
>>>        vfs_read+0x124/0x360 fs/read_write.c:447
>>>        SYSC_read fs/read_write.c:573 [inline]
>>>        SyS_read+0xef/0x220 fs/read_write.c:566
>>>        entry_SYSCALL_64_fastpath+0x1f/0x96
>>>
>>
>> So the problem with all these deadlocks involving pipe->mutex and
>> sig->cred_guard_mutex is that execve() ranks pipe->mutex below
>> sig->cred_guard_mutex when it tries to open a fifo, whereas reading or writing
>> some of the /proc files result in ->cred_guard_mutex being taken which may be
>> underneath pipe->mutex from splice(). Here's a program which causes an actual
>> deadlock using this bug (in addition to reproducing the lockdep report):
>>
>>         #define _GNU_SOURCE
>>         #include <fcntl.h>
>>         #include <pthread.h>
>>         #include <sys/stat.h>
>>         #include <unistd.h>
>>
>>         static void *exec_thread(void *_arg)
>>         {
>>                 for (;;)
>>                         execl("fifo", "fifo", NULL);
>>         }
>>
>>         int main()
>>         {
>>                 int readend, writeend;
>>                 int syscallfd;
>>                 pthread_t t;
>>
>>                 mknod("fifo", 0777|S_IFIFO, 0);
>>                 readend = open("fifo", O_RDONLY|O_NONBLOCK);
>>                 writeend = open("fifo", O_WRONLY);
>>                 syscallfd = open("/proc/self/syscall", O_RDONLY);
>>
>>                 pthread_create(&t, NULL, exec_thread, NULL);
>>
>>                 for (;;) {
>>                         char buffer[16];
>>                         loff_t off_in = 0;
>>                         splice(syscallfd, &off_in, writeend, NULL, 16, 0);
>>                         read(readend, buffer, 16);
>>                 }
>>         }
>>
>> I'm not sure what the fix will be. Maybe the proc handlers should take a
>> different lock instead of cred_guard_mutex. Or perhaps execve should check that
>> the file is a regular file before it attempts to open it.
>
> This cleaner reproducer still generates the lockdep warning (but I can
> ctrl-C out of it without leaving behind a zombie), but I see that
> syzbot isn't seeing this any more. Why did it stop? (And can we feed a
> reproducer in to syzbot?)
>
> Was this creating an uninterruptible deadlock before? (Perhaps
> something did change here?)

Hi,

This never was an uninterruptible deadlock. This is a lockdep
"potential deadlock". For the deadlock to actually happen you need a
very precise thread interleaving: two threads each take a different
lock first, and then each tries to take the lock the other one holds.
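
To make the "potential deadlock" point concrete, here is a rough toy
model of an ordering check in the spirit of lockdep. It is only a
sketch, not how lockdep is implemented: the enum names, reset_deps(),
reachable() and acquire_while_holding() are all made up for this
illustration. It records "B was acquired while A was held" edges and
flags a new acquisition as soon as it would close a cycle, whether or
not any thread ever blocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the three locks in the report. */
enum lock_class { SEQ_P_LOCK, CRED_GUARD_MUTEX, PIPE_MUTEX, NR_LOCK_CLASSES };

/* dep[a][b] is true once "b was acquired while a was held" has been seen. */
static bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Forget every dependency recorded so far. */
static void reset_deps(void)
{
	memset(dep, 0, sizeof(dep));
}

/* Depth-first search: do the recorded dependencies lead from 'from' to 'to'? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record that 'wanted' is being acquired while 'held' is held.
 * Returns true if the new edge closes a cycle, i.e. some earlier
 * code path already ordered 'wanted' before 'held'.
 */
static bool acquire_while_holding(int held, int wanted)
{
	bool seen[NR_LOCK_CLASSES] = { false };
	bool closes_cycle;

	assert(held >= 0 && held < NR_LOCK_CLASSES);
	assert(wanted >= 0 && wanted < NR_LOCK_CLASSES);

	closes_cycle = reachable(wanted, held, seen);
	dep[held][wanted] = true;
	return closes_cycle;
}
```

Feeding it the orderings from the report: cred_guard_mutex then
pipe->mutex (the execve path, #2) and p->lock then cred_guard_mutex
(the seq_read path, #1) are both accepted. The splice path, which holds
pipe->mutex and then wants p->lock, is the one that gets flagged,
because the two earlier edges already lead from p->lock back to
pipe->mutex. Nothing in this model ever blocks. It only shows that the
three orderings cannot all be right.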