From: Byungchul Park
To: tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org
Cc: torvalds@linux-foundation.org, mingo@redhat.com, linux-kernel@vger.kernel.org, peterz@infradead.org, will@kernel.org, tglx@linutronix.de, rostedt@goodmis.org, joel@joelfernandes.org,
 sashal@kernel.org, daniel.vetter@ffwll.ch, chris@chris-wilson.co.uk, duyuyang@gmail.com, johannes.berg@intel.com, tj@kernel.org, willy@infradead.org, david@fromorbit.com, amir73il@gmail.com, bfields@fieldses.org, gregkh@linuxfoundation.org, kernel-team@lge.com, linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org, minchan@kernel.org, hannes@cmpxchg.org, vdavydov.dev@gmail.com, sj@kernel.org, jglisse@redhat.com, dennis@kernel.org, cl@linux.com, penberg@kernel.org, rientjes@google.com, vbabka@suse.cz, ngupta@vflare.org, linux-block@vger.kernel.org, axboe@kernel.dk, paolo.valente@linaro.org, josef@toxicpanda.com, linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk, jack@suse.cz, jlayton@kernel.org, dan.j.williams@intel.com, hch@infradead.org, djwong@kernel.org, dri-devel@lists.freedesktop.org, airlied@linux.ie, rodrigosiqueiramelo@gmail.com, melissa.srw@gmail.com, hamohammed.sa@gmail.com
Subject: [REPORT] ext4 deadlock possibilities by DEPT
Date: Wed, 16 Feb 2022 13:11:51 +0900
Message-Id: <1644984711-26423-1-git-send-email-byungchul.park@lge.com>

Hi Theodore, Andreas and ext4 folks,

I've been developing a tool for detecting deadlock possibilities by
tracking wait/event rather than lock(?) acquisition order, to try to
cover all synchronization mechanisms. It's done on the v5.17-rc1 tag.

https://github.com/lgebyungchulpark/linux-dept/commits/dept1.11_on_v5.17-rc1

Benefits:

   0. Works with all lock primitives.
   1. Works with wait_for_completion()/complete().
   2. Works with 'wait' on PG_locked.
   3. Works with 'wait' on PG_writeback.
   4. Works with swait/wakeup.
   5. Works with waitqueue.
   6. Multiple reports are allowed.
   7. Deduplication control on multiple reports.
   8. Withstands false positives thanks to 6.
   9. Easy to tag any wait/event.

Future work:

   0. To make it more stable.
   1. To separate Dept from Lockdep.
   2. To improve performance in terms of time and space.
   3. To use Dept as a dependency engine for Lockdep.
   4. To add any missing tags of wait/event in the kernel.
   5. To deduplicate stack traces.

I've got several reports from the tool. Some of them look like false
alarms caused by Lockdep's fake annotations added for better detection.
However, some others look like real deadlock possibilities. Because I'm
unfamiliar with the domain, it's hard for me to confirm whether they are
real. I'd appreciate your opinion on them.

How to interpret a report:

   1. E(event) in each context cannot be triggered because of the
      W(wait) that cannot be woken.
   2. The stack trace that helps find the problematic code is located
      in each context's detail.

Let me add the reports to this email thread.

---
Thanks,
Byungchul