From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ot1-f54.google.com ([209.85.210.54]:43710 "EHLO mail-ot1-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726769AbeJCDUE (ORCPT ); Tue, 2 Oct 2018 23:20:04 -0400
Received: by mail-ot1-f54.google.com with SMTP id e21-v6so3283012otk.10 for ; Tue, 02 Oct 2018 13:34:54 -0700 (PDT)
MIME-Version: 1.0
From: Vijay Chidambaram
Date: Tue, 2 Oct 2018 15:34:17 -0500
Message-ID:
Subject: Crashmonkey and Ace: tools to test file-system crash consistency
To: linux-fsdevel , Jayashree Mohan , Ashlie Martinez , vijay@cs.utexas.edu
Content-Type: text/plain; charset="UTF-8"
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Hi,

My research group at UT Austin has developed two tools to help test file-system crash consistency. The tools found 10 previously undiscovered bugs in btrfs and F2FS (some of which have existed in the kernel since 2014). The tools work with any POSIX file system and do not require any modifications to file-system code.

The first tool, Crashmonkey, takes as input a workload (a sequence of file-system operations). It runs the workload on a new file system, simulates crashes after persistence points (fsync/fdatasync/sync), recovers the file system, and tests whether the file system recovered correctly. It does not use fsck to test crash consistency; it uses its own fine-grained checks instead.

The second tool, Automatic Crash Explorer (Ace), generates workloads that are fed to Crashmonkey. Given constraints such as the size of the workload or which file-system operations to use in the workload, Ace systematically generates all workloads that fall within the constraints.

Together, the tools provide push-button testing of crash consistency for POSIX file systems. Based on the computational budget for testing, you can configure the tools to test a different number of workloads.
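A rough sketch of how the two steps described above fit together, written in Python. This is not the actual Ace or Crashmonkey code: the operation names, the functions generate_workloads and crash_points, and the choice to model a workload as a fixed-length sequence of operation names are all assumptions made for illustration only.

```python
from itertools import product

# Persistence points named in the text above.
PERSISTENCE_OPS = frozenset({"fsync", "fdatasync", "sync"})


def generate_workloads(allowed_ops, size):
    """Yield every sequence of `size` operations drawn from `allowed_ops`.

    Hypothetical stand-in for Ace-style bounded, exhaustive generation:
    the constraints here are only the workload size and the allowed ops.
    """
    return product(allowed_ops, repeat=size)


def crash_points(workload, persistence_ops=PERSISTENCE_OPS):
    """Return the indices of operations after which a crash would be simulated.

    Hypothetical stand-in for the Crashmonkey-style rule that crashes are
    simulated only after persistence points.
    """
    return [i for i, op in enumerate(workload) if op in persistence_ops]


if __name__ == "__main__":
    # Illustrative constraints: three allowed operations, workloads of size 2.
    for workload in generate_workloads(("write", "rename", "fsync"), 2):
        print(workload, "crash after indices:", crash_points(workload))
```

With k allowed operations and a workload size of n, this enumeration produces k**n workloads, which is why the constraints determine how much computation a test run needs.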
The improvement over xfstests and dm-log-writes is that, given high-level constraints, our tools generate and test each workload automatically. With dm-log-writes and xfstests, you need to write each workload manually.

Our code is available here:
https://github.com/utsaslab/crashmonkey

We have instructions on how to run a single command to test your file system with over 300 simple workloads. This command should take about 35 min on a single-core machine.

The paper about the work is available here:
http://www.cs.utexas.edu/~jaya/pdf/osdi18-B3.pdf

Slides from our upcoming conference talk:
http://www.cs.utexas.edu/~jaya/slides/osdi18-B3-slides.pdf

Finally, a demo about the tool:
https://www.youtube.com/watch?v=6fiomPVK8o0&feature=youtu.be

We hope you find the tools useful. Thanks to Amir Goldstein and Ted Ts'o, who encouraged us in doing this work!

Thanks,
Vijay Chidambaram
http://www.cs.utexas.edu/~vijay/