From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNPARSEABLE_RELAY,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88421C433DB for ; Thu, 21 Jan 2021 19:32:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5888423A33 for ; Thu, 21 Jan 2021 19:32:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726448AbhAUTbs (ORCPT ); Thu, 21 Jan 2021 14:31:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726398AbhAUTFG (ORCPT ); Thu, 21 Jan 2021 14:05:06 -0500 Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 451AEC061786 for ; Thu, 21 Jan 2021 10:56:21 -0800 (PST) Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 3C4581F45F40 From: Gabriel Krisman Bertazi To: Jan Kara Cc: tytso@mit.edu, jack@suse.com, viro@zeniv.linux.org.uk, amir73il@gmail.com, dhowells@redhat.com, david@fromorbit.com, darrick.wong@oracle.com, khazhy@google.com, linux-fsdevel@vger.kernel.org, kernel@collabora.com Subject: Re: [RFC] Filesystem error notifications proposal Organization: Collabora References: <87lfcne59g.fsf@collabora.com> <20210121114453.GD24063@quack2.suse.cz> Date: Thu, 21 Jan 2021 15:56:12 -0300 In-Reply-To: <20210121114453.GD24063@quack2.suse.cz> (Jan Kara's message of "Thu, 21 Jan 2021 12:44:53 +0100") Message-ID: <87a6t2dsqb.fsf@collabora.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Jan Kara writes: > On Wed 20-01-21 17:13:15, Gabriel Krisman Bertazi wrote: >> fanotify is based on fsnotify, and the infrastructure for the >> visibility semantics is mostly implemented by fsnotify itself. It >> would be possible to make error notifications a new mechanism on top >> of fsnotify, without modifying fanotify, but that would require >> exposing a very similar interface to userspace, new system calls, and >> that doesn't seem justifiable since we can extend fanotify syscalls >> without ABI breakage. > > I agree fanotify could be used for propagating the error information from > kernel to userspace. But before we go to design how exactly the events > should look like, I'd like to hear a usecase (or usecases) for this > feature. Because the intended use very much determines which information we > need to propagate to userspace and what flexibility we need there. Hi Jan, Thanks for the review. My user is a cloud provider who wants to monitor several machines, potentially with several volumes each, for metadata and data corruption that might require attention from a sysadmin. They would run a watcher for each filesystem of each machine and consolidate the information. Right now, they have a solution in-house to monitor ext4, which basically exports to userspace any kind of condition that would trigger one of the ext4_error_* functions. We need to monitor a bit more than just writeback errors, but that is a big part of the functionality we want. We would be interested in finding out about inode corruption, checksum failures, and read errors as well. Khazhy, in Cc, is more familiar with the specific requirements, and might be able to provide more information on their usage of the existing solution. >> 3 Proposal >> ========== >> >> The error notification framework is based on fanotify instead of >> watch_queue. It is exposed through a new set of marks FAN_ERROR_*, >> exposed through the fanotify_mark(2) API. >> >> fanotify (fsnotify-based) has the infrastructure in-place to link >> notifications happening at filesystem objects to mountpoints and to >> filesystems, and already exposes an interface with well defined >> semantics of how those are exposed to watchers in different >> mountpoints or different subtrees. >> >> A new message format is exposed, if the user passed >> FAN_REPORT_DETAILED_ERROR fanotify_init(2) flag. FAN_ERROR messages >> don't have FID/DFID records. > > So this depends on the use case. I can imagine that we could introduce new > event type FAN_FS_ERROR (like we currently have FAN_MODIFY, FAN_ACCESS, > FAN_CREATE, ...). This event type would be generated when error happens on > fs object - we can generate it associated with struct file, struct dentry, > or only superblock, depending on what information is available at the place > where fs spots the error. Similarly to how we generate other fsnotify > events. But here's the usecase question - is there really need for fine > granularity of error reporting (i.e., someone watching errors only on a > particular file, or directory)? My use-case requires only watching an entire filesystem, and not specific files. I can imagine a use-case of someone watching a specific mount point, like an application running inside a container is notified only of errors in files/directories within that container. > > Then we can have FAN_REPORT_DETAILED_ERROR which would add to event more > information about the error like you write below - although identification > of the inode / fsid is IMO redundant - you can get all that information > (and more) from FID / DFID event info entries if you want it. When FID provides a handle, I'd need to get information on that handle without opening it, as the file might be corrupted. Would we need an interface to retrieve inode number by handle without opening the file, i.e. stat_by_handle? Thanks, -- Gabriel Krisman Bertazi