From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.kernel.org ([198.145.29.99]:39322 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752086AbeDQLID (ORCPT ); Tue, 17 Apr 2018 07:08:03 -0400
Message-ID: <1523963281.4779.21.camel@kernel.org>
Subject: [LSF/MM TOPIC] improving writeback error handling
From: Jeff Layton
To: lsf-pc@lists.linux-foundation.org
Cc: linux-fsdevel, Matthew Wilcox, Andres Freund
Date: Tue, 17 Apr 2018 07:08:01 -0400
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

I'd like to have a discussion on how to improve the handling of errors
that occur during writeback. I think there are 4 main issues that would
be helpful to cover:

1) In v4.16, we added the errseq mechanism to the kernel to make
writeback error reporting more reliable. That has helped some use
cases, but there are some applications (e.g. PostgreSQL) that were
relying on the ability to see writeback errors that occurred before the
file description existed. Do we need to tweak this mechanism to help
those use cases, or would that do more harm than good?

2) Most filesystems now report data writeback errors on all fds that
were open at the time of the error, but metadata writeback can also
fail, and those failures are not yet reported in the same way. Should
we extend those semantics to metadata writeback? If so, how do we get
there?

3) The question of what to do with pages in the pagecache that fail
writeback is still unresolved. Most filesystems just clear the dirty
bit and carry on, but some invalidate the pages or just clear the
uptodate bit. This sort of inconsistency (and lack of documentation) is
problematic and has led to applications assuming behavior that doesn't
exist.
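As an aside, a minimal userspace sketch of the write-then-fsync pattern
the errseq discussion concerns (the path and function name below are
hypothetical, just for illustration):

```c
/* Sketch of the error-checking pattern applications like PostgreSQL
 * use. With the v4.16 errseq_t mechanism, a writeback error is
 * reported at least once on every fd that was open when the error
 * occurred; an fd opened only after the error will not see it, which
 * is the concern raised in point 1 above. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int write_and_sync(const char *path, const char *data)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	/* Short writes are ignored here for brevity. */
	if (write(fd, data, strlen(data)) < 0) {
		close(fd);
		return -1;
	}
	/* fsync() is where a prior writeback error surfaces as EIO;
	 * the application must check it to learn the data is at risk. */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```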
I believe we need to declare an "ideal behavior" for Linux filesystems
in this regard, add VM/FS helpers to make it easier for filesystems to
conform to that behavior, and document it well. The big question is:
what sort of behavior makes the most sense here?

4) syncfs doesn't currently report an error when a single inode fails
writeback, only when syncing out the block device fails. Should it
report errors in that case as well?

Thanks!
-- 
Jeff Layton