From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D42AC5ACAE for ; Wed, 11 Sep 2019 16:13:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6A00F20863 for ; Wed, 11 Sep 2019 16:13:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1568218409; bh=/i/D7IWNKgFBFeRyXp88n15uUa7dgrKAdzJKMFVQkfU=; h=References:In-Reply-To:From:Date:Subject:To:Cc:List-ID:From; b=GRXyjESEallD77+nXYBBVr9Sc1tVRwEY+37VXaZIL3ZxHqhNa4++sleh+MKabmWkO ulGPfCrkMlO99VRYJco5L3JVA1Ro33R60jDcjp84AbBO4wUWm7u3oLXPJWkKGbeYNl hK1E0WU/b87IO6frrlpoA3leUWjJpvP/B1rEfrYw= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728899AbfIKQN2 (ORCPT ); Wed, 11 Sep 2019 12:13:28 -0400 Received: from mail.kernel.org ([198.145.29.99]:57448 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728825AbfIKQN2 (ORCPT ); Wed, 11 Sep 2019 12:13:28 -0400 Received: from mail-ua1-f42.google.com (mail-ua1-f42.google.com [209.85.222.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 4B94C20863 for ; Wed, 11 Sep 2019 16:13:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1568218407; bh=/i/D7IWNKgFBFeRyXp88n15uUa7dgrKAdzJKMFVQkfU=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=j6NFIk94OMOLBEGPtFFLhEJBmGs/AnKxZBYoRSm5YdVUeJLhao1MXXoY4V58mYe9q EzzR9lq62vgELo7cQxYh7tqidIo5Ov3YAd7CLpCVvf7EVIqM9Vr04/VByTW7+pbGXZ i+8FxDTJP6I8DEQWvQeBb6c60EarQUnKNn+U2/YY= Received: by mail-ua1-f42.google.com with SMTP id u18so6976780uap.2 for ; Wed, 11 Sep 2019 09:13:27 -0700 (PDT) X-Gm-Message-State: APjAAAX/dxFqjZfjg/38+8BR8BdjV/mOLpxF9TnS/SSXo3O3y92b1mGH I4Cbtc0OdAL5X6MLvBFswOp5IcUtOWvQmcPV8vo= X-Google-Smtp-Source: APXvYqyP8wz4mWDNXvVtE6wjvlD3AcHv479nnTQpTjBuQl/f4tOt+XO2zjq9gbfW01qGcPZU9MhA7y/uoRKgIrffjI0= X-Received: by 2002:ab0:2410:: with SMTP id f16mr17318666uan.83.1568218406277; Wed, 11 Sep 2019 09:13:26 -0700 (PDT) MIME-Version: 1.0 References: <20190911145542.1125-1-fdmanana@kernel.org> <18E989C2-3E39-4C87-907D-EA671099C4AE@fb.com> In-Reply-To: <18E989C2-3E39-4C87-907D-EA671099C4AE@fb.com> From: Filipe Manana Date: Wed, 11 Sep 2019 17:13:15 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH] Btrfs: fix unwritten extent buffers and hangs on future writeback attempts To: Chris Mason Cc: "linux-btrfs@vger.kernel.org" , Dennis Zhou Content-Type: text/plain; charset="UTF-8" Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On Wed, Sep 11, 2019 at 5:04 PM Chris Mason wrote: > > On 11 Sep 2019, at 15:55, fdmanana@kernel.org wrote: > > > From: Filipe Manana > > > > So fix this by not overwriting the return value (ret) with the result > > from flush_write_bio(). We also need to clear the > > EXTENT_BUFFER_WRITEBACK > > bit in case flush_write_bio() returns an error, otherwise it will hang > > any future attempts to writeback the extent buffer. > > > > This is a regression introduced in the 5.2 kernel. > > > > Fixes: 2e3c25136adfb ("btrfs: extent_io: add proper error handling to > > lock_extent_buffer_for_io()") > > Fixes: f4340622e0226 ("btrfs: extent_io: Move the BUG_ON() in > > flush_write_bio() one level up") > > Reported-by: Zdenek Sojka > > Link: > > https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#u > > Reported-by: Stefan Priebe - Profihost AG > > Link: > > https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#t > > Reported-by: Drazen Kacar > > Link: > > https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/ > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377 > > Signed-off-by: Filipe Manana > > --- > > fs/btrfs/extent_io.c | 23 ++++++++++++++--------- > > 1 file changed, 14 insertions(+), 9 deletions(-) > > > > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c > > index 1ff438fd5bc2..1311ba0fc031 100644 > > --- a/fs/btrfs/extent_io.c > > +++ b/fs/btrfs/extent_io.c > > @@ -3628,6 +3628,13 @@ void wait_on_extent_buffer_writeback(struct > > extent_buffer *eb) > > TASK_UNINTERRUPTIBLE); > > } > > > > +static void end_extent_buffer_writeback(struct extent_buffer *eb) > > +{ > > + clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags); > > + smp_mb__after_atomic(); > > + wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK); > > +} > > + > > /* > > * Lock eb pages and flush the bio if we can't the locks > > * > > @@ -3699,8 +3706,11 @@ static noinline_for_stack int > > lock_extent_buffer_for_io(struct extent_buffer *eb > > > > if (!trylock_page(p)) { > > if (!flush) { > > - ret = flush_write_bio(epd); > > - if (ret < 0) { > > + int err; > > + > > + err = flush_write_bio(epd); > > + if (err < 0) { > > + ret = err; > > failed_page_nr = i; > > goto err_unlock; > > } > > > Dennis (cc'd) has been trying a similar fix against this in production, > but sending it was interrupted by plumbing conferences. I think he > found that it needs to undo this as well: > > percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, > -eb->len, > fs_info->dirty_metadata_batch); > > With the IO errors, we should end up aborting the FS. This function > also flips the the extent buffer written and dirty flags, and his patch > resets them as well. Given that we're aborting anyway, it's not > critical, but it's probably a good idea to fix things up in the goto > err_unlock just to make future bugs less likely. Yes, I considered that at some point (undo everything done so far, under the locks, etc) and thought it was pointless as well because we abort. But we may not abort - if we never start the writeback for an eb because we returned error from flush_write_bio(), we can leave btree_write_cache_pages() without noticing the error. Since writeback never started, and btree_write_cache_pages() didn't return the error, the transaction commit path may never get an error from filemap_fdatawrite_range, and we can commit the transaction despite failure to start writeback for some extent buffer. A problem that existed before that regression in 5.2 anyway. Sending it as separate. I'll include the undo of all operations in patch however, it doesn't hurt for sure. Thanks. > > -chris