From: "Hendrik Friedel"
To: "Chris Murphy"
Subject: Re[4]: Rough (re)start with btrfs
Cc: "Qu Wenruo", "Chris Murphy", "Btrfs BTRFS"
Date: Sat, 04 May 2019 09:31:10 +0000
X-Mailing-List: linux-btrfs@vger.kernel.org

Hello,

this:
>Some prefer bug report in mail list directly like me, some prefer kernel
>bugzilla.

and this:
>Not sure if other is looking into this.
>Btrfs bug tracking is somewhat tricky.

may be related...

>Not likely. You can do a scrub to check for metadata and data
>corruption.
Did that. All good.

>And you can do an offline (unmounted) 'btrfs check
>--readonly' to check the validity of the metadata.
Will do that.

> The Btrfs call
>traces during the blocked task are INFO, not warnings or errors, so
>the file system and data is likely fine. There's no read, write,
>corruption, or generation errors in the dmesg; but you can also check
>'btrfs dev stats ' which is a persistent counter for this
>particular device.

[/dev/sdh1].write_io_errs 0
[/dev/sdh1].read_io_errs 0
[/dev/sdh1].flush_io_errs 0
[/dev/sdh1].corruption_errs 0
[/dev/sdh1].generation_errs 0

>I should have read this before replying earlier.
>
>You can also do a one time clean mount with '-o
>clear_cache,space_cache=v2' which will remove the v1 (default) space
>cache, and create a v2 cache. Subsequent mount will see the flag for
>this feature and always use the v2 cache. It's a totally differently
>implementation and shouldn't have this problem.
So you already have a suspicion about what caused the problem? Why is
v2 not the default, then? Is it worth chasing the bug in v1?

For me, the question now is whether we should chase this bug or not.
I encountered it three times while filling an 8TB drive with 7TB. Now I
have 1TB left and I am not sure I can reproduce it, but I can try.

>Qu would know better but usually developers ask for sysrq+w when
>there's blocked tasks.
I am wondering whether there is, in the long term, a better way than
this. Ideally, btrfs would automatically create a
btrfs-bug-DD-MM-YY-hh-mm-ss.tar.gz with all the info you need, and tell
the user about it and where to report the bug.
I am aware that this is tricky. But in order to further mature btrfs, I
assume you need more real-life data of good quality (that is, the right
logs) without too much work (asking for logs).
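
The bundling idea above could look roughly like this. It is only an illustrative sketch and not an existing btrfs feature: the function names and report file names are hypothetical, and the caller is assumed to have collected the diagnostic text (dmesg, dev stats, mount options) already. Only the archive name pattern comes from the proposal itself.

```python
import io
import tarfile
from datetime import datetime


def bundle_name(now: datetime) -> str:
    # Follows the btrfs-bug-DD-MM-YY-hh-mm-ss.tar.gz pattern proposed above.
    return now.strftime("btrfs-bug-%d-%m-%y-%H-%M-%S.tar.gz")


def build_bundle(reports: dict[str, str]) -> bytes:
    """Pack named text reports into an in-memory gzip-compressed tar archive.

    `reports` maps a (hypothetical) file name such as "dev-stats.txt"
    to the text that was collected for it.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, text in sorted(reports.items()):
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # size must be set before adding the member
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    reports = {
        "mount-options.txt": "rw,noatime,nospace_cache,subvolid=5,subvol=/\n",
        "dev-stats.txt": "[/dev/sdh1].write_io_errs 0\n",
    }
    name = bundle_name(datetime.now())
    with open(name, "wb") as fh:
        fh.write(build_bundle(reports))
    print(f"wrote {name}")
```

Building the archive in memory keeps the sketch independent of where the reports come from; a real tool would also have to decide which logs are safe to include.
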
What's your view on this?

>You know what? Try changing the scheduler from mq-deadline to none.
>Change nothing else. Now try to reproduce. Let's see if it still
>happens.
Wouldn't it make sense to first try to reproduce without changing
anything?

>Also, what are the mount options?
rw,noatime,nospace_cache,subvolid=5,subvol=/
But I added noatime and nospace_cache only today.

Greetings,
Hendrik
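
For the scheduler experiment discussed above, it helps to record which I/O scheduler is active before and after the change. The following is a minimal sketch, assuming the usual sysfs layout where /sys/block/<disk>/queue/scheduler lists the available schedulers with the active one in square brackets. The disk name "sdh" is an assumption derived from /dev/sdh1 in the dev stats output, and the function names are hypothetical.

```python
from pathlib import Path


def active_scheduler(sysfs_text: str) -> str:
    """Return the scheduler marked with [brackets] in a queue/scheduler line."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Some devices expose a single unbracketed entry; this sketch does not
    # try to interpret that case.
    raise ValueError(f"no active scheduler marked in {sysfs_text!r}")


def read_active_scheduler(disk: str) -> str:
    # `disk` is a whole-disk name such as "sdh" (assumed, not a partition).
    path = Path(f"/sys/block/{disk}/queue/scheduler")
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    print(read_active_scheduler("sdh"))
```

Switching the scheduler itself is done by writing the new name (for example "none") to the same sysfs file as root; the sketch only reads the current state so the before and after values can go into the bug report.
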