linux-btrfs.vger.kernel.org archive mirror
From: "Swâmi Petaramesh" <swami@petaramesh.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Anand Jain <anand.jain@oracle.com>
Cc: Lionel Bouton <lionel-subscription@bouton.name>,
	linux-btrfs@vger.kernel.org
Subject: Re: Massive filesystem corruption since kernel 5.2 (ARCH)
Date: Thu, 8 Aug 2019 11:55:21 +0200
Message-ID: <22973d72-5709-c705-1c8d-1b438df1cc49@petaramesh.org>
In-Reply-To: <69c47874-6608-2509-c059-659c4a1b6782@gmx.com>

Hi Qu,

On 8/8/19 10:46 AM, Qu Wenruo wrote:
> Follow up questions about the corruption.
>
> Is there enough free space (not only unallocated, but allocated bg) for
> metadata?
>
> As further digging into the case, it looks like btrfs is even harder to
> get corrupted for tree blocks.
>
> If we have enough metadata free space, we will try to allocate tree
> blocks at bytenr sequence, without reusing old bytenr until there is not
> enough space or hit the end of the block group.
>
> This means, even we have something wrong implementing barrier, we still
> won't write new data to old tree blocks (even several trans ago).


It's hard for me to say whether the 2 filesystems that got corrupted
lacked allocated metadata space at any time, and both filesystems
have since been reformatted, so I cannot tell.

What I can be 100% sure of is that I never got any “No space left on
device” (ENOSPC) error on either of them.

*BUT* the SSD on which the machine runs may have come close to full, as I
had copied a bunch of ISOs onto it shortly before upgrading packages - and
the kernel.

However, the upgrade seemingly went well and I didn't see any ENOSPC at
any time.


On the external HD that got corrupted as well, I'm pretty sure it
happened as follows:

- I started a full backup onto it in an emergency;

- I asked myself « Will I have enough space? » and checked with “df”;

- There were still several dozen GB free, but not enough for a full
system backup. I cannot tell whether that space had been allocated in
the past;

- Noticing that I would run out of HD space (but long before it actually
happened), I deleted a large number of snapshots from the HD;

- I thus assume that deleting the snapshots freed a good
amount of data AND metadata space.
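
As a rough sketch for anyone wanting to capture this information next
time: plain “df” does not separate allocated from used space per block
group type, but “btrfs filesystem df <mountpoint>” prints one line per
type with total= (allocated) and used= values. The helper below parses
that text and reports allocated-but-unused bytes per type. The function
names and the SAMPLE text are made up for illustration and are not
output from the affected machines; the line format is assumed to match
what btrfs-progs prints.

```python
import re

# Binary unit multipliers as printed by btrfs-progs.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Assumed line shape: "Metadata, DUP: total=2.00GiB, used=1.50GiB"
_LINE = re.compile(r"(\w+), (\w+): total=([^,\s]+), used=([^,\s]+)")


def parse_size(text):
    """Convert a size such as '1.50GiB' into a number of bytes."""
    m = re.fullmatch(r"([\d.]+)(KiB|MiB|GiB|TiB|B)", text)
    if m is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(float(m.group(1)) * _UNITS[m.group(2)])


def allocated_free(df_output):
    """Return {block group type: allocated-but-unused bytes}."""
    free = {}
    for line in df_output.splitlines():
        m = _LINE.fullmatch(line.strip())
        if m is None:
            continue  # skip anything that is not a "type, profile:" line
        kind, _profile, total, used = m.groups()
        # Sum over profiles in case a type appears more than once.
        free[kind] = free.get(kind, 0) + parse_size(total) - parse_size(used)
    return free


# Illustrative text only, NOT taken from the corrupted filesystems.
SAMPLE = """\
Data, single: total=100.00GiB, used=80.00GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=2.00GiB, used=1.50GiB
"""

if __name__ == "__main__":
    for kind, nbytes in allocated_free(SAMPLE).items():
        print(f"{kind}: {nbytes / 1024**2:.2f} MiB allocated but unused")
```

Saving this output before and after a large snapshot deletion would at
least show whether metadata block groups had free room at the time.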

So the situation of the external HD was that a full backup was in
progress and a vast number of snapshots had been deleted in the meantime.

After that, the FS got corrupted at some point.


For the internal SSD, it looks like the kernel upgrade went well and the
machine rebooted OK, then midnight came and with it probably the cron
task that performs “snapper” timeline snapshot deletion.


Then the machine was turned off and rebooted the next day, and by that
time the FS was corrupted.


So I strongly suspect the issue has something to do with snapshot
deletion, but I cannot say more than that.


It may be worth noting that the machine has been running a lot since I
reverted to kernel 5.1 and reformatted the filesystems, and that no
corruption has occurred since, even though I performed quite a lot of
backups on the external HD after it was reformatted.

Everything is in the exact same setup as before, except for the kernel.

So I would definitely exclude a hardware problem on the machine: it's
now running as well as it always did.

I plan to retry upgrading to Arch kernel 5.2 in the coming weeks after
having performed a full disk binary clone in case it happens again.

(However I've seen that Arch has released 3-4 kernel 5.2 package updates
since, so it won't be the exact same kernel by the time I test again).

I will be on vacation until August 20, so I cannot perform this test
before I'm back.

But I'll be glad to help if I can and thank you very much for your help
with this issue.

Best regards.

ॐ

-- 
Swâmi Petaramesh <swami@petaramesh.org> PGP 9076E32E
