From: Christian Wimmer <telefonchris@icloud.com>
To: Chris Murphy <lists@colorremedies.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>, Qu WenRuo <wqu@suse.com>,
Anand Jain <anand.jain@oracle.com>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: 12 TB btrfs file system on virtual machine broke again
Date: Sun, 5 Jan 2020 22:31:55 -0300
Message-ID: <4F9562E6-7337-4842-855B-0AF52C4C7449@icloud.com>
In-Reply-To: <CAJCQCtQxN17UL7swO7vU6-ORVmHfQHteUQZ7iS1w7Y5XLHTpVA@mail.gmail.com>
> On 5. Jan 2020, at 19:28, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sun, Jan 5, 2020 at 2:58 PM Christian Wimmer <telefonchris@icloud.com> wrote:
>>> On 5. Jan 2020, at 18:13, Chris Murphy <lists@colorremedies.com> wrote:
>>>
>>> On Sun, Jan 5, 2020 at 1:36 PM Christian Wimmer <telefonchris@icloud.com> wrote:
>>>>
>>>>
>>>>
>>>>> On 5. Jan 2020, at 17:30, Chris Murphy <lists@colorremedies.com> wrote:
>>>>>
>>>>> On Sun, Jan 5, 2020 at 12:48 PM Christian Wimmer
>>>>> <telefonchris@icloud.com> wrote:
>>>>>>
>>>>>>
>>>>>> #fdisk -l
>>>>>> Disk /dev/sda: 256 GiB, 274877906944 bytes, 536870912 sectors
>>>>>> Disk model: Suse 15.1-0 SSD
>>>>>> Units: sectors of 1 * 512 = 512 bytes
>>>>>> Sector size (logical/physical): 512 bytes / 4096 bytes
>>>>>> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
>>>>>> Disklabel type: gpt
>>>>>> Disk identifier: 186C0CD6-F3B8-471C-B2AF-AE3D325EC215
>>>>>>
>>>>>> Device Start End Sectors Size Type
>>>>>> /dev/sda1 2048 18431 16384 8M BIOS boot
>>>>>> /dev/sda2 18432 419448831 419430400 200G Linux filesystem
>>>>>> /dev/sda3 532674560 536870878 4196319 2G Linux swap
>>>>>
>>>>>
>>>>>
>>>>>> btrfs insp dump-s /dev/sda2
>>>>>>
>>>>>>
>>>>>> Here I have only btrfs-progs version 4.19.1:
>>>>>>
>>>>>> linux-ze6w:~ # btrfs version
>>>>>> btrfs-progs v4.19.1
>>>>>> linux-ze6w:~ # btrfs insp dump-s /dev/sda2
>>>>>> superblock: bytenr=65536, device=/dev/sda2
>>>>>> ---------------------------------------------------------
>>>>>> csum_type 0 (crc32c)
>>>>>> csum_size 4
>>>>>> csum 0x6d9388e2 [match]
>>>>>> bytenr 65536
>>>>>> flags 0x1
>>>>>> ( WRITTEN )
>>>>>> magic _BHRfS_M [match]
>>>>>> fsid affdbdfa-7b54-4888-b6e9-951da79540a3
>>>>>> metadata_uuid affdbdfa-7b54-4888-b6e9-951da79540a3
>>>>>> label
>>>>>> generation 799183
>>>>>> root 724205568
>>>>>> sys_array_size 97
>>>>>> chunk_root_generation 797617
>>>>>> root_level 1
>>>>>> chunk_root 158835163136
>>>>>> chunk_root_level 0
>>>>>> log_root 0
>>>>>> log_root_transid 0
>>>>>> log_root_level 0
>>>>>> total_bytes 272719937536
>>>>>> bytes_used 106188886016
>>>>>> sectorsize 4096
>>>>>> nodesize 16384
>>>>>> leafsize (deprecated) 16384
>>>>>> stripesize 4096
>>>>>> root_dir 6
>>>>>> num_devices 1
>>>>>> compat_flags 0x0
>>>>>> compat_ro_flags 0x0
>>>>>> incompat_flags 0x163
>>>>>> ( MIXED_BACKREF |
>>>>>> DEFAULT_SUBVOL |
>>>>>> BIG_METADATA |
>>>>>> EXTENDED_IREF |
>>>>>> SKINNY_METADATA )
>>>>>> cache_generation 799183
>>>>>> uuid_tree_generation 557352
>>>>>> dev_item.uuid 8968cd08-0c45-4aff-ab64-65f979b21694
>>>>>> dev_item.fsid affdbdfa-7b54-4888-b6e9-951da79540a3 [match]
>>>>>> dev_item.type 0
>>>>>> dev_item.total_bytes 272719937536
>>>>>> dev_item.bytes_used 129973092352
>>>>>> dev_item.io_align 4096
>>>>>> dev_item.io_width 4096
>>>>>> dev_item.sector_size 4096
>>>>>> dev_item.devid 1
>>>>>> dev_item.dev_group 0
>>>>>> dev_item.seek_speed 0
>>>>>> dev_item.bandwidth 0
>>>>>> dev_item.generation 0
>>>>>
>>>>> Partition map says
>>>>>> /dev/sda2 18432 419448831 419430400 200G Linux filesystem
>>>>>
>>>>> Btrfs super says
>>>>>> total_bytes 272719937536
>>>>>
>>>>> 272719937536/512=532656128
>>>>>
>>>>> Kernel FITRIM says want=532656128
>>>>>
>>>>> OK so the problem is the Btrfs super isn't set to the size of the
>>>>> partition. The usual way this happens is user error: partition is
>>>>> resized (shrink) without resizing the file system first. This file
>>>>> system is still at risk of having problems even if you disable
>>>>> fstrim.timer. You need to shrink the file system so it is the same
>>>>> size as the partition.
>>>>>
>>>>
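The size comparison quoted above can be double-checked with a few lines of Python. This is only a sketch: every number is copied from the fdisk and dump-super output in this mail, and the constant names are my own. It divides total_bytes by the 512-byte logical sector size and compares the result with the sector count of /dev/sda2.

```python
# All values are copied from the fdisk and "btrfs insp dump-s" output quoted
# in this mail; the constant names are illustrative only.
SECTOR_SIZE = 512              # logical sector size reported by fdisk
PART_START_LBA = 18432         # /dev/sda2 Start
PART_END_LBA = 419448831       # /dev/sda2 End (inclusive)
FS_TOTAL_BYTES = 272719937536  # total_bytes from the Btrfs superblock

# The End column is inclusive, so the sector count is end - start + 1.
part_sectors = PART_END_LBA - PART_START_LBA + 1
fs_sectors = FS_TOTAL_BYTES // SECTOR_SIZE
overhang_sectors = fs_sectors - part_sectors
overhang_gib = overhang_sectors * SECTOR_SIZE / 2**30

print("partition sectors :", part_sectors)      # 419430400
print("filesystem sectors:", fs_sectors)        # 532656128
print("overhang sectors  :", overhang_sectors)  # 113225728
print("overhang GiB      :", overhang_gib)      # 53.990234375
```

So the superblock claims about 54 GiB more than the partition provides, which is the same figure as the gap between sda2 and sda3.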
>>>> Could this be caused by the Parallels virtual machine, which may sometimes try to get more space on the host file system?
>>>> One solution would be to use a fixed-size disc file instead of a growing one.
>>>
>>> I don't see how it's related. Parallels has no ability I'm aware of to
>>> change the GPT partition map or the Btrfs super block - as in, rewrite
>>> it out with a modification correctly including all checksums being
>>> valid. This /dev/sda has somehow been mangled on purpose.
>>>
>>> Again, from the GPT
>>>>>> /dev/sda2 18432 419448831 419430400 200G Linux filesystem
>>>>>> /dev/sda3 532674560 536870878 4196319 2G Linux swap
>>>
>>> The end LBA for sda2 is 419448831, but the start LBA for sda3 is
>>> 532674560. There's a ~54G gap in there as if something was removed.
>>> I'm not sure why a software installer would produce this kind of
>>> layout on purpose, because it has no purpose.
>>>
>>>
>>
>> OK, understood. Very strange. Maybe we should forget about this particular problem.
>> Should I repair it somehow? And if so, how?
>
>
>>>>>> /dev/sda2 18432 419448831 419430400 200G Linux filesystem
>
> delete this partition, recreate a new one with the same start LBA,
> 18432 and a new end LBA that matches the actual fs size:
>
> 18432+(272719937536/512)=532674560
>
> write it and reboot the VM. You could instead resize Btrfs to match
> the partition but that might piss off the kernel if Btrfs thinks it
> needs to move block groups from a location outside the partition. So I
> would just resize the partition. And then you need to do a scrub and a
> btrfs check on this volume to see if it's damaged.
>
> I don't know but I suspect it could be possible that this malformed
> root might have resulted in a significant instability of the system at
> some point, and in its last states of confusion as it face planted,
> wrote out very spurious data causing your broken Btrfs file system. I
> can't prove that.
>
>
Ok, I will try.
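Before touching the partition table, I sketched the numbers in Python to be sure of them. This is a rough check under one assumption: that the partitioning tool wants an inclusive end LBA, which is how the End column in the fdisk listing above behaves (End - Start + 1 = Sectors). All values come from the output quoted in this mail; the names are my own.

```python
# Values copied from the fdisk and dump-super output quoted in this mail.
SECTOR_SIZE = 512
SDA2_START_LBA = 18432
SDA3_START_LBA = 532674560
FS_TOTAL_BYTES = 272719937536  # total_bytes from the Btrfs superblock

# total_bytes should be a whole number of sectors.
assert FS_TOTAL_BYTES % SECTOR_SIZE == 0
fs_sectors = FS_TOTAL_BYTES // SECTOR_SIZE

# Start + length is the first sector AFTER the file system; this is the
# 532674560 figure from the quoted calculation.
first_lba_after_fs = SDA2_START_LBA + fs_sectors

# Assumption: the tool expects an inclusive end LBA, so subtract one.
new_end_lba = first_lba_after_fs - 1

print("first LBA after fs:", first_lba_after_fs)            # 532674560
print("inclusive end LBA :", new_end_lba)                   # 532674559
print("clear of sda3     :", new_end_lba < SDA3_START_LBA)  # True
```

If that assumption holds, the end LBA to enter would be 532674559, because 532674560 is already the first sector of sda3. The file system then ends exactly where the swap partition begins.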
>
>
>
>
>>
>>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>> linux-ze6w:~ # systemctl status fstrim.timer
>>>>>> ● fstrim.timer - Discard unused blocks once a week
>>>>>> Loaded: loaded (/usr/lib/systemd/system/fstrim.timer; enabled; vendor preset: enabled)
>>>>>> Active: active (waiting) since Sun 2020-01-05 15:24:59 -03; 1h 19min ago
>>>>>> Trigger: Mon 2020-01-06 00:00:00 -03; 7h left
>>>>>> Docs: man:fstrim
>>>>>>
>>>>>> Jan 05 15:24:59 linux-ze6w systemd[1]: Started Discard unused blocks once a week.
>>>>>>
>>>>>> linux-ze6w:~ # systemctl status fstrim.service
>>>>>> ● fstrim.service - Discard unused blocks on filesystems from /etc/fstab
>>>>>> Loaded: loaded (/usr/lib/systemd/system/fstrim.service; static; vendor preset: disabled)
>>>>>> Active: inactive (dead)
>>>>>> Docs: man:fstrim(8)
>>>>>> linux-ze6w:~ #
>>>>>
>>>>> OK so it's not set to run. Why do you have FITRIM being called?
>>>>
>>>> No idea.
>>>
>>> Well you're going to have to find it. I can't do that for you.
>>
>>
>> Ok, I will have a look. Can I simply deactivate the service?
>
> fstrim.service is a one shot. The usual method of it being activated
> once per week is via fstrim.timer - but your status check of
> fstrim.timer says it's disabled. So something else is running fstrim.
> I have no idea what it is, you have to find it in order to deactivate
> it.
Ok, got it.
>
> This 12T file system is a single "device" backed by a 12T file on the
> Promise drive? And it's a Parallels-formatted VM file? I guess I
> would have used raw instead of a Parallels format. That way you can
> inspect things from outside the VM. But that's perhaps a minor point.
I would like to do so! I will look into how to do it.
I am using the current setup because of its speed.
I have a 2TB Samsung_X5 holding a 1.8TB disc file, and writing a 10GB file takes only 4.4 seconds:
bash$ dd if=/dev/zero of=erasemenow6 bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 4.43557 s, 2.4 GB/s
bash$
Isn’t that fantastic?
I get slightly worse results with the Pegasus Promise3:
bash$ dd if=/dev/zero of=erasemenow6 bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 5.49031 s, 1.9 GB/s
bash$
I did not achieve these speeds with any other combination of file systems.
And I get all this together with the fantastic btrfs file system, which allows me to copy files of 10GB size in just a fraction of a second.
It is just that I am a little afraid now because of the two mega crashes that damaged all my data.
The Parallels virtual machine cannot access the Samsung_X5 or the Pegasus directly to partition them, so I have to format them in Mac OS as ‘Mac OS Extended (Journaled)’. The Parallels-formatted VM files are then created on them.
BTW, these files are created as “Expanding disc”, so they occupy only a few MB at the beginning and grow over time. Could this be a problem?
Actually, I really liked your idea of creating the file in raw format and thus being able to inspect things from outside the VM.
How can I do this? Do you have any idea?
Thanks a lot, guys!
Chris