From: Christian Wimmer <telefonchris@icloud.com>
To: Chris Murphy <lists@colorremedies.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>, Qu WenRuo <wqu@suse.com>,
	Anand Jain <anand.jain@oracle.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: 12 TB btrfs file system on virtual machine broke again
Date: Sun, 5 Jan 2020 22:31:55 -0300
Message-ID: <4F9562E6-7337-4842-855B-0AF52C4C7449@icloud.com>
In-Reply-To: <CAJCQCtQxN17UL7swO7vU6-ORVmHfQHteUQZ7iS1w7Y5XLHTpVA@mail.gmail.com>



> On 5. Jan 2020, at 19:28, Chris Murphy <lists@colorremedies.com> wrote:
> 
> On Sun, Jan 5, 2020 at 2:58 PM Christian Wimmer <telefonchris@icloud.com> wrote:
>>> On 5. Jan 2020, at 18:13, Chris Murphy <lists@colorremedies.com> wrote:
>>> 
>>> On Sun, Jan 5, 2020 at 1:36 PM Christian Wimmer <telefonchris@icloud.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On 5. Jan 2020, at 17:30, Chris Murphy <lists@colorremedies.com> wrote:
>>>>> 
>>>>> On Sun, Jan 5, 2020 at 12:48 PM Christian Wimmer
>>>>> <telefonchris@icloud.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> #fdisk -l
>>>>>> Disk /dev/sda: 256 GiB, 274877906944 bytes, 536870912 sectors
>>>>>> Disk model: Suse 15.1-0 SSD
>>>>>> Units: sectors of 1 * 512 = 512 bytes
>>>>>> Sector size (logical/physical): 512 bytes / 4096 bytes
>>>>>> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
>>>>>> Disklabel type: gpt
>>>>>> Disk identifier: 186C0CD6-F3B8-471C-B2AF-AE3D325EC215
>>>>>> 
>>>>>> Device         Start       End   Sectors  Size Type
>>>>>> /dev/sda1       2048     18431     16384    8M BIOS boot
>>>>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem
>>>>>> /dev/sda3  532674560 536870878   4196319    2G Linux swap
>>>>> 
>>>>> 
>>>>> 
>>>>>> btrfs insp dump-s /dev/sda2
>>>>>> 
>>>>>> 
>>>>>> Here I have only btrfs-progs version 4.19.1:
>>>>>> 
>>>>>> linux-ze6w:~ # btrfs version
>>>>>> btrfs-progs v4.19.1
>>>>>> linux-ze6w:~ # btrfs insp dump-s /dev/sda2
>>>>>> superblock: bytenr=65536, device=/dev/sda2
>>>>>> ---------------------------------------------------------
>>>>>> csum_type               0 (crc32c)
>>>>>> csum_size               4
>>>>>> csum                    0x6d9388e2 [match]
>>>>>> bytenr                  65536
>>>>>> flags                   0x1
>>>>>>                      ( WRITTEN )
>>>>>> magic                   _BHRfS_M [match]
>>>>>> fsid                    affdbdfa-7b54-4888-b6e9-951da79540a3
>>>>>> metadata_uuid           affdbdfa-7b54-4888-b6e9-951da79540a3
>>>>>> label
>>>>>> generation              799183
>>>>>> root                    724205568
>>>>>> sys_array_size          97
>>>>>> chunk_root_generation   797617
>>>>>> root_level              1
>>>>>> chunk_root              158835163136
>>>>>> chunk_root_level        0
>>>>>> log_root                0
>>>>>> log_root_transid        0
>>>>>> log_root_level          0
>>>>>> total_bytes             272719937536
>>>>>> bytes_used              106188886016
>>>>>> sectorsize              4096
>>>>>> nodesize                16384
>>>>>> leafsize (deprecated)           16384
>>>>>> stripesize              4096
>>>>>> root_dir                6
>>>>>> num_devices             1
>>>>>> compat_flags            0x0
>>>>>> compat_ro_flags         0x0
>>>>>> incompat_flags          0x163
>>>>>>                      ( MIXED_BACKREF |
>>>>>>                        DEFAULT_SUBVOL |
>>>>>>                        BIG_METADATA |
>>>>>>                        EXTENDED_IREF |
>>>>>>                        SKINNY_METADATA )
>>>>>> cache_generation        799183
>>>>>> uuid_tree_generation    557352
>>>>>> dev_item.uuid           8968cd08-0c45-4aff-ab64-65f979b21694
>>>>>> dev_item.fsid           affdbdfa-7b54-4888-b6e9-951da79540a3 [match]
>>>>>> dev_item.type           0
>>>>>> dev_item.total_bytes    272719937536
>>>>>> dev_item.bytes_used     129973092352
>>>>>> dev_item.io_align       4096
>>>>>> dev_item.io_width       4096
>>>>>> dev_item.sector_size    4096
>>>>>> dev_item.devid          1
>>>>>> dev_item.dev_group      0
>>>>>> dev_item.seek_speed     0
>>>>>> dev_item.bandwidth      0
>>>>>> dev_item.generation     0
>>>>> 
>>>>> Partition map says
>>>>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem
>>>>> 
>>>>> Btrfs super says
>>>>>> total_bytes             272719937536
>>>>> 
>>>>> 272719937536/512=532656128 sectors
>>>>> 
>>>>> The kernel FITRIM message has want=532656128
>>>>> 
>>>>> OK so the problem is the Btrfs super isn't set to the size of the
>>>>> partition. The usual way this happens is user error: partition is
>>>>> resized (shrink) without resizing the file system first. This file
>>>>> system is still at risk of having problems even if you disable
>>>>> fstrim.timer. You need to shrink the file system so that it is the
>>>>> same size as the partition.
>>>>> 
>>>> 
>>>> Could this be caused by the Parallels virtual machine, which may sometimes try to claim more space on the host file system?
>>>> One solution would be to use a fixed-size disc file instead of a growing one.
>>> 
>>> I don't see how it's related. Parallels has no ability I'm aware of to
>>> change the GPT partition map or the Btrfs super block - as in, rewrite
>>> it out with a modification correctly including all checksums being
>>> valid. This /dev/sda has somehow been mangled on purpose.
>>> 
>>> Again, from the GPT
>>>>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem
>>>>>> /dev/sda3  532674560 536870878   4196319    2G Linux swap
>>> 
>>> The end LBA for sda2 is 419448831, but the start LBA for sda3 is
>>> 532674560. There's a ~54G gap in there as if something was removed.
>>> I'm not sure why a software installer would produce this kind of
>>> layout on purpose, because it has no purpose.
>>> 
>>> 
>> 
>> OK, understood. Very strange. Maybe we should set this particular problem aside.
>> Should I repair it somehow? And if yes, how?
> 
> 
>>>>>> /dev/sda2      18432 419448831 419430400  200G Linux filesystem
> 
> delete this partition, recreate a new one with the same start LBA,
> 18432 and a new end LBA that matches the actual fs size:
> 
> 18432+(272719937536/512)=532674560
> 
> write it and reboot the VM. You could instead resize Btrfs to match
> the partition but that might piss off the kernel if Btrfs thinks it
> needs to move block groups from a location outside the partition. So I
> would just resize the partition. And then you need to do a scrub and a
> btrfs check on this volume to see if it's damaged.
> 
> I don't know but I suspect it could be possible that this malformed
> root might have resulted in a significant instability of the system at
> some point, and in its last states of confusion as it face-planted,
> wrote out very spurious data causing your broken Btrfs file system. I
> can't prove that.
> 
> 

Ok, I will try.
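
The sector arithmetic quoted above can be checked with a minimal Python sketch. All numbers are copied from the fdisk and dump-super output in this mail; the function names are hypothetical and only for illustration. It assumes fdisk's End column is inclusive, which the sda2 row supports (419448831 - 18432 + 1 = 419430400, the Sectors value). Under that assumption 18432 + 532656128 = 532674560 is the first sector after the resized partition, which is also where sda3 starts, so the inclusive end LBA to enter would be one less.

```python
SECTOR = 512  # logical sector size reported by fdisk

# Values copied from the fdisk and dump-super output quoted above.
SDA2_START = 18432
SDA2_END = 419448831            # inclusive end LBA of sda2
SDA3_START = 532674560          # first LBA of sda3
FS_TOTAL_BYTES = 272719937536   # total_bytes from the Btrfs superblock


def partition_sectors(start, end):
    """Sector count of a partition given inclusive start and end LBAs."""
    return end - start + 1


def required_end_lba(start, fs_bytes, sector=SECTOR):
    """Inclusive end LBA needed for a partition to hold fs_bytes."""
    sectors, remainder = divmod(fs_bytes, sector)
    if remainder:
        sectors += 1  # round up to a whole sector
    return start + sectors - 1


fs_sectors = FS_TOTAL_BYTES // SECTOR
part_sectors = partition_sectors(SDA2_START, SDA2_END)
shortfall_bytes = (fs_sectors - part_sectors) * SECTOR
new_end = required_end_lba(SDA2_START, FS_TOTAL_BYTES)

print("file system size in sectors:", fs_sectors)
print("current sda2 size in sectors:", part_sectors)
print("file system exceeds partition by GiB:", round(shortfall_bytes / 2**30, 2))
print("inclusive end LBA needed for sda2:", new_end)
print("fits before sda3:", new_end < SDA3_START)
```

The shortfall it prints is the same gap of roughly 54G between sda2 and sda3 that was pointed out above.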

> 
> 
> 
> 
>> 
>>> 
>>> 
>>> 
>>>> 
>>>>> 
>>>>> 
>>>>>> linux-ze6w:~ # systemctl status fstrim.timer
>>>>>> ● fstrim.timer - Discard unused blocks once a week
>>>>>> Loaded: loaded (/usr/lib/systemd/system/fstrim.timer; enabled; vendor preset: enabled)
>>>>>> Active: active (waiting) since Sun 2020-01-05 15:24:59 -03; 1h 19min ago
>>>>>> Trigger: Mon 2020-01-06 00:00:00 -03; 7h left
>>>>>>   Docs: man:fstrim
>>>>>> 
>>>>>> Jan 05 15:24:59 linux-ze6w systemd[1]: Started Discard unused blocks once a week.
>>>>>> 
>>>>>> linux-ze6w:~ # systemctl status fstrim.service
>>>>>> ● fstrim.service - Discard unused blocks on filesystems from /etc/fstab
>>>>>> Loaded: loaded (/usr/lib/systemd/system/fstrim.service; static; vendor preset: disabled)
>>>>>> Active: inactive (dead)
>>>>>>   Docs: man:fstrim(8)
>>>>>> linux-ze6w:~ #
>>>>> 
>>>>> OK so it's not set to run. Why do you have FITRIM being called?
>>>> 
>>>> No idea.
>>> 
>>> Well you're going to have to find it. I can't do that for you.
>> 
>> 
>> Ok, I will have a look. Can I simply deactivate the service?
> 
> fstrim.service is a one shot. The usual method of it being activated
> once per week is via fstrim.timer - but your status check of
> fstrim.timer says it's disabled. So something else is running fstrim.
> I have no idea what it is, you have to find it in order to deactivate
> it.


Ok, got it.
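
One way to start looking for whatever calls fstrim is to search the usual places for scheduled jobs. The following is only a rough Python sketch: the directory list is an assumption and may differ on this system, the function name is hypothetical, and it only finds text files that mention fstrim, not programs that issue the FITRIM ioctl themselves.

```python
import os


def find_references(roots, needle="fstrim"):
    """Return sorted paths of files under roots whose text mentions needle."""
    hits = []
    for root in roots:
        # os.walk silently yields nothing for directories that do not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    with open(path, "r", errors="ignore") as fh:
                        if needle in fh.read():
                            hits.append(path)
                except OSError:
                    continue  # unreadable or vanished file, skip it
    return sorted(hits)


# Assumed locations for cron jobs and systemd units; adjust for the distribution.
CANDIDATE_DIRS = [
    "/etc/cron.d",
    "/etc/cron.daily",
    "/etc/cron.weekly",
    "/etc/systemd/system",
    "/usr/lib/systemd/system",
]

if __name__ == "__main__":
    for path in find_references(CANDIDATE_DIRS):
        print(path)
```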

> 
> This 12T file system is a single "device" backed by a 12T file on the
> Promise drive? And it's a Parallel's formatted VM file? I guess I
> would have used raw instead of a Parallels format. That way you can
> inspect things from outside the VM. But that's perhaps a minor point.

I would like to do so! I will look into how to do that.

I set things up this way because of the speed.
I have a 2TB Samsung_X5 holding a 1.8TB disc file, and writing a 10GB file to it takes only 4.4 seconds:

bash$ dd if=/dev/zero of=erasemenow6 bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 4.43557 s, 2.4 GB/s
bash$ 

Isn’t that fantastic?

I get slightly worse results with the Pegasus Promise3:

bash$ dd if=/dev/zero of=erasemenow6 bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 5.49031 s, 1.9 GB/s
bash$ 

I have not reached these speeds with any other combination of file systems.
On top of that I get the fantastic btrfs file system, which lets me copy 10GB files in just a fraction of a second.
I am just a little afraid now because of the two mega crashes that damaged all my data.
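
A caveat on the dd figures above: dd as invoked there does not ask for the data to be flushed before it reports a rate, so part of the number may reflect caching rather than the device. Whether that matters in this setup is an assumption, not something measured here. A minimal Python sketch of a write test that includes the flush in the timing could look like this; the function name and the output path are hypothetical, and a flush inside the VM does not by itself prove the data has reached the host disk.

```python
import os
import time


def timed_write(path, total_mib, block_mib=1):
    """Write total_mib MiB of zeros to path, fsync, and return bytes per second."""
    block = bytes(block_mib * 1024 * 1024)  # one block of zero bytes
    blocks = total_mib // block_mib
    start = time.perf_counter()
    with open(path, "wb") as fh:
        for _ in range(blocks):
            fh.write(block)
        fh.flush()             # push Python's buffer to the kernel
        os.fsync(fh.fileno())  # ask the kernel to flush to the device
    elapsed = time.perf_counter() - start
    return blocks * len(block) / elapsed


if __name__ == "__main__":
    # Hypothetical test file and size; pick a size well above available RAM cache.
    rate = timed_write("erasemenow_fsync", 1000)
    print("bytes per second including fsync:", round(rate))
```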

The Parallels virtual machine cannot access the Samsung_X5 or the Pegasus directly in order to partition them, so I have to format them in Mac OS as “Mac OS Extended (Journaled)”. The Parallels-formatted VM files are then created on that file system.
BTW, these files are created as “Expanding disc”, so they occupy only a few MB at first and grow over time. Could this be a problem?

I really like your idea of creating the file in raw format so that things can be inspected from outside the VM.
How can I do this? Do you have any idea?

Thanks a lot guys!

Chris




