linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: "Michael Laß" <bevan@bi-co.net>,
	"Chris Murphy" <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Massive filesystem corruption after balance + fstrim on Linux 5.1.2
Date: Sat, 18 May 2019 18:26:58 +0800
Message-ID: <158a3491-e4d2-d905-7f58-11a15bddcd70@gmx.com>
In-Reply-To: <AD966642-1043-468D-BABF-8FC9AF514D36@bi-co.net>


On 2019/5/18 5:18 PM, Michael Laß wrote:
> 
>> On 18.05.2019 at 06:09, Chris Murphy <lists@colorremedies.com> wrote:
>>
>> On Fri, May 17, 2019 at 11:37 AM Michael Laß <bevan@bi-co.net> wrote:
>>>
>>>
>>> I tried to reproduce this issue: I recreated the btrfs file system, set up a minimal system and issued fstrim again. It printed the following error message:
>>>
>>> fstrim: /: FITRIM ioctl failed: Input/output error
>>
>> Huh. Any kernel message at the same time? I would expect any fstrim
>> user space error message to also have a kernel message. Any i/o error
>> suggests some kind of storage stack failure - which could be hardware
>> or software, you can't know without seeing the kernel messages.
> 
> I missed that. The kernel messages are:
> 
> attempt to access beyond end of device
> sda1: rw=16387, want=252755893, limit=250067632
> BTRFS warning (device dm-5): failed to trim 1 device(s), last error -5
> 
> Here is some more information on the partitions and LVM physical segments:
> 
> fdisk -l /dev/sda:
> 
> Device     Boot Start       End   Sectors   Size Id Type
> /dev/sda1  *     2048 250069679 250067632 119.2G 8e Linux LVM
> 
> pvdisplay -m:
> 
>   --- Physical volume ---
>   PV Name               /dev/sda1
>   VG Name               vg_system
>   PV Size               119.24 GiB / not usable <22.34 MiB
>   Allocatable           yes (but full)
>   PE Size               32.00 MiB
>   Total PE              3815
>   Free PE               0
>   Allocated PE          3815
>   PV UUID               mqCLFy-iDnt-NfdC-lfSv-Maor-V1Ih-RlG8lP
>    
>   --- Physical Segments ---
>   Physical extent 0 to 1248:
>     Logical volume	/dev/vg_system/btrfs
>     Logical extents	2231 to 3479
>   Physical extent 1249 to 1728:
>     Logical volume	/dev/vg_system/btrfs
>     Logical extents	640 to 1119
>   Physical extent 1729 to 1760:
>     Logical volume	/dev/vg_system/grml-images
>     Logical extents	0 to 31
>   Physical extent 1761 to 2016:
>     Logical volume	/dev/vg_system/swap
>     Logical extents	0 to 255
>   Physical extent 2017 to 2047:
>     Logical volume	/dev/vg_system/btrfs
>     Logical extents	3480 to 3510
>   Physical extent 2048 to 2687:
>     Logical volume	/dev/vg_system/btrfs
>     Logical extents	0 to 639
>   Physical extent 2688 to 3007:
>     Logical volume	/dev/vg_system/btrfs
>     Logical extents	1911 to 2230
>   Physical extent 3008 to 3320:
>     Logical volume	/dev/vg_system/btrfs
>     Logical extents	1120 to 1432
>   Physical extent 3321 to 3336:
>     Logical volume	/dev/vg_system/boot
>     Logical extents	0 to 15
>   Physical extent 3337 to 3814:
>     Logical volume	/dev/vg_system/btrfs
>     Logical extents	1433 to 1910
>    
> 
> Would btrfs even be able to accidentally trim parts of other LVs, or does this clearly point to an LVM/dm issue?

I can't say for sure, but (at least on recent kernels) btrfs performs a
lot of extra mount-time self-checks, including checking chunk stripes
against the underlying device, so btrfs itself is unlikely to be the cause.
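
For reference, the numbers in the quoted kernel message and pvdisplay
output can be cross-checked with a few lines of arithmetic. This is only
a sketch: it assumes that "want" is the end sector of the failed request
and "limit" the size of sda1, both in 512-byte sectors. The variable
names are illustrative and not taken from any tool.

```python
SECTOR_BYTES = 512  # assumption: the kernel message counts 512-byte sectors

limit = 250067632   # "limit" from the kernel message; equals the Sectors column for sda1
want = 252755893    # "want" from the kernel message (assumed: end sector of the request)

# How far beyond the end of sda1 the request reached.
overshoot = want - limit

# LVM geometry from the pvdisplay output: 3815 extents of 32 MiB each.
pe_sectors = 32 * 1024 * 1024 // SECTOR_BYTES
allocated_sectors = 3815 * pe_sectors


def sectors_to_mib(sectors: int) -> float:
    """Convert a count of 512-byte sectors to MiB."""
    return sectors * SECTOR_BYTES / 2**20


print(f"overshoot: {overshoot} sectors ({sectors_to_mib(overshoot):.2f} MiB)")
print(f"overshoot in physical extents: {overshoot / pe_sectors:.2f}")
print(f"sectors covered by the 3815 PEs: {allocated_sectors}")
print(f"sda1 sectors outside the PE area: {limit - allocated_sectors}")
```

Under these assumptions the request ends well past the partition, by far
more than the small unallocated remainder of the PV, so it cannot be
explained by a rounding difference at the end of the device.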

> Is there an easy way to somehow trace the trim through the different layers so one can see where it goes wrong?

Sure, you could use dm-log-writes.
It records all writes (including trim/discard) for later replay.

So in your case, you can build the storage stack like this:

Btrfs
<dm-log-writes>
LUKS/dmcrypt
LVM
MBR partition
Samsung SSD

Then replay the log (using src/log-writes/replay-log in fstests) with
verbose output, so that you can verify every trim operation against the
dmcrypt device size.

If all trims are fine, move dm-log-writes one layer lower and repeat,
until you find the layer that is causing the problem.
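
The per-layer check described above could be sketched roughly as follows.
This is a minimal illustration, not part of fstests: the function name
out_of_bounds is hypothetical, and it assumes the discard ranges have
already been extracted from the replay output as (start_sector,
nr_sectors) pairs in 512-byte sectors. The output format of replay-log
itself is deliberately not assumed here. The device size can be obtained
with `blockdev --getsz` on the layer being tested.

```python
from typing import Iterable, List, Tuple

# (start_sector, nr_sectors), both in 512-byte sectors
Range = Tuple[int, int]


def out_of_bounds(ranges: Iterable[Range], dev_sectors: int) -> List[Range]:
    """Return every range that is malformed or ends beyond the device."""
    bad = []
    for start, length in ranges:
        # A range is valid only if it lies fully inside [0, dev_sectors).
        if start < 0 or length <= 0 or start + length > dev_sectors:
            bad.append((start, length))
    return bad


if __name__ == "__main__":
    # Hypothetical sample entries; the size is the "limit" value that the
    # kernel reported for sda1, used here purely as an example.
    dev_sectors = 250067632
    sample = [(0, 2048), (250067000, 632), (252755000, 893)]
    for start, length in out_of_bounds(sample, dev_sectors):
        print(f"discard {start}+{length} ends at sector {start + length}, "
              f"beyond device size {dev_sectors}")
```

Running the same check with the size of each lower layer in turn shows
the first layer at which a discard leaves the device.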

Thanks,
Qu
> 
> Cheers,
> Michael
> 
> PS: Current state of bisection: It looks like the error was introduced somewhere between b5dd0c658c31b469ccff1b637e5124851e7a4a1c and v5.1.
> 



