All of lore.kernel.org
* interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
       [not found] <aa81a49a-d5ca-0f1c-fa75-9ed3656cff55@avgustinov.eu>
@ 2019-03-31 18:44 ` btrfs
  2019-04-02  0:24   ` Qu Wenruo
  2019-04-04  2:48   ` Jeff Mahoney
  0 siblings, 2 replies; 51+ messages in thread
From: btrfs @ 2019-03-31 18:44 UTC (permalink / raw)
  To: linux-btrfs

Dear all,


I am a big fan of btrfs, and I have been using it since 2013, meanwhile 
on at least four different computers. During this time I have suffered at 
least four bad btrfs failures leading to an unmountable, unreadable and 
unrecoverable file system. Since in three of those cases I did not manage 
to recover even a single file, I am beginning to lose my confidence in 
btrfs: in 35 years of working with different computers, no other file 
system has been so bad at recovering files!

Considering the importance of btrfs, and keeping in mind the number of 
similar failures described in countless forums on the net, I have an 
idea: to donate my last two damaged file systems for investigation, and 
thus hopefully contribute to the improvement of btrfs. One condition: any 
recovered personal data (mostly pictures and audio files) must remain 
undisclosed and be deleted afterwards.

Should anybody be interested, feel free to contact me personally (I do 
not read the list regularly!); otherwise I am going to reformat and 
reuse both systems two weeks from today.

Some more info:

   - The smaller file system is 83.6 GB. I could either send you an 
image of it on a spare hard drive, or put it into a dedicated computer 
and give you root rights and ssh access to it (the network link is 
100 Mbit/s down, 50 Mbit/s up, so that should be acceptable).
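A rough estimate of why the uplink should be acceptable, assuming the decimal 83.6 GB figure and the full 50 Mbit/s upload rate:

```shell
# Back-of-the-envelope transfer time for pulling the image over the uplink.
size_gb=84                          # ~83.6 GB, rounded up
bits=$(( size_gb * 8 * 1000 ))      # gigabytes -> megabits (decimal units)
secs=$(( bits / 50 ))               # at 50 Mbit/s sustained
echo "~$(( secs / 3600 )) hours"    # prints "~3 hours"
```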

   - The used space on the other file system is about 3 TB (4 TB 
capacity), distributed among 5 drives, so I can only offer remote access 
to it, and I will need time to organize that.

If you need additional information, please ask, but keep in mind that I 
have almost no "free time", so an answer could take a day or two.

Kind regards,

Nik.

--




* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-03-31 18:44 ` interest in post-mortem examination of a BTRFS system and improving the btrfs-code? btrfs
@ 2019-04-02  0:24   ` Qu Wenruo
  2019-04-02 13:06     ` Nik.
  2019-04-04  2:48   ` Jeff Mahoney
  1 sibling, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-02  0:24 UTC (permalink / raw)
  To: btrfs, linux-btrfs





On 2019/4/1 上午2:44, btrfs@avgustinov.eu wrote:
[snip]
>   - The smaller system is 83.6GB, I could either send you an image of
> this system on an unneeded hard drive or put it into a dedicated
> computer and give you root rights and ssh-access to it (the network link
> is 100Mb down, 50Mb up, so it should be acceptable).

I'm a little more interested in this case, as it's easier to debug.

However, there is one requirement before debugging:

*NO* btrfs check --repair/--init-* runs at all.
btrfs check --repair is known to cause transid errors.


And I'm afraid that even with some debugging, the result would be pretty
predictable:

It will be a transid error 90% of the time.
If it's a tree block from the future, then it's something barrier related.
If it's a tree block from the past, then some tree block didn't reach
the disk.
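The triage rule above can be sketched as a tiny decision function (a hypothetical illustration of the heuristic, not actual btrfs code): compare the transid the parent pointer expects with the one found in the block.

```shell
# Hypothetical sketch of the transid triage heuristic; not btrfs code.
classify_transid() {
  expected=$1; found=$2
  if [ "$found" -gt "$expected" ]; then
    echo "barrier-related"    # tree block "from the future"
  elif [ "$found" -lt "$expected" ]; then
    echo "lost write"         # tree block "from the past" never hit disk
  else
    echo "ok"
  fi
}
classify_transid 100 102      # prints "barrier-related"
```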

We have been chasing this spectre for a long time, and have had several
assumptions, but never pinned it down.


But anyway, more info is always better.

I'd like to get ssh access to this smaller image.

Thanks,
Qu





* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02  0:24   ` Qu Wenruo
@ 2019-04-02 13:06     ` Nik.
  2019-04-02 13:24       ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-02 13:06 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs


2019-04-02 02:24, Qu Wenruo:
> 
> On 2019/4/1 上午2:44, btrfs@avgustinov.eu wrote:
[snip]
> 
> I'm a little more interested in this case, as it's easier to debug.
> 
> However there is one requirement before debugging.
> 
> *NO* btrfs check --repair/--init-* run at all.
> btrfs check --repair is known to cause transid error.

Unfortunately, this file system was used as a testbed, and even
"btrfs check --repair --check-data-csum --init-csum-tree
--init-extent-tree ..." was attempted on it.
So I assume you are not interested.

On the larger file system only "btrfs check --repair --readonly ..." was 
attempted (without success; most command executions were documented, so 
the results can be made available), no writing commands were issued.

> And, I'm afraid even with some debugging, the result would be pretty
> predictable.

I do not need anything from the smaller file system, and I have 
(hopefully fresh enough) backups of the bigger one.
It would be good enough if it helps to find any bugs which are still in 
the code.

> It will be 90% transid error.
> And if it's tree block from future, then it's something barrier related.
> If it's tree block from the past, then it's some tree block doesn't
> reach disk.
> 
> We have being chasing the spectre for a long time, had several
> assumption but never pinned it down.

IMHO spectre would lead to much bigger losses - at least in my case it 
could have happened all four times, but it did not.

> But anyway, more info is always better.
> 
> I'd like to get the ssh access for this smaller image.

If you are still interested, please advise how to create an image of 
the file system. I can imagine that it is preferable to use the 
original, but in my case it is a (not mounted) partition of a bigger 
hard drive, and the other partitions are in use. "btrfs-image" seems 
inappropriate to me, and "dd" will probably screw things up?

Kind regards,

Nik.
-- 


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 13:06     ` Nik.
@ 2019-04-02 13:24       ` Qu Wenruo
  2019-04-02 13:29         ` Hugo Mills
                           ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Qu Wenruo @ 2019-04-02 13:24 UTC (permalink / raw)
  To: Nik., linux-btrfs





On 2019/4/2 下午9:06, Nik. wrote:
> 
> 2019-04-02 02:24, Qu Wenruo:
>>
>> On 2019/4/1 上午2:44, btrfs@avgustinov.eu wrote:
[snip]
>>
>> I'm a little more interested in this case, as it's easier to debug.
>>
>> However there is one requirement before debugging.
>>
>> *NO* btrfs check --repair/--init-* run at all.
>> btrfs check --repair is known to cause transid error.
> 
> unfortunately, this file system was used as testbed and even
> "btrfs check --repair --check-data-csum --init-csum-tree --init-extent
> tree ..." was attempted on it.
> So I assume you are not interested.

Then the fs may have been corrupted even further, so I'm not interested.

> 
> On the larger file system only "btrfs check --repair --readonly ..." was
> attempted (without success; most command executions were documented, so
> the results can be made available), no writing commands were issued.

--repair will cause writes, unless it failed even to open the filesystem.

If that's the case, it would be pretty interesting for me to poke around
the fs - and obviously, all read-only.

[snip]
>> But anyway, more info is always better.
>>
>> I'd like to get the ssh access for this smaller image.
> 
> If you are still interested, please advise how to create the image of
> the file system.

If the larger fs really didn't get any writes (btrfs check --repair
failed to open the fs, and thus printed "cannot open file system"),
I'm interested in that one.

If not, then no.

> I can imagine that it is preferable to use the
> original, but in my case it is a (not mounted) partition of a bigger
> hard drive, and the other partitions are in use. The "btrfs-image" seems
> inappropriate to me, "dd" will probably screw things up?

Since the fs is so large, I don't think either way is good enough.

So in this case, the best way for me to poke around would be a caged
container with read-only access to the larger fs.

Thanks,
Qu

> 
> Kind regards,
> 
> Nik.




* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 13:24       ` Qu Wenruo
@ 2019-04-02 13:29         ` Hugo Mills
  2019-04-02 14:05           ` Nik.
  2019-04-02 13:59         ` Nik.
  2019-04-02 18:28         ` Chris Murphy
  2 siblings, 1 reply; 51+ messages in thread
From: Hugo Mills @ 2019-04-02 13:29 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Nik., linux-btrfs


On Tue, Apr 02, 2019 at 09:24:03PM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/4/2 下午9:06, Nik. wrote:
[snip]
> > On the larger file system only "btrfs check --repair --readonly ..." was
> > attempted (without success; most command executions were documented, so
> > the results can be made available), no writing commands were issued.
> 
> --repair will cause write, unless it even failed to open the filesystem.

   If btrfs check accepted both --repair and --readonly without
complaining, then that's a regression and a bug. --readonly should be
mutually exclusive with any option that might write to the FS, and if
it isn't any more, then it's been broken and needs fixing.
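   A minimal sketch of such a mutual-exclusion check (hypothetical wrapper logic; not btrfs-progs' actual option parser):

```shell
# Hypothetical sketch of the mutual exclusion Hugo asks for: refuse to
# combine --readonly with any option that writes to the filesystem.
check_opts() {
  ro=0; writes=0
  for arg in "$@"; do
    case "$arg" in
      --readonly) ro=1 ;;
      --repair|--init-csum-tree|--init-extent-tree) writes=1 ;;
    esac
  done
  if [ "$ro" -eq 1 ] && [ "$writes" -eq 1 ]; then
    echo "ERROR: --readonly is mutually exclusive with --repair/--init-*" >&2
    return 1
  fi
}
check_opts --repair --readonly || echo "rejected"   # prints "rejected"
```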

   Hugo.

-- 
Hugo Mills             | Great films about cricket: Interview with the Umpire
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |



* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 13:24       ` Qu Wenruo
  2019-04-02 13:29         ` Hugo Mills
@ 2019-04-02 13:59         ` Nik.
  2019-04-02 14:12           ` Qu Wenruo
  2019-04-02 18:28         ` Chris Murphy
  2 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-02 13:59 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-02 15:24, Qu Wenruo:
[snip]
> 
> If the larger fs really didn't get any writes (btrfs check --repair
> failed to open the fs, and thus printed "cannot open file system"),
> I'm interested in that one.

This is excerpt from the terminal log:
"# btrfs check --readonly /dev/md0
incorrect offsets 15003 146075
ERROR: cannot open file system
#"

Btw, since the list only allows plain text, I wonder: how do you quote?

> If not, then no.
> 
>> I can imagine that it is preferable to use the
>> original, but in my case it is a (not mounted) partition of a bigger
>> hard drive, and the other partitions are in use. The "btrfs-image" seems
>> inappropriate to me, "dd" will probably screw things up?
> 
> Since the fs is too large, I don't think either way is good enough.
> 
> So in this case, the best way for me to poke around is to give me a
> caged container with only read access to the larger fs.

I am afraid this machine is too weak to run containers (QNAP SS839 Pro 
NAS, Intel Atom, 2 GB RAM), and right now I do not have another machine 
which could accommodate five hard drives. Let me consider how to 
organize this, or come up with another idea. One possibility could be 
"async ssh": a private SSL chat on one of my servers, where you write 
your commands, I execute them on the machine as soon as I can, and I put 
the output back into the chat window. Sounds silly, but it could start 
immediately, and I have no better idea right now, sorry!

Thank you for trying to improve btrfs!

Nik.
> 
> Thanks,
> Qu

You are not from the 007 - lab, are you? ;-)



* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 13:29         ` Hugo Mills
@ 2019-04-02 14:05           ` Nik.
  0 siblings, 0 replies; 51+ messages in thread
From: Nik. @ 2019-04-02 14:05 UTC (permalink / raw)
  To: Hugo Mills, Qu Wenruo, linux-btrfs

Of course it complains; it was a typo on my part, sorry. The real 
command was "btrfs check --readonly ...", just to ensure that no writing 
takes place.
	Nik.
--

2019-04-02 15:29, Hugo Mills:
[snip]
> 
>     If btrfs check accepted both --repair and --readonly without
> complaining, then that's a regression and a bug. --readonly should be
> mutually exclusive with any option that might write to the FS, and if
> it isn't any more, then it's been broken and needs fixing.
> 
>     Hugo.
> 


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 13:59         ` Nik.
@ 2019-04-02 14:12           ` Qu Wenruo
  2019-04-02 14:19             ` Hans van Kranenburg
  2019-04-02 21:22             ` Nik.
  0 siblings, 2 replies; 51+ messages in thread
From: Qu Wenruo @ 2019-04-02 14:12 UTC (permalink / raw)
  To: Nik., linux-btrfs





On 2019/4/2 下午9:59, Nik. wrote:
[snip]
>>
>> If the larger fs really didn't get any writes (btrfs check --repair
>> failed to open the fs, and thus printed "cannot open file system"),
>> I'm interested in that one.
> 
> This is excerpt from the terminal log:
> "# btrfs check --readonly /dev/md0
> incorrect offsets 15003 146075
> ERROR: cannot open file system
> #"

That's great.

And to my surprise, this is a completely different problem.

I believe it will be detected by the latest write-time tree-checker
patches in the next kernel release.

This problem is normally caused by a memory bit flip.
That should ring a little alarm about the problem.

Anyway, a v5.2 or v5.3 kernel would be much better at catching such
problems.

> 
> Btw., since the list does allow _plain_text_only, I wonder how do you
> quote?
> 
>> If not, then no.
>>
>>> I can imagine that it is preferable to use the
>>> original, but in my case it is a (not mounted) partition of a bigger
>>> hard drive, and the other partitions are in use. The "btrfs-image" seems
>>> inappropriate to me, "dd" will probably screw things up?
>>
>> Since the fs is too large, I don't think either way is good enough.
>>
>> So in this case, the best way for me to poke around is to give me a
>> caged container with only read access to the larger fs.
> 
> I am afraid that this machine is too weak for using containers on it
> (QNAP SS839Pro NAS, Intel Atom, 2GB RAM), and right now I do not have
> other machine, which could accommodate five hard drives. Let me consider
> how to organize this or give another idea. One way could be "async ssh"
> -  a private ssl-chat on one of my servers, so that you can write your
> commands there, I execute them on the machine as soon as I can and put
> the output back into the chat-window? Sounds silly, but could start
> immediately, and I have no better idea right now, sorry!

Your btrfs check output is already good enough to locate the problem.

The next thing would just be to help you recover that image, if that's
what you need.

The proposed idea is not that uncommon. In fact, it's just another form
of the "developer suggests commands, user executes and reports,
developer checks the output" loop.

In your case, you just need the latest btrfs-progs and to re-run
"btrfs check --readonly" on it.

If it shows the same result, meaning I can't get the info about which
tree block is corrupted, then you could try to mount it with -o ro using
the *LATEST* kernel.

The latest kernel reports anything wrong pretty vocally; in that case,
dmesg would include the bytenr of the corrupted tree block.

Then I could craft the commands needed to debug the fs further.
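The loop above, sketched as commands (assuming the /dev/md0 device path from the terminal log earlier in the thread; adjust paths to your setup):

```shell
# 1) Re-run the read-only check with the latest btrfs-progs:
btrfs check --readonly /dev/md0

# 2) If the output is unchanged, try a read-only mount with the latest
#    kernel:
mkdir -p /mnt/probe
mount -o ro /dev/md0 /mnt/probe

# 3) The kernel tree checker is vocal; the bytenr of a corrupted tree
#    block should show up here:
dmesg | grep -i btrfs | tail -n 20
```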

Thanks,
Qu





* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 14:12           ` Qu Wenruo
@ 2019-04-02 14:19             ` Hans van Kranenburg
  2019-04-02 15:04               ` Nik.
  2019-04-02 21:22             ` Nik.
  1 sibling, 1 reply; 51+ messages in thread
From: Hans van Kranenburg @ 2019-04-02 14:19 UTC (permalink / raw)
  To: Qu Wenruo, Nik., linux-btrfs

On 4/2/19 4:12 PM, Qu Wenruo wrote:
> 
> 
> On 2019/4/2 下午9:59, Nik. wrote:
>>
>>
>> 2019-04-02 15:24, Qu Wenruo:
>>>
>>>
>>> On 2019/4/2 下午9:06, Nik. wrote:
>>>>
>>> If the larger fs really doesn't get any write (btrfs check --repair
>>> failed to open the fs, thus have the output "cannot open file system"),
>>> I'm interesting in that one.
>>
>> This is excerpt from the terminal log:
>> "# btrfs check --readonly /dev/md0
>> incorrect offsets 15003 146075
>> ERROR: cannot open file system
>> #"
> 
> That's great.
> 
> And to my surprise, this is completely different problem.
> 
> And I believe, it will be detected by latest write time tree checker
> patches in next kernel release.
> 
> This problem is normally caused by memory bit flip.

To illustrate for whoever needs it to follow this reasoning:

bin(146075) -> 0b100011101010011011
bin(15003)  ->     0b11101010011011

So, 146075 is actually 15003, but with a bit flipped from 0 to 1.
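A quick way to verify the single-bit difference is to XOR the two offsets from the log; a lone flipped bit leaves exactly one set bit, i.e. a power of two:

```shell
# XOR the two offsets; a single-bit flip yields a power of two.
a=146075; b=15003
xor=$(( a ^ b ))
echo "$xor"                         # prints "131072", which is 2^17
# A power of two has exactly one set bit: x & (x-1) == 0
[ $(( xor & (xor - 1) )) -eq 0 ] && echo "exactly one bit differs"
```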

> This should ring a little alert about the problem.

Hans


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 14:19             ` Hans van Kranenburg
@ 2019-04-02 15:04               ` Nik.
  2019-04-02 15:07                 ` Hans van Kranenburg
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-02 15:04 UTC (permalink / raw)
  To: Hans van Kranenburg, Qu Wenruo, linux-btrfs



2019-04-02 16:19, Hans van Kranenburg:
> On 4/2/19 4:12 PM, Qu Wenruo wrote:
>>
>>
>> On 2019/4/2 下午9:59, Nik. wrote:
>>>
>>>
>>> 2019-04-02 15:24, Qu Wenruo:
>>>>
>>>>
>>>> On 2019/4/2 下午9:06, Nik. wrote:
>>>>>
>>>> If the larger fs really doesn't get any write (btrfs check --repair
>>>> failed to open the fs, thus have the output "cannot open file system"),
>>>> I'm interesting in that one.
>>>
>>> This is excerpt from the terminal log:
>>> "# btrfs check --readonly /dev/md0
>>> incorrect offsets 15003 146075
>>> ERROR: cannot open file system
>>> #"
>>
>> That's great.
>>
>> And to my surprise, this is a completely different problem.
>>
>> And I believe it will be detected by the latest write-time tree-checker
>> patches in the next kernel release.
>>
>> This problem is normally caused by a memory bit flip.
> 
> To illustrate for whoever needs it to follow this reasoning:
> 
> bin(146075) -> 0b100011101010011011
> bin(15003)  ->     0b11101010011011

Wait a minute! I hope you are not saying that the file system grew too 
large to be addressed on a 32-bit architecture (an inappropriate 
variable type somewhere in the code)? Because the problem came shortly 
after adding another drive to the file system...


> So, 146075 is actually 15003, but with a bit flipped from 0 to 1.
> 
>> This should ring a little alert about the problem.
> 
> Hans
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 15:04               ` Nik.
@ 2019-04-02 15:07                 ` Hans van Kranenburg
  0 siblings, 0 replies; 51+ messages in thread
From: Hans van Kranenburg @ 2019-04-02 15:07 UTC (permalink / raw)
  To: Nik., Qu Wenruo, linux-btrfs

On 4/2/19 5:04 PM, Nik. wrote:
> 
> 
> 2019-04-02 16:19, Hans van Kranenburg:
>> On 4/2/19 4:12 PM, Qu Wenruo wrote:
>>>
>>>
>>> On 2019/4/2 9:59 PM, Nik. wrote:
>>>>
>>>>
>>>> 2019-04-02 15:24, Qu Wenruo:
>>>>>
>>>>>
>>>>> On 2019/4/2 9:06 PM, Nik. wrote:
>>>>>>
>>>>> If the larger fs really doesn't get any write (btrfs check --repair
>>>>> failed to open the fs, thus have the output "cannot open file
>>>>> system"),
>>>>> I'm interested in that one.
>>>>
>>>> This is excerpt from the terminal log:
>>>> "# btrfs check --readonly /dev/md0
>>>> incorrect offsets 15003 146075
>>>> ERROR: cannot open file system
>>>> #"
>>>
>>> That's great.
>>>
>>> And to my surprise, this is a completely different problem.
>>>
>>> And I believe it will be detected by the latest write-time tree-checker
>>> patches in the next kernel release.
>>>
>>> This problem is normally caused by a memory bit flip.
>>
>> To illustrate for whoever needs it to follow this reasoning:
>>
>> bin(146075) -> 0b100011101010011011
>> bin(15003)  ->     0b11101010011011
> 
> Wait a minute! I hope you are not saying that the file system grew too
> large to be addressed on a 32-bit architecture (an inappropriate
> variable type somewhere in the code)? Because the problem came shortly
> after adding another drive to the file system...

No, we're saying that the memory (hardware) in your computer is not
reliable, and it's corrupting memory pages before they get checksummed
and written out to disk.

When reading this metadata back later, the filesystem explodes because
the contents are invalid.
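To sketch why checksumming cannot save you here: if RAM flips a bit before the checksum is computed, the checksum faithfully covers the already-corrupted bytes. A toy illustration (zlib's CRC-32 stands in for the crc32c that btrfs actually uses; this is not btrfs code):

```python
import zlib

# A metadata page sitting in (unreliable) RAM.
page = bytearray(b"some btrfs metadata page held in RAM")

# A bit flips in memory *before* write-out.
page[7] ^= 0x04

# The filesystem now checksums the corrupted page and writes both out.
stored_csum = zlib.crc32(bytes(page))

# On read-back the checksum verifies fine: the corruption is invisible
# to the checksum layer, because it happened upstream of it.
assert zlib.crc32(bytes(page)) == stored_csum
print("checksum matches; corrupted page reached disk undetected")
```

Only something like ECC memory, or a semantic validity check on the page contents (the tree-checker), can catch this class of corruption.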

>> So, 146075 is actually 15003, but with a bit flipped from 0 to 1.
>>
>>> This should ring a little alert about the problem.

Hans


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 13:24       ` Qu Wenruo
  2019-04-02 13:29         ` Hugo Mills
  2019-04-02 13:59         ` Nik.
@ 2019-04-02 18:28         ` Chris Murphy
  2019-04-02 19:02           ` Hugo Mills
  2 siblings, 1 reply; 51+ messages in thread
From: Chris Murphy @ 2019-04-02 18:28 UTC (permalink / raw)
  To: Btrfs BTRFS

On Tue, Apr 2, 2019 at 7:24 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> On 2019/4/2 9:06 PM, Nik. wrote:

> > On the larger file system only "btrfs check --repair --readonly ..." was
> > attempted (without success; most command executions were documented, so
> > the results can be made available), no writing commands were issued.
>
> --repair will cause write, unless it even failed to open the filesystem.

I consider `--repair --readonly` a contradictory request: it's
ambiguous what the user wants (it's a user error), and the command
should fail with a "conflicting options" error.
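The suggested behaviour is straightforward in any option parser. A hypothetical Python sketch of such a front end (names and flags mirror the discussion; this is not the actual btrfs-progs code):

```python
import argparse

parser = argparse.ArgumentParser(prog="check")

# Declaring the two modes mutually exclusive makes the parser fail
# with an explanatory error instead of guessing what the user meant.
mode = parser.add_mutually_exclusive_group()
mode.add_argument("--repair", action="store_true")
mode.add_argument("--readonly", action="store_true")

assert parser.parse_args(["--readonly"]).readonly  # fine on its own

try:
    parser.parse_args(["--repair", "--readonly"])  # contradictory pair
except SystemExit:
    print("rejected: --repair is not allowed with --readonly")
```

argparse exits with an "not allowed with argument" usage error for the conflicting pair, which is exactly the "conflicting options" behaviour proposed above.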


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 18:28         ` Chris Murphy
@ 2019-04-02 19:02           ` Hugo Mills
  0 siblings, 0 replies; 51+ messages in thread
From: Hugo Mills @ 2019-04-02 19:02 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 967 bytes --]

On Tue, Apr 02, 2019 at 12:28:12PM -0600, Chris Murphy wrote:
> On Tue, Apr 2, 2019 at 7:24 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> > On 2019/4/2 9:06 PM, Nik. wrote:
> 
> > > On the larger file system only "btrfs check --repair --readonly ..." was
> > > attempted (without success; most command executions were documented, so
> > > the results can be made available), no writing commands were issued.
> >
> > --repair will cause write, unless it even failed to open the filesystem.
> 
> I consider `--repair --readonly` a contradictory request: it's
> ambiguous what the user wants (it's a user error), and the command
> should fail with a "conflicting options" error.

   I already raised that question. :)

   It was a typo in the email. --repair was what was intended.

   Hugo.

-- 
Hugo Mills             | Great films about cricket: Forrest Stump
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 14:12           ` Qu Wenruo
  2019-04-02 14:19             ` Hans van Kranenburg
@ 2019-04-02 21:22             ` Nik.
  2019-04-03  1:04               ` Qu Wenruo
  1 sibling, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-02 21:22 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs


2019-04-02 16:12, Qu Wenruo:
> 
> 
>> On 2019/4/2 9:59 PM, Nik. wrote:
>>
>>
>> 2019-04-02 15:24, Qu Wenruo:
>>>
>>>
>>>> On 2019/4/2 9:06 PM, Nik. wrote:
>>>>
>>>> 2019-04-02 02:24, Qu Wenruo:
>>>>>
>>>>> On 2019/4/1 2:44 AM, btrfs@avgustinov.eu wrote:
>>>>>> Dear all,
>>>>>>
>>>>>>
>>>>>> I am a big fan of btrfs, and I am using it since 2013 - in the
>>>>>> meantime
>>>>>> on at least four different computers. During this time, I suffered at
>>>>>> least four bad btrfs-failures leading to unmountable, unreadable and
>>>>>> unrecoverable file system. Since in three of the cases I did not
>>>>>> manage
>>>>>> to recover even a single file, I am beginning to lose my confidence in
>>>>>> btrfs: for 35-years working with different computers no other file
>>>>>> system was so bad at recovering files!
>>>>>>
>>>>>> Considering the importance of btrfs and keeping in mind the number of
>>>>>> similar failures, described in countless forums on the net, I have got
>>>>>> an idea: to donate my last two damaged filesystems for investigation
>>>>>> purposes and thus hopefully contribute to the improvement of btrfs.
>>>>>> One
>>>>>> condition: any recovered personal data (mostly pictures and audio
>>>>>> files)
>>>>>> should remain undisclosed and be deleted.
>>>>>>
>>>>>> Should anybody be interested in this - feel free to contact me
>>>>>> personally (I am not reading the list regularly!), otherwise I am
>>>>>> going
>>>>>> to reformat and reuse both systems in two weeks from today.
>>>>>>
>>>>>> Some more info:
>>>>>>
>>>>>>      - The smaller system is 83.6GB, I could either send you an
>>>>>> image of
>>>>>> this system on an unneeded hard drive or put it into a dedicated
>>>>>> computer and give you root rights and ssh-access to it (the network
>>>>>> link
>>>>>> is 100Mb down, 50Mb up, so it should be acceptable).
>>>>>
>>>>> I'm a little more interested in this case, as it's easier to debug.
>>>>>
>>>>> However there is one requirement before debugging.
>>>>>
>>>>> *NO* btrfs check --repair/--init-* run at all.
>>>>> btrfs check --repair is known to cause transid error.
>>>>
>>>> unfortunately, this file system was used as testbed and even
>>>> "btrfs check --repair --check-data-csum --init-csum-tree
>>>> --init-extent-tree ..." was attempted on it.
>>>> So I assume you are not interested.
>>>
>>> Then the fs can be further corrupted, so I'm not interested.
>>>
>>>>
>>>> On the larger file system only "btrfs check --repair --readonly ..." was
>>>> attempted (without success; most command executions were documented, so
>>>> the results can be made available), no writing commands were issued.
>>>
>>> --repair will cause write, unless it even failed to open the filesystem.
>>>
>>> If that's the case, it would be pretty interesting for me to poking
>>> around the fs, and obviously, all read-only.
>>>
>>>>
>>>>> And, I'm afraid even with some debugging, the result would be pretty
>>>>> predictable.
>>>>
>>>> I do not need anything from the smaller file system and have (hopefully
>>>> fresh enough) backups from the bigger one.
>>>> It would be good enough if it helps to find any bugs which are still in
>>>> the code.
>>>>
>>>>> It will be a transid error 90% of the time.
>>>>> And if it's a tree block from the future, then it's something
>>>>> barrier related.
>>>>> If it's a tree block from the past, then some tree block didn't
>>>>> reach disk.
>>>>>
>>>>> We have been chasing this spectre for a long time, had several
>>>>> assumptions, but never pinned it down.
>>>>
>>>> IMHO spectre would lead to much bigger losses - at least in my case it
>>>> could have happened all four times, but it did not.
>>>>
>>>>> But anyway, more info is always better.
>>>>>
>>>>> I'd like to get the ssh access for this smaller image.
>>>>
>>>> If you are still interested, please advise how to create the image of
>>>> the file system.
>>>
>>> If the larger fs really doesn't get any write (btrfs check --repair
>>> failed to open the fs, thus have the output "cannot open file system"),
>>> I'm interested in that one.
>>
>> This is excerpt from the terminal log:
>> "# btrfs check --readonly /dev/md0
>> incorrect offsets 15003 146075
>> ERROR: cannot open file system
>> #"
> 
> That's great.
> 
> And to my surprise, this is a completely different problem.
> 
> And I believe it will be detected by the latest write-time tree-checker
> patches in the next kernel release.

Is the next release going to come out in April?

> This problem is normally caused by a memory bit flip.

Well, this system has suffered many power outages (at least 6 since 
2013), and after each outage I had to run scrub AND nevertheless 
discovered the loss of a couple of files. I can imagine that the power 
supply or the motherboard of this machine is not (well) designed for 
reliability, but:
   1) Shouldn't the file system be immune to this?
   2) Isn't it too stupid to lose terabytes of information due to a 
flipped bit?
The same machine has ext4 and FAT file systems, and they have never had 
a problem, or they recovered automatically by means of fsck during the 
next reboot!

> This should ring a little alert about the problem.
> 
> Anyway, v5.2 or v5.3 kernel would be much better to catch such problems.

This kernel isn't even scheduled, is it? Well, I am not really in a 
hurry...

>>
>> Btw., since the list allows _plain_text_only, I wonder how you
>> quote?
>>
>>> If not, then no.
>>>
>>>> I can imagine that it is preferable to use the
>>>> original, but in my case it is a (not mounted) partition of a bigger
>>>> hard drive, and the other partitions are in use. The "btrfs-image" seems
>>>> inappropriate to me, "dd" will probably screw things up?
>>>
>>> Since the fs is too large, I don't think either way is good enough.
>>>
>>> So in this case, the best way for me to poke around is to give me a
>>> caged container with only read access to the larger fs.
>>
>> I am afraid that this machine is too weak for using containers on it
>> (QNAP SS839Pro NAS, Intel Atom, 2GB RAM), and right now I do not have
>> other machine, which could accommodate five hard drives. Let me consider
>> how to organize this or give another idea. One way could be "async ssh"
>> -  a private ssl-chat on one of my servers, so that you can write your
>> commands there, I execute them on the machine as soon as I can and put
>> the output back into the chat-window? Sounds silly, but could start
>> immediately, and I have no better idea right now, sorry!
> 
> Your btrfs check output is already good enough to locate the problem.
> 
> The next thing would be just to help you recover that image, if that's
> what you need.

Well, let me say it again: 1) I have a backup, but one is never sure 
which of the newest files are not in it. 2) It is much more important to 
be sure that the btrfs code is flawless and no other btrfs file system 
is in danger! I can live with some losses, but the inability to recover 
even a single file is not acceptable!

> The proposed idea is not that uncommon. In fact it's just another way of
> "show commands, user execute and report, developer check the output" loop.
> 
> In your case, you just need latest btrfs-progs and re-run "btrfs check
> --readonly" on it.

Will try this, but have no time before tomorrow evening.


> If it just shows the same result, meaning I can't get the info about
> which tree block is corrupted, then you could try to mount it with -o ro
> using *LATEST* kernel.

I tried this before with the 4.15.0-46 kernel, it was impossible. Will 
try again with a newer one as soon as possible (in the best case 
tomorrow evening); I will post the results.

> Latest kernel will report anything wrong pretty vocally, in that case,
> dmesg would include the bytenr of corrupted tree block.
> 
> Then I could craft needed commands to further debug the fs.

Ok, I will try to post more info tomorrow about this time.

Nik.
--

> Thanks,
> Qu
> 
>>
>> Thank you for trying to improve btrfs!
>>
>> Nik.
>>>
>>> Thanks,
>>> Qu
>>
>> You are not from the 007 - lab, are you? ;-)
>>
>>>>
>>>> Kind regards,
>>>>
>>>> Nik.
>>>
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-02 21:22             ` Nik.
@ 2019-04-03  1:04               ` Qu Wenruo
  2019-04-04 15:27                 ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-03  1:04 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 9472 bytes --]



On 2019/4/3 5:22 AM, Nik. wrote:
> 
> 2019-04-02 16:12, Qu Wenruo:
>>
>>
>> On 2019/4/2 9:59 PM, Nik. wrote:
>>>
>>>
>>> 2019-04-02 15:24, Qu Wenruo:
>>>>
>>>>
>>>> On 2019/4/2 9:06 PM, Nik. wrote:
>>>>>
>>>>> 2019-04-02 02:24, Qu Wenruo:
>>>>>>
>>>>>> On 2019/4/1 2:44 AM, btrfs@avgustinov.eu wrote:
>>>>>>> Dear all,
>>>>>>>
>>>>>>>
>>>>>>> I am a big fan of btrfs, and I am using it since 2013 - in the
>>>>>>> meantime
>>>>>>> on at least four different computers. During this time, I
>>>>>>> suffered at
>>>>>>> least four bad btrfs-failures leading to unmountable, unreadable and
>>>>>>> unrecoverable file system. Since in three of the cases I did not
>>>>>>> manage
>>>>>>> to recover even a single file, I am beginning to lose my
>>>>>>> confidence in
>>>>>>> btrfs: for 35-years working with different computers no other file
>>>>>>> system was so bad at recovering files!
>>>>>>>
>>>>>>> Considering the importance of btrfs and keeping in mind the
>>>>>>> number of
>>>>>>> similar failures, described in countless forums on the net, I
>>>>>>> have got
>>>>>>> an idea: to donate my last two damaged filesystems for investigation
>>>>>>> purposes and thus hopefully contribute to the improvement of btrfs.
>>>>>>> One
>>>>>>> condition: any recovered personal data (mostly pictures and audio
>>>>>>> files)
>>>>>>> should remain undisclosed and be deleted.
>>>>>>>
>>>>>>> Should anybody be interested in this - feel free to contact me
>>>>>>> personally (I am not reading the list regularly!), otherwise I am
>>>>>>> going
>>>>>>> to reformat and reuse both systems in two weeks from today.
>>>>>>>
>>>>>>> Some more info:
>>>>>>>
>>>>>>>      - The smaller system is 83.6GB, I could either send you an
>>>>>>> image of
>>>>>>> this system on an unneeded hard drive or put it into a dedicated
>>>>>>> computer and give you root rights and ssh-access to it (the network
>>>>>>> link
>>>>>>> is 100Mb down, 50Mb up, so it should be acceptable).
>>>>>>
>>>>>> I'm a little more interested in this case, as it's easier to debug.
>>>>>>
>>>>>> However there is one requirement before debugging.
>>>>>>
>>>>>> *NO* btrfs check --repair/--init-* run at all.
>>>>>> btrfs check --repair is known to cause transid error.
>>>>>
>>>>> unfortunately, this file system was used as testbed and even
>>>>> "btrfs check --repair --check-data-csum --init-csum-tree
>>>>> --init-extent-tree ..." was attempted on it.
>>>>> So I assume you are not interested.
>>>>
>>>> Then the fs can be further corrupted, so I'm not interested.
>>>>
>>>>>
>>>>> On the larger file system only "btrfs check --repair --readonly
>>>>> ..." was
>>>>> attempted (without success; most command executions were
>>>>> documented, so
>>>>> the results can be made available), no writing commands were issued.
>>>>
>>>> --repair will cause write, unless it even failed to open the
>>>> filesystem.
>>>>
>>>> If that's the case, it would be pretty interesting for me to poking
>>>> around the fs, and obviously, all read-only.
>>>>
>>>>>
>>>>>> And, I'm afraid even with some debugging, the result would be pretty
>>>>>> predictable.
>>>>>
>>>>> I do not need anything from the smaller file system and have
>>>>> (hopefully
>>>>> fresh enough) backups from the bigger one.
>>>>> It would be good enough if it helps to find any bugs which are
>>>>> still in
>>>>> the code.
>>>>>
>>>>>> It will be a transid error 90% of the time.
>>>>>> And if it's a tree block from the future, then it's something
>>>>>> barrier related.
>>>>>> If it's a tree block from the past, then some tree block didn't
>>>>>> reach disk.
>>>>>>
>>>>>> We have been chasing this spectre for a long time, had several
>>>>>> assumptions, but never pinned it down.
>>>>>
>>>>> IMHO spectre would lead to much bigger losses - at least in my case it
>>>>> could have happened all four times, but it did not.
>>>>>
>>>>>> But anyway, more info is always better.
>>>>>>
>>>>>> I'd like to get the ssh access for this smaller image.
>>>>>
>>>>> If you are still interested, please advise how to create the image of
>>>>> the file system.
>>>>
>>>> If the larger fs really doesn't get any write (btrfs check --repair
>>>> failed to open the fs, thus have the output "cannot open file system"),
>>>> I'm interested in that one.
>>>
>>> This is excerpt from the terminal log:
>>> "# btrfs check --readonly /dev/md0
>>> incorrect offsets 15003 146075
>>> ERROR: cannot open file system
>>> #"
>>
>> That's great.
>>
>> And to my surprise, this is a completely different problem.
>>
>> And I believe it will be detected by the latest write-time tree-checker
>> patches in the next kernel release.
> 
> Is the next release going to come out in April?

The next release is v5.1, which doesn't contain all my recent
tree-checker enhancements.

So I'm afraid you will need to wait until June.

> 
>> This problem is normally caused by a memory bit flip.
> 
> Well, this system has suffered many power outages (at least 6 since
> 2013), and after each outage I had to run scrub AND nevertheless
> discovered the loss of a couple of files. I can imagine, that the power
> supply or the mother board of this machine is not (well) designed for
> reliability, but:

Unless the PSU is so unreliable that the VRM for the memory or the
memory controller doesn't get the needed voltage, power outages are not
related to this case.

>   1) shouldn't the file system be immune to this?

If memory is corrupted, nothing can help, unless you have ECC memory.

>   2) Isn't it too stupid to lose terabytes of information due to a
> flipped bit?

It depends on where the bit flip is.
If the bit flip happens in a super-vital tree block, like the chunk tree
or the root tree, then the whole fs cannot be mounted.

The enhanced tree-checker, however, will be able to detect such a
problem and abort the write before the corrupted data reaches disk.
So at least with those enhancements, it should not cause such a problem
at all.
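To illustrate the kind of invariant a write-time checker can enforce: in a btrfs leaf, item data is packed downward from the end of the block, so each item's data must end exactly where the previous item's data begins. The following is a simplified, hypothetical model of such a check (the function name, leaf size, and layout details are assumptions for illustration, not the real on-disk format or btrfs code):

```python
def check_leaf_offsets(items, leaf_data_end=16384):
    """items: list of (offset, size) pairs in slot order.
    Item data is assumed packed contiguously downward from leaf_data_end."""
    expected_end = leaf_data_end
    for offset, size in items:
        if offset + size != expected_end:
            # Mirrors the style of btrfs check's "incorrect offsets" message.
            return f"incorrect offsets {offset} {expected_end - size}"
        expected_end = offset
    return None  # leaf looks internally consistent

good = [(16000, 384), (15003, 997)]
bad = [(16000, 384), (15003 | (1 << 17), 997)]  # bit 17 flipped in RAM
assert check_leaf_offsets(good) is None
print(check_leaf_offsets(bad))  # -> incorrect offsets 146075 15003
```

Running such a check at write time, before the block is checksummed and committed, is what lets the tree-checker reject the flipped offset instead of persisting it.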

> The same machine has ext4 and FAT file systems, and they never have had
> a problem or recovered automatically by means of fsck during the next
> reboot!

Then we should enhance btrfs-progs to detect bit flips.

Thanks,
Qu

> 
>> This should ring a little alert about the problem.
>>
>> Anyway, v5.2 or v5.3 kernel would be much better to catch such problems.
> 
> This kernel isn't even scheduled, is it? Well, I am not really in a
> hurry...
> 
>>>
>>> Btw., since the list allows _plain_text_only, I wonder how you
>>> quote?
>>>
>>>> If not, then no.
>>>>
>>>>> I can imagine that it is preferable to use the
>>>>> original, but in my case it is a (not mounted) partition of a bigger
>>>>> hard drive, and the other partitions are in use. The "btrfs-image"
>>>>> seems
>>>>> inappropriate to me, "dd" will probably screw things up?
>>>>
>>>> Since the fs is too large, I don't think either way is good enough.
>>>>
>>>> So in this case, the best way for me to poke around is to give me a
>>>> caged container with only read access to the larger fs.
>>>
>>> I am afraid that this machine is too weak for using containers on it
>>> (QNAP SS839Pro NAS, Intel Atom, 2GB RAM), and right now I do not have
>>> other machine, which could accommodate five hard drives. Let me consider
>>> how to organize this or give another idea. One way could be "async ssh"
>>> -  a private ssl-chat on one of my servers, so that you can write your
>>> commands there, I execute them on the machine as soon as I can and put
>>> the output back into the chat-window? Sounds silly, but could start
>>> immediately, and I have no better idea right now, sorry!
>>
>> Your btrfs check output is already good enough to locate the problem.
>>
>> The next thing would be just to help you recover that image, if that's
>> what you need.
> 
> Well, let me say it again: 1) I have a backup, but one is never sure
> which newest files are not in it. 2) It is much more important to be
> sure that the btrfs code is flawless and no other btrfs file system is
> in danger! I can live with some losses, but the inability to recover
> even a single file is not acceptable!
> 
>> The proposed idea is not that uncommon. In fact it's just another way of
>> "show commands, user execute and report, developer check the output"
>> loop.
>>
>> In your case, you just need latest btrfs-progs and re-run "btrfs check
>> --readonly" on it.
> 
> Will try this, but have no time before tomorrow evening.
> 
> 
>> If it just shows the same result, meaning I can't get the info about
>> which tree block is corrupted, then you could try to mount it with -o ro
>> using *LATEST* kernel.
> 
> I tried this before with the 4.15.0-46 kernel, it was impossible. Will
> try again with a newer one as soon as possible (in the best case tomorrow
> evening); I will post the results.
> 
>> Latest kernel will report anything wrong pretty vocally, in that case,
>> dmesg would include the bytenr of corrupted tree block.
>>
>> Then I could craft needed commands to further debug the fs.
> 
> Ok, I will try to post more info tomorrow about this time.
> 
> Nik.
> -- 
> 
>> Thanks,
>> Qu
>>
>>>
>>> Thank you for trying to improve btrfs!
>>>
>>> Nik.
>>>>
>>>> Thanks,
>>>> Qu
>>>
>>> You are not from the 007 - lab, are you? ;-)
>>>
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Nik.
>>>>
>>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-03-31 18:44 ` interest in post-mortem examination of a BTRFS system and improving the btrfs-code? btrfs
  2019-04-02  0:24   ` Qu Wenruo
@ 2019-04-04  2:48   ` Jeff Mahoney
  2019-04-04 15:58     ` Nik.
  2019-04-05  6:53     ` Chris Murphy
  1 sibling, 2 replies; 51+ messages in thread
From: Jeff Mahoney @ 2019-04-04  2:48 UTC (permalink / raw)
  To: btrfs, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2523 bytes --]

On 3/31/19 2:44 PM, btrfs@avgustinov.eu wrote:
> Dear all,
> 
> 
> I am a big fan of btrfs, and I am using it since 2013 - in the meantime
> on at least four different computers. During this time, I suffered at
> least four bad btrfs-failures leading to unmountable, unreadable and
> unrecoverable file system. Since in three of the cases I did not manage
> to recover even a single file, I am beginning to lose my confidence in
> btrfs: for 35-years working with different computers no other file
> system was so bad at recovering files!
> 
> Considering the importance of btrfs and keeping in mind the number of
> similar failures, described in countless forums on the net, I have got
> an idea: to donate my last two damaged filesystems for investigation
> purposes and thus hopefully contribute to the improvement of btrfs. One
> condition: any recovered personal data (mostly pictures and audio files)
> should remain undisclosed and be deleted.
> 
> Should anybody be interested in this - feel free to contact me
> personally (I am not reading the list regularly!), otherwise I am going
> to reformat and reuse both systems in two weeks from today.
> 
> Some more info:
> 
>   - The smaller system is 83.6GB, I could either send you an image of
> this system on an unneeded hard drive or put it into a dedicated
> computer and give you root rights and ssh-access to it (the network link
> is 100Mb down, 50Mb up, so it should be acceptable).
> 
>   - The used space on the other file system is about 3 TB (4 TB
> capacity) and it is distributed among 5 drives, so I can only offer
> remote access to this, but I will need time to organize it.
> 
> If you need additional information - please ask, but keep in mind that I
> have almost no "free time" and the answer could need a day or two.

My team is always interested in images of broken file systems.  This is
how --repair evolves.  Images with failed --repair operations are still
valuable.  That's the first step most users take, and why wouldn't they?
If --repair is misbehaving, the end result shouldn't be "I hope you
have backups."

It's not the size of the file system that matters so much.  The data on
it doesn't matter from a debugging perspective and, in any event, it's
not written to the image file anyway.  I do want a btrfs-image file from
the file system, and if btrfs-image fails to create a usable image,
that's also valuable to know and fix.

Thanks,

-Jeff

-- 
Jeff Mahoney
SUSE Labs


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-03  1:04               ` Qu Wenruo
@ 2019-04-04 15:27                 ` Nik.
  2019-04-05  0:47                   ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-04 15:27 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-03 03:04, Qu Wenruo:
> 

[snip]
...

>>> In your case, you just need latest btrfs-progs and re-run "btrfs check
>>> --readonly" on it.
>>
>> Will try this, but have no time before tomorrow evening.
>>
>>
>>> If it just shows the same result, meaning I can't get the info about
>>> which tree block is corrupted, then you could try to mount it with -o ro
>>> using *LATEST* kernel.
>>
>> I tried this before with the 4.15.0-46 kernel, it was impossible. Will
>> try again with newer one as soon as possible (in best case tomorrow
>> evening); I will post the results.

Sorry for the delay; compiling btrfs-progs took much more time than 
expected (I had to install new packages again and again). Finally I had 
to give up on building the convert tool ("make" could not find 
reiserfs/misc.h, although both libreiserfscore and reiser4fs are 
installed).
Output of the commands:
# uname -r
5.0.6-050006-generic
#btrfs --version
btrfs-progs v4.20.2
# btrfs check --readonly /dev/md0
Opening filesystem to check...
incorrect offsets 15003 146075
ERROR: cannot open file system

It seems that I will wait until 5.2 is out...
(the answer to Jeff Mahoney is coming with separate e-mail!)

>>> Latest kernel will report anything wrong pretty vocally, in that case,
>>> dmesg would include the bytenr of corrupted tree block.
>>>
>>> Then I could craft needed commands to further debug the fs.
>>
>> Ok, I will try to post more info tomorrow about this time.
>>
>> Nik.
>> -- 
>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> Thank you for trying to improve btrfs!
>>>>
>>>> Nik.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>
>>>> You are not from the 007 - lab, are you? ;-)
>>>>
>>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Nik.
>>>>>
>>>
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-04  2:48   ` Jeff Mahoney
@ 2019-04-04 15:58     ` Nik.
  2019-04-04 17:31       ` Chris Murphy
  2019-04-05  6:53     ` Chris Murphy
  1 sibling, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-04 15:58 UTC (permalink / raw)
  To: Jeff Mahoney, linux-btrfs


2019-04-04 04:48, Jeff Mahoney:
> On 3/31/19 2:44 PM, btrfs@avgustinov.eu wrote:
>> Dear all,
>>
>>
>> I am a big fan of btrfs, and I am using it since 2013 - in the meantime
>> on at least four different computers. During this time, I suffered at
>> least four bad btrfs-failures leading to unmountable, unreadable and
>> unrecoverable file system. Since in three of the cases I did not manage
>> to recover even a single file, I am beginning to lose my confidence in
>> btrfs: for 35-years working with different computers no other file
>> system was so bad at recovering files!
>>
>> Considering the importance of btrfs and keeping in mind the number of
>> similar failures, described in countless forums on the net, I have got
>> an idea: to donate my last two damaged filesystems for investigation
>> purposes and thus hopefully contribute to the improvement of btrfs. One
>> condition: any recovered personal data (mostly pictures and audio files)
>> should remain undisclosed and be deleted.
>>
>> Should anybody be interested in this - feel free to contact me
>> personally (I am not reading the list regularly!), otherwise I am going
>> to reformat and reuse both systems in two weeks from today.
>>
>> Some more info:
>>
>>    - The smaller system is 83.6GB, I could either send you an image of
>> this system on an unneeded hard drive or put it into a dedicated
>> computer and give you root rights and ssh-access to it (the network link
>> is 100Mb down, 50Mb up, so it should be acceptable).
>>
>>    - The used space on the other file system is about 3 TB (4 TB
>> capacity) and it is distributed among 5 drives, so I can only offer
>> remote access to this, but I will need time to organize it.
>>
>> If you need additional information - please ask, but keep in mind that I
>> have almost no "free time" and the answer could need a day or two.
> 
> My team is always interested in images of broken file systems.  This is
> how --repair evolves.  Images with failed --repair operations are still
> valuable.  That's the first step most users take and why wouldn't they?
>   If --repair is misbehaving, the end result shouldn't be "I hope you
> have backups."

I absolutely agree!

> It's not the size of the file system that matters so much.  The data on
> it doesn't matter from a debugging perspective and, in any event, it's
> not written to the image file anyway.  I do want a btrfs-image file from
> the file system, and if btrfs-image fails to create a usable image,
> that's also valuable to know and fix.

The larger filesystem gives me the following output (kernel 
5.0.6-050006-generic, btrfs-progs v4.20.2):

# btrfs-image -c 9 /dev/md0 /mnt/b/md.img
incorrect offsets 15003 146075
ERROR: open ctree failed
ERROR: create failed: Success

Last line is funny.
The smaller system let me create an image, but the size of the file 
resulting from "btrfs-image -c 9 /dev/sdXY ..." is surprisingly small: 
only 536576 bytes. I guess this conforms to the man page: "All data will 
be zeroed, but metadata and the like is preserved. Mainly used for 
debugging purposes."

I shall send you a link to the image (in a private mail) as soon as 
possible. Please, respect any private data in case you manage to recover 
something.

Kind regards,
Nik.
--

> Thanks,
> 
> -Jeff
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-04 15:58     ` Nik.
@ 2019-04-04 17:31       ` Chris Murphy
       [not found]         ` <beab578a-ccaf-1ec7-c7b6-1ba9cd3743ad@avgustinov.eu>
  0 siblings, 1 reply; 51+ messages in thread
From: Chris Murphy @ 2019-04-04 17:31 UTC (permalink / raw)
  To: Nik.; +Cc: Jeff Mahoney, Btrfs BTRFS

On Thu, Apr 4, 2019 at 9:58 AM Nik. <btrfs@avgustinov.eu> wrote:
>
>
> [snip]
>
> The larger filesystem gives me the following output (kernel
> 5.0.6-050006-generic, btrfs-progs v4.20.2):
>
> # btrfs-image -c 9 /dev/md0 /mnt/b/md.img
> incorrect offsets 15003 146075
> ERROR: open ctree failed
> ERROR: create failed: Success
>
> Last line is funny.

I've complained about that nonsense for a while and yet it remains. A
successful failure is an ERROR. I still don't know what it means but I
suspect it's an incomplete image.


> The smaller system let me create an image, but the size of the file,
> resulting from "btrfs-image -c 9 /dev/sdXY ...", is surprisingly small -
> only 536576B. I guess this is conform with the man-page: "All data will
> be zeroed, but metadata and the like is preserved. Mainly used for
> debugging purposes."
>
> I shall send you a link to the image (in a private mail) as soon as
> possible. Please, respect any private data in case you manage to recover
> something.

You should use the -ss option for this reason.
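A minimal sketch of the sanitize-then-share workflow suggested here, assuming a placeholder device `/dev/sdXY` and output paths (the guard makes the script a no-op on machines without that device):

```shell
# Hypothetical sketch: create a name-sanitized, metadata-only image
# with btrfs-image -ss, then confirm the image actually restores.
DEV=/dev/sdXY            # placeholder device, replace with the real one
IMG=/tmp/fs-meta.img
if [ -b "$DEV" ]; then
    # -ss replaces file/dir names with garbage; data blocks are zeroed anyway
    btrfs-image -c 9 -ss "$DEV" "$IMG"
    # restore into a plain file to confirm the image is usable
    btrfs-image -r "$IMG" /tmp/fs-restored.raw
else
    echo "device $DEV not present; skipping"
fi
```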


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-04 15:27                 ` Nik.
@ 2019-04-05  0:47                   ` Qu Wenruo
  2019-04-05  6:58                     ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-05  0:47 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2150 bytes --]



On 2019/4/4 下午11:27, Nik. wrote:
> 
> 
> 2019-04-03 03:04, Qu Wenruo:
>>
> 
> [snip]
> ...
> 
>>>> In your case, you just need latest btrfs-progs and re-run "btrfs check
>>>> --readonly" on it.
>>>
>>> Will try this, but have no time before tomorrow evening.
>>>
>>>
>>>> If it just shows the same result, meaning I can't get the info about
>>>> which tree block is corrupted, then you could try to mount it with
>>>> -o ro
>>>> using *LATEST* kernel.
>>>
>>> I tried this before with the 4.15.0-46 kernel, it was impossible. Will
>>> try again with newer one as soon as possible (in best case tomorrow
>>> evening); I will post the results.
> 
> Sorry for the delay; compiling btrfs-progs took much more time than
> expected (I had to install new packages again and again). I finally had
> to give up on building btrfs-convert ("make" could not find
> reiserfs/misc.h, although both libreiserfscore and reiser4fs are
> installed).
> Output of the commands:
> # uname -r
> 5.0.6-050006-generic
> #btrfs --version
> btrfs-progs v4.20.2
> # btrfs check --readonly /dev/md0
> Opening filesystem to check...
> incorrect offsets 15003 146075
> ERROR: cannot open file system
> 
> It seems that I will wait until 5.2 is out...
> (the answer to Jeff Mahoney is coming with separate e-mail!)

OK, then you can try mount it with 5.0 with -o ro.
The objective is not to make it work, but to get the dmesg, which should
contain the tree block bytenr, so that we can try to fix that offending
tree block manually.

Thanks,
Qu



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-04  2:48   ` Jeff Mahoney
  2019-04-04 15:58     ` Nik.
@ 2019-04-05  6:53     ` Chris Murphy
  1 sibling, 0 replies; 51+ messages in thread
From: Chris Murphy @ 2019-04-05  6:53 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Btrfs BTRFS, Hugo Mills

On Wed, Apr 3, 2019 at 8:48 PM Jeff Mahoney <jeffm@suse.com> wrote:
>
> My team is always interested in images of broken file systems.  This is
> how --repair evolves.  Images with failed --repair operations are still
> valuable.  That's the first step most users take and why wouldn't they?
>  If --repair is misbehaving, the end result shouldn't be "I hope you
> have backups."

Ok well there's a lot of that still happening. So what switches should
users use for images these days? It's not obvious whether there's
extent tree corruption so are they best off always using -w? And I
know Qu doesn't like images with either -s or -ss, and I don't think
users care which one they use, but it's not reasonable that they
supply images that don't have file and dir names scrubbed. And then
where and how should they submit the images?

I haven't taken an image yet but do have a file system that was
working fine before using btrfs-progs 4.19.1 with --clear-space-cache
v1 that definitely corrupted the extent tree. I have 0% confidence in
--repair or --init-extent-tree fixing it so I haven't tried it yet,
and it's a backup so it doesn't really matter. I did file a bug.
https://bugzilla.kernel.org/show_bug.cgi?id=202717


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-05  0:47                   ` Qu Wenruo
@ 2019-04-05  6:58                     ` Nik.
  2019-04-05  7:08                       ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-05  6:58 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-05 02:47, Qu Wenruo:
> 
> 
> On 2019/4/4 下午11:27, Nik. wrote:
>> [snip]
>> Output of the commands:
>> # uname -r
>> 5.0.6-050006-generic
>> #btrfs --version
>> btrfs-progs v4.20.2
>> # btrfs check --readonly /dev/md0
>> Opening filesystem to check...
>> incorrect offsets 15003 146075
>> ERROR: cannot open file system
>>
>> It seems that I will wait until 5.2 is out...
>> (the answer to Jeff Mahoney is coming with separate e-mail!)
> 
> OK, then you can try mount it with 5.0 with -o ro.

# mount -t btrfs -o ro /dev/md0 /mnt/md0/
mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0, 
missing codepage or helper program, or other error.

> The objective is not to make it work, but to get the dmesg, which should

# dmesg|tail
[65283.442278] audit: type=1107 audit(1554438151.396:115): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel msg='Unknown class service 
exe="/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
[72504.975359] audit: type=1107 audit(1554445372.928:116): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel msg='Unknown class service 
exe="/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
[72535.214394] audit: type=1107 audit(1554445403.166:117): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel msg='Unknown class service 
exe="/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
[72535.257571] audit: type=1107 audit(1554445403.210:118): pid=1 uid=0 
auid=4294967295 ses=4294967295 subj=kernel msg='Unknown class system 
exe="/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
[73427.486853] BTRFS info (device md0): disk space caching is enabled
[73427.938260] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0, 
flush 0, corrupt 2181, gen 0
[73429.172707] BTRFS critical (device md0): corrupt leaf: root=2 
block=1894009225216 slot=30, unexpected item end, have 146075 expect 15003
[73429.176628] BTRFS critical (device md0): corrupt leaf: root=2 
block=1894009225216 slot=30, unexpected item end, have 146075 expect 15003
[73429.177153] BTRFS error (device md0): failed to read block groups: -5
[73429.197019] BTRFS error (device md0): open_ctree failed
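Incidentally, the two offsets in the corrupt-leaf message above ("have 146075 expect 15003") differ in exactly one bit (2^17 = 131072), which fits a single-bit-flip diagnosis; a quick arithmetic check:

```shell
# Offsets from the corrupt-leaf message: "have 146075 expect 15003"
have=146075
expect=15003
diff=$(( have ^ expect ))
echo "xor = $diff"                                        # xor = 131072 (= 2^17)
# a power of two has a single bit set, i.e. exactly one flipped bit
echo "single-bit flip: $(( (diff & (diff - 1)) == 0 ))"   # single-bit flip: 1
```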


> contain the tree block bytenr, so that we can try to fix that offending
> tree block manually.
> 
> Thanks,
> Qu

Should I try something else?
Thank you!
Nik.
--

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
       [not found]         ` <beab578a-ccaf-1ec7-c7b6-1ba9cd3743ad@avgustinov.eu>
@ 2019-04-05  7:07           ` Chris Murphy
  2019-04-05 12:07             ` Nik.
  2019-04-12 10:52             ` Nik.
  0 siblings, 2 replies; 51+ messages in thread
From: Chris Murphy @ 2019-04-05  7:07 UTC (permalink / raw)
  To: Nik.; +Cc: Chris Murphy, Jeff Mahoney, Btrfs BTRFS

On Fri, Apr 5, 2019 at 12:45 AM Nik. <btrfs@avgustinov.eu> wrote:
>
> Sorry, I forgot this. Here is the output:
>
> # btrfs-image -c 9 -ss /dev/sdj3 /mnt/b/sdj3.img
> WARNING: cannot find a hash collision for '..', generating garbage, it
> won't match indexes
>
> The new image is same size, and since it seems small to me I am
> attaching it to this mail.

What do you get for `btrfs insp dump-t -d /dev/` ?

Once I restore it, I get


$ sudo btrfs insp dump-t -d /dev/mapper/vg-nik
btrfs-progs v4.20.2
checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
bad tree block 90195087360, bytenr mismatch, want=90195087360,
have=7681037117263365436
Couldn't setup device tree
ERROR: unable to open /dev/mapper/vg-nik
$ sudo btrfs insp dump-t -r /dev/mapper/vg-nik
btrfs-progs v4.20.2
checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
bad tree block 90195087360, bytenr mismatch, want=90195087360,
have=7681037117263365436
Couldn't setup device tree
ERROR: unable to open /dev/mapper/vg-nik
$

There is a valid superblock, however. So it restored something, just
not everything; I'm not sure. Might be related to "create failed: Success"!

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-05  6:58                     ` Nik.
@ 2019-04-05  7:08                       ` Qu Wenruo
       [not found]                         ` <e9720559-eff2-e88b-12b4-81defb8c29c5@avgustinov.eu>
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-05  7:08 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1596 bytes --]

[snip]
> [73429.172707] BTRFS critical (device md0): corrupt leaf: root=2
> block=1894009225216 slot=30, unexpected item end, have 146075 expect 15003

Exactly what we need.

Then please provide the following output:

# btrfs inspect dump-tree -t chunk /dev/md0
# btrfs inspect dump-tree -t extent /dev/md0

The 2nd command would fail halfway, but it should provide enough data
for me to craft the manual fix for you.

Thanks,
Qu

> [73429.176628] BTRFS critical (device md0): corrupt leaf: root=2
> block=1894009225216 slot=30, unexpected item end, have 146075 expect 15003
> [73429.177153] BTRFS error (device md0): failed to read block groups: -5
> [73429.197019] BTRFS error (device md0): open_ctree failed
> 
> 
>> contain the tree block bytenr, so that we can try to fix that offending
>> tree block manually.
>>
>> Thanks,
>> Qu
> 
> Should I try something alse?
> Thank you!
> Nik.
> -- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
       [not found]                         ` <e9720559-eff2-e88b-12b4-81defb8c29c5@avgustinov.eu>
@ 2019-04-05  8:15                           ` Qu Wenruo
  2019-04-05 19:38                             ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-05  8:15 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1472 bytes --]



On 2019/4/5 下午3:41, Nik. wrote:
> 
> Below is the stderr of both commands:
> 
> # btrfs inspect dump-tree -t chunk /dev/md0>DT-chunk.log
> # btrfs inspect dump-tree -t extent /dev/md0>DT-extent.log
> ERROR: leaf 1894009225216 slot 30 pointer invalid, offset 146038 size 37
> leaf data limit 16283
> ERROR: skip remaining slots
> 
> Since the output on stdout is pretty long even after gzip, I am
> providing only the output of the first command as attachment. The output
> of the second command (25 MB after gzip -9) can be downloaded here:
> 
> https://cloud.avgustinov.eu/index.php/s/AgbwWyCrbYjenq8

Sorry, I should have used a more specific command to get a smaller output.
But anyway, your output is good enough for me to craft the fix patch.

Here is the dirty fix branch:
https://github.com/adam900710/btrfs-progs/tree/dirty_fix_for_nik

Compile the btrfs-progs as usual.
Just a late hint: you can disable the documentation and btrfs-convert to
reduce the dependencies:
$ ./configure --disable-documentation --disable-convert

Then, inside btrfs-progs directory, call:
# ./btrfs-corrupt-block -X /dev/md0

If everything goes correctly, it should output something like:
  Successfully repaired tree block at 1894009225216
(And please ignore any grammar error in my code)

After that, please run a "btrfs check --readonly" to ensure there are no
other bit flips in your fs.
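The post-repair sequence described above can be sketched as follows (the device path /dev/md0 is the one from this thread; the guard skips everything where it does not exist):

```shell
# Sketch: verify the filesystem after the manual tree-block repair.
DEV=/dev/md0             # device from this thread; adjust as needed
MNT=/mnt/md0
if [ -b "$DEV" ]; then
    # read-only check first; avoid --repair until the check is clean
    btrfs check --readonly "$DEV"
    # if the check passes, a read-only mount is the next safe step
    mount -t btrfs -o ro "$DEV" "$MNT"
else
    echo "device $DEV not present; skipping"
fi
```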

Thanks,
Qu



> 
> Hope this is ok.
> 
> Regards,
> Nik.
> -


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-05  7:07           ` Chris Murphy
@ 2019-04-05 12:07             ` Nik.
  2019-04-12 10:52             ` Nik.
  1 sibling, 0 replies; 51+ messages in thread
From: Nik. @ 2019-04-05 12:07 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Jeff Mahoney, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 1603 bytes --]



2019-04-05 09:07, Chris Murphy:
> On Fri, Apr 5, 2019 at 12:45 AM Nik. <btrfs@avgustinov.eu> wrote:
>>
>> Sorry, I forgot this. Hier is the output:
>>
>> # btrfs-image -c 9 -ss /dev/sdj3 /mnt/b/sdj3.img
>> WARNING: cannot find a hash collision for '..', generating garbage, it
>> won't match indexes
>>
>> The new image is same size, and since it seems small to me I am
>> attaching it to this mail.
> 
> What do you get for `btrfs insp dump-t -d /dev/` ?

The output is long, so I have gzip-ed and attached the stdout of the 
command (no errors).

> Once I restore it, I get
> 
> 
> $ sudo btrfs insp dump-t -d /dev/mapper/vg-nik
> btrfs-progs v4.20.2
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> bad tree block 90195087360, bytenr mismatch, want=90195087360,
> have=7681037117263365436
> Couldn't setup device tree
> ERROR: unable to open /dev/mapper/vg-nik
> $ sudo btrfs insp dump-t -r /dev/mapper/vg-nik
> btrfs-progs v4.20.2
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> bad tree block 90195087360, bytenr mismatch, want=90195087360,
> have=7681037117263365436
> Couldn't setup device tree
> ERROR: unable to open /dev/mapper/vg-nik
> $
> 
> There is a valid superblock however. So it restored something, just
> not everything, not sure. Might be related to create failed success!

This does not "ring a bell"; I do not understand it.
Should I try something else? Tell me.

Kind regards,

Nik.
--

[-- Attachment #2: sdj3_dump-tree.log.gz --]
[-- Type: application/gzip, Size: 17317 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-05  8:15                           ` Qu Wenruo
@ 2019-04-05 19:38                             ` Nik.
  2019-04-06  0:03                               ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-05 19:38 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-05 10:15, Qu Wenruo:
> 
> 
> [snip]
> 
> Sorry, I should have used a more specific command to get a smaller output.
> But anyway, your output is good enough for me to craft the fix patch.
> 
> Here is the dirty fix branch:
> https://github.com/adam900710/btrfs-progs/tree/dirty_fix_for_nik
> 
> Compile the btrfs-progs as usual.
> Just a late hint, you can disable document/btrfs-convert to reduce the
> dependency:
> $ ./configure --disable-documentation --disable-convert
> 
> Then, inside btrfs-progs directory, call:
> # ./btrfs-corrupt-block -X /dev/md0
incorrect offsets 15003 146075
Open ctree failed

Actually there was one warning during make; I don't know if it is relevant:
     [CC]     check/main.o
check/main.c: In function ‘try_repair_inode’:
check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
   if (!ret) {
      ^
check/main.c:2666:6: note: ‘ret’ was declared here
   int ret;
       ^~~

The previous steps were as follows (output omitted, since nothing
unexpected happened):
#git clone --single-branch -v -b dirty_fix_for_nik 
https://github.com/adam900710/btrfs-progs.git
#cd btrfs-progs/
#./autogen.sh
#./configure --disable-documentation --disable-convert
#make

Did I get the right branch? Or did I miss a step?

Kind regards,
Nik.
--

> If everything goes correctly, it should output something like:
>    Successfully repaired tree block at 1894009225216
> (And please ignore any grammar error in my code)
> 
> After that, please run a "btrfs check --readonly" to ensure no other bit
> flip in your fs.
> 
> Thanks,
> Qu
> 
> 
> 
>>
>> Hope this is ok.
>>
>> Regards,
>> Nik.
>> -
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-05 19:38                             ` Nik.
@ 2019-04-06  0:03                               ` Qu Wenruo
  2019-04-06  7:16                                 ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-06  0:03 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 2702 bytes --]



On 2019/4/6 上午3:38, Nik. wrote:
> 
> 
> 2019-04-05 10:15, Qu Wenruo:
>>
>>
>> [snip]
>> Then, inside btrfs-progs directory, call:
>> # ./btrfs-corrupt-block -X /dev/md0
> incorrect offsets 15003 146075
> Open ctree failed

Oh, I forgot it's in the extent tree, which may need to be read out at
mount time.

A new flag can handle it.

The branch is updated, please check.

Thanks,
Qu

> 
> [snip]


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06  0:03                               ` Qu Wenruo
@ 2019-04-06  7:16                                 ` Nik.
  2019-04-06  7:45                                   ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-06  7:16 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-06 02:03, Qu Wenruo:
> 
> 
> [snip]
> 
> Oh, I forgot it's in extent tree, which may need to be read out at mount
> time.
> 
> Just a new flag can handle it.
> 
> The branch is updated, please check.

New output:
Successfully repair tree block at 1894009225216

# mount -t btrfs -o ro /dev/md0 /mnt/md0/
mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0, 
missing codepage or helper program, or other error.

# dmesg|tail
...
[34848.784117] BTRFS info (device md0): disk space caching is enabled
[34848.954741] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0, 
flush 0, corrupt 2181, gen 0
[34850.150789] BTRFS critical (device md0): corrupt leaf: root=1 
block=1894009225216 slot=30, unexpected item end, have 131072 expect 15003
[34850.151209] BTRFS error (device md0): failed to read block groups: -5
[34850.196156] BTRFS error (device md0): open_ctree failed

It seems that there is improvement...

Thank you.
Nik.
--

> Thanks,
> Qu
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06  7:16                                 ` Nik.
@ 2019-04-06  7:45                                   ` Qu Wenruo
  2019-04-06  8:44                                     ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-06  7:45 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 3960 bytes --]



On 2019/4/6 下午3:16, Nik. wrote:
> 
> 
> 2019-04-06 02:03, Qu Wenruo:
>>
>>
>> On 2019/4/6 上午3:38, Nik. wrote:
>>>
>>>
>>> 2019-04-05 10:15, Qu Wenruo:
>>>>
>>>>
>>>> On 2019/4/5 下午3:41, Nik. wrote:
>>>>>
>>>>> Below is the stderr of both commands:
>>>>>
>>>>> # btrfs inspect dump-tree -t chunk /dev/md0>DT-chunk.log
>>>>> # btrfs inspect dump-tree -t extent /dev/md0>DT-extent.log
>>>>> ERROR: leaf 1894009225216 slot 30 pointer invalid, offset 146038
>>>>> size 37
>>>>> leaf data limit 16283
>>>>> ERROR: skip remaining slots
>>>>>
>>>>> Since the output on stdout is pretty long even after gzip, I am
>>>>> providing only the output of the first command as attachment. The
>>>>> output
>>>>> of the second command (25 MB after gzip -9) can be downloaded here:
>>>>>
>>>>> https://cloud.avgustinov.eu/index.php/s/AgbwWyCrbYjenq8
>>>>
>>>> Sorry, I should have use a more specific command to get a smaller
>>>> output.
>>>> But anyway, your output is good enough for me to craft the fix patch.
>>>>
>>>> Here is the dirty fix branch:
>>>> https://github.com/adam900710/btrfs-progs/tree/dirty_fix_for_nik
>>>>
>>>> Compile the btrfs-progs as usual.
>>>> Just a late hint, you can disable document/btrfs-convert to reduce the
>>>> dependency:
>>>> $ ./configure --disable-documentation --disable-convert
>>>>
>>>> Then, inside btrfs-progs directory, call:
>>>> # ./btrfs-corrupt-block -X /dev/md0
>>> incorrect offsets 15003 146075
>>> Open ctree failed
>>
>> Oh, I forgot it's in extent tree, which may need to be read out at mount
>> time.
>>
>> Just a new flag can handle it.
>>
>> The branch is updated, please check.
> 
> New output:
> Successfully repair tree block at 1894009225216
> 
> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
> missing codepage or helper program, or other error.
> 
> # dmesg|tail
> ...
> [34848.784117] BTRFS info (device md0): disk space caching is enabled
> [34848.954741] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
> flush 0, corrupt 2181, gen 0
> [34850.150789] BTRFS critical (device md0): corrupt leaf: root=1
> block=1894009225216 slot=30, unexpected item end, have 131072 expect 15003
> [34850.151209] BTRFS error (device md0): failed to read block groups: -5
> [34850.196156] BTRFS error (device md0): open_ctree failed
> 
> It seems that there is improvement...

Debug info added.

Please try again, and sorry for the inconvenience. Hopefully this is
the last try.

Thanks,
Qu
> 
> Thank you.
> Nik.
> -- 
> 
>> Thanks,
>> Qu
>>
>>>
>>> Actually there was one warning during make, I don't know of it is
>>> relevant:
>>>      [CC]     check/main.o
>>> check/main.c: In function ‘try_repair_inode’:
>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in this
>>> function [-Wmaybe-uninitialized]
>>>    if (!ret) {
>>>       ^
>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>    int ret;
>>>        ^~~
>>>
>>> The previous steps were as follow (output ommited, since nothing
>>> unexpected happened):
>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>> https://github.com/adam900710/btrfs-progs.git
>>> #cd btrfs-progs/
>>> #./autogen.sh
>>> #./configure --disable-documentation --disable-convert
>>> #make
>>>
>>> Did I got the right branch? Or miss any step?
>>>
>>> Kind regards,
>>> Nik.
>>> -- 
>>>
>>>> If everything goes correctly, it should output something like:
>>>>     Successfully repaired tree block at 1894009225216
>>>> (And please ignore any grammar error in my code)
>>>>
>>>> After that, please run a "btrfs check --readonly" to ensure no other
>>>> bit
>>>> flip in your fs.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>
>>>>
>>>>>
>>>>> Hope this is ok.
>>>>>
>>>>> Regards,
>>>>> Nik.
>>>>> -
>>>>
>>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06  7:45                                   ` Qu Wenruo
@ 2019-04-06  8:44                                     ` Nik.
  2019-04-06  9:06                                       ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-06  8:44 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-06 09:45, Qu Wenruo:
> 
> 
> On 2019/4/6 下午3:16, Nik. wrote:
>>
>>
>> 2019-04-06 02:03, Qu Wenruo:
>>>
>>>
>>> On 2019/4/6 上午3:38, Nik. wrote:
>>>>
>>>>
>>>> 2019-04-05 10:15, Qu Wenruo:
>>>>>
>>>>>
>>>>> On 2019/4/5 下午3:41, Nik. wrote:
>>>>>>
>>>>>> Below is the stderr of both commands:
>>>>>>
>>>>>> # btrfs inspect dump-tree -t chunk /dev/md0>DT-chunk.log
>>>>>> # btrfs inspect dump-tree -t extent /dev/md0>DT-extent.log
>>>>>> ERROR: leaf 1894009225216 slot 30 pointer invalid, offset 146038
>>>>>> size 37
>>>>>> leaf data limit 16283
>>>>>> ERROR: skip remaining slots
>>>>>>
>>>>>> Since the output on stdout is pretty long even after gzip, I am
>>>>>> providing only the output of the first command as attachment. The
>>>>>> output
>>>>>> of the second command (25 MB after gzip -9) can be downloaded here:
>>>>>>
>>>>>> https://cloud.avgustinov.eu/index.php/s/AgbwWyCrbYjenq8
>>>>>
>>>>> Sorry, I should have use a more specific command to get a smaller
>>>>> output.
>>>>> But anyway, your output is good enough for me to craft the fix patch.
>>>>>
>>>>> Here is the dirty fix branch:
>>>>> https://github.com/adam900710/btrfs-progs/tree/dirty_fix_for_nik
>>>>>
>>>>> Compile the btrfs-progs as usual.
>>>>> Just a late hint, you can disable document/btrfs-convert to reduce the
>>>>> dependency:
>>>>> $ ./configure --disable-documentation --disable-convert
>>>>>
>>>>> Then, inside btrfs-progs directory, call:
>>>>> # ./btrfs-corrupt-block -X /dev/md0
>>>> incorrect offsets 15003 146075
>>>> Open ctree failed
>>>
>>> Oh, I forgot it's in extent tree, which may need to be read out at mount
>>> time.
>>>
>>> Just a new flag can handle it.
>>>
>>> The branch is updated, please check.
>>
>> New output:
>> Successfully repair tree block at 1894009225216
>>
>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>> missing codepage or helper program, or other error.
>>
>> # dmesg|tail
>> ...
>> [34848.784117] BTRFS info (device md0): disk space caching is enabled
>> [34848.954741] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>> flush 0, corrupt 2181, gen 0
>> [34850.150789] BTRFS critical (device md0): corrupt leaf: root=1
>> block=1894009225216 slot=30, unexpected item end, have 131072 expect 15003
>> [34850.151209] BTRFS error (device md0): failed to read block groups: -5
>> [34850.196156] BTRFS error (device md0): open_ctree failed
>>
>> It seems that there is improvement...
> 
> Debug info added.
> 
> Please try again, and sorry for the inconvenience. Hopes this is the
> last try.

#sudo ./btrfs-corrupt-block -X /dev/md0
old offset=131072 len=0
new offset=0 len=0
Successfully repair tree block at 1894009225216
# mount -t btrfs -o ro /dev/md0 /mnt/md0/
mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0, 
missing codepage or helper program, or other error.
root@bach:~# dmesg|tail
...
[39342.860715] BTRFS info (device md0): disk space caching is enabled
[39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0, 
flush 0, corrupt 2181, gen 0
[39344.197411] BTRFS critical (device md0): corrupt leaf: root=1 
block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
[39344.197915] BTRFS error (device md0): failed to read block groups: -5
[39344.248137] BTRFS error (device md0): open_ctree failed

Sorry, I forgot to mention: this and the previous attempt were with 
kernel 4.15.0-47-generic. My Ubuntu 18.04 LTS is having enormous 
problems with kernel 5.0.2 - very long boot; network, login and other 
services cycling through "start, timeout, fail, stop" again and again, 
etc. If kernel 5 is important, I will need time to get it right (maybe 
even assistance from another(?) developer group).
Actually, with 5.0.2 each boot sends me an email about an empty and not 
automatically mounted btrfs filesystem with a raid1 profile, consisting 
of two devices (sdb and sdi):

kernel: [    9.625619] BTRFS: device fsid 
05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi

Scrubbing it finishes almost immediately (see below), but during the 
next boot the email comes again:

#btrfs scrub status /mnt/b
scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
         scrub started at Sat Apr  6 10:42:15 2019 and finished after 
00:00:00
         total bytes scrubbed: 1.51MiB with 0 errors

Should I be worried about it?

Kind regards,
Nik.
--

> Thanks,
> Qu
>>
>> Thank you.
>> Nik.
>> -- 
>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> Actually there was one warning during make, I don't know of it is
>>>> relevant:
>>>>       [CC]     check/main.o
>>>> check/main.c: In function ‘try_repair_inode’:
>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in this
>>>> function [-Wmaybe-uninitialized]
>>>>     if (!ret) {
>>>>        ^
>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>     int ret;
>>>>         ^~~
>>>>
>>>> The previous steps were as follow (output ommited, since nothing
>>>> unexpected happened):
>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>> https://github.com/adam900710/btrfs-progs.git
>>>> #cd btrfs-progs/
>>>> #./autogen.sh
>>>> #./configure --disable-documentation --disable-convert
>>>> #make
>>>>
>>>> Did I got the right branch? Or miss any step?
>>>>
>>>> Kind regards,
>>>> Nik.
>>>> -- 
>>>>
>>>>> If everything goes correctly, it should output something like:
>>>>>      Successfully repaired tree block at 1894009225216
>>>>> (And please ignore any grammar error in my code)
>>>>>
>>>>> After that, please run a "btrfs check --readonly" to ensure no other
>>>>> bit
>>>>> flip in your fs.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Hope this is ok.
>>>>>>
>>>>>> Regards,
>>>>>> Nik.
>>>>>> -
>>>>>
>>>
> 


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06  8:44                                     ` Nik.
@ 2019-04-06  9:06                                       ` Qu Wenruo
  2019-04-06 13:20                                         ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-06  9:06 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 4102 bytes --]

>>
>> Please try again, and sorry for the inconvenience. Hopes this is the
>> last try.
> 
> #sudo ./btrfs-corrupt-block -X /dev/md0
> old offset=131072 len=0
> new offset=0 len=0

My bad, the first fix was bad, leading to the bad result.

(And that's why we need to review patches)

Fortunately we have everything we need to manually set the value, no
more magic.

The only uncertain part is the size.
If mount still fails, dmesg will tell me the size I need.
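As an aside, the reported numbers are consistent with a single flipped bit in the on-disk item offset (this interpretation is an editor's inference from the values in the thread, not something stated in it): the invalid item end from the dump-tree error (offset 146038 + size 37 = 146075) differs from the expected end 15003 by exactly 2^17 = 131072. A quick arithmetic sketch:

```shell
# Values taken from the dump-tree error and the kernel message above:
bad_off=146038   # invalid item offset reported for slot 30
size=37          # item size reported for slot 30
want_end=15003   # item end the kernel expected

bad_end=$(( bad_off + size ))
echo "bad item end:   $bad_end"                       # 146075
echo "difference:     $(( bad_end - want_end ))"      # 131072 = 2^17
# Clearing bit 17 of the bad item end recovers the expected value,
# consistent with a single bit flip on disk:
echo "bit 17 cleared: $(( bad_end ^ (1 << 17) ))"     # 15003
```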


> Successfully repair tree block at 1894009225216
> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
> missing codepage or helper program, or other error.
> root@bach:~# dmesg|tail
> ...
> [39342.860715] BTRFS info (device md0): disk space caching is enabled
> [39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
> flush 0, corrupt 2181, gen 0
> [39344.197411] BTRFS critical (device md0): corrupt leaf: root=1
> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
> [39344.197915] BTRFS error (device md0): failed to read block groups: -5
> [39344.248137] BTRFS error (device md0): open_ctree failed
> 
> Sorry, I forgot to tell: this and previous attempt were with kernel
> 4.15.0-47-generic.

As long as it can output the above message, the kernel version doesn't
make much difference.


> My Ubuntu 18.04 LTS is having enormous problems with
> Kernel 5.0.2 - very long boot; network, login and other services cycling
> trough "start, timeout, fail, stop" again and again, etc. If kernel 5 is
> important I will need time to get it right (maybe even assistance from
> another(?) developer group).
> Actually with 5.0.2 each boot sends me an email about an empty and not
> automatically mounted btrfs filesystem with raid1 profile, consisting
> from two devices (sdb and sdi):
> 
> kernel: [    9.625619] BTRFS: device fsid
> 05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi
> 
> Scrubbing it finishes almost immediately (see below), but during next
> boot the email comes again:
> 
> #btrfs scrub status /mnt/b
> scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
>         scrub started at Sat Apr  6 10:42:15 2019 and finished after
> 00:00:00
>         total bytes scrubbed: 1.51MiB with 0 errors
> 
> Should I be worried about it?

You could try btrfs check --readonly and see what's going on.
If btrfs check --readonly is OK, then it should be mostly OK.

Thanks,
Qu


> 
> Kind regards,
> Nik.
> -- 
> 
>> Thanks,
>> Qu
>>>
>>> Thank you.
>>> Nik.
>>> -- 
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Actually there was one warning during make, I don't know of it is
>>>>> relevant:
>>>>>       [CC]     check/main.o
>>>>> check/main.c: In function ‘try_repair_inode’:
>>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in this
>>>>> function [-Wmaybe-uninitialized]
>>>>>     if (!ret) {
>>>>>        ^
>>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>>     int ret;
>>>>>         ^~~
>>>>>
>>>>> The previous steps were as follow (output ommited, since nothing
>>>>> unexpected happened):
>>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>>> https://github.com/adam900710/btrfs-progs.git
>>>>> #cd btrfs-progs/
>>>>> #./autogen.sh
>>>>> #./configure --disable-documentation --disable-convert
>>>>> #make
>>>>>
>>>>> Did I got the right branch? Or miss any step?
>>>>>
>>>>> Kind regards,
>>>>> Nik.
>>>>> -- 
>>>>>
>>>>>> If everything goes correctly, it should output something like:
>>>>>>      Successfully repaired tree block at 1894009225216
>>>>>> (And please ignore any grammar error in my code)
>>>>>>
>>>>>> After that, please run a "btrfs check --readonly" to ensure no other
>>>>>> bit
>>>>>> flip in your fs.
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Hope this is ok.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nik.
>>>>>>> -
>>>>>>
>>>>
>>




* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06  9:06                                       ` Qu Wenruo
@ 2019-04-06 13:20                                         ` Nik.
  2019-04-06 13:22                                           ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-06 13:20 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-06 11:06, Qu Wenruo:
>>>
>>> Please try again, and sorry for the inconvenience. Hopes this is the
>>> last try.
>>
>> #sudo ./btrfs-corrupt-block -X /dev/md0
>> old offset=131072 len=0
>> new offset=0 len=0
> 
> My bad, the first fix is bad, leading the bad result.
> 
> (And that's why we need to review patches)
> 
> Fortunately we have everything we need to manually set the value, no
> magic any more.

So I guess the next steps are: git fetch, make, and run the above two 
commands again:

#git fetch
 From https://github.com/adam900710/btrfs-progs
  + c7bfe8cc...a8c26abd dirty_fix_for_nik -> origin/dirty_fix_for_nik 
(forced update)
#make
     [PY]     libbtrfsutil

#./btrfs-corrupt-block -X /dev/md0
old offset=0 len=0
new offset=0 len=0
Successfully repair tree block at 1894009225216

# mount -t btrfs -o ro /dev/md0 /mnt/md0/
mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0, 
missing codepage or helper program, or other error.

# dmesg|tail
...
[56146.672395] BTRFS info (device md0): disk space caching is enabled
[56146.841632] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0, 
flush 0, corrupt 2181, gen 0
[56148.097242] BTRFS critical (device md0): corrupt leaf: root=1 
block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
[56148.097583] BTRFS error (device md0): failed to read block groups: -5
[56148.140137] BTRFS error (device md0): open_ctree failed

If the above steps were wrong, please correct me!

> The only uncertain part is the size.
> If mount still fails, dmesg will tell me the size I need.
> 
> 
>> Successfully repair tree block at 1894009225216
>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>> missing codepage or helper program, or other error.
>> root@bach:~# dmesg|tail
>> ...
>> [39342.860715] BTRFS info (device md0): disk space caching is enabled
>> [39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>> flush 0, corrupt 2181, gen 0
>> [39344.197411] BTRFS critical (device md0): corrupt leaf: root=1
>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>> [39344.197915] BTRFS error (device md0): failed to read block groups: -5
>> [39344.248137] BTRFS error (device md0): open_ctree failed
>>
>> Sorry, I forgot to tell: this and previous attempt were with kernel
>> 4.15.0-47-generic.
> 
> As long as it can output above message, the kernel version doesn't make
> much difference.
> 
> 
>> My Ubuntu 18.04 LTS is having enormous problems with
>> Kernel 5.0.2 - very long boot; network, login and other services cycling
>> trough "start, timeout, fail, stop" again and again, etc. If kernel 5 is
>> important I will need time to get it right (maybe even assistance from
>> another(?) developer group).
>> Actually with 5.0.2 each boot sends me an email about an empty and not
>> automatically mounted btrfs filesystem with raid1 profile, consisting
>> from two devices (sdb and sdi):
>>
>> kernel: [    9.625619] BTRFS: device fsid
>> 05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi
>>
>> Scrubbing it finishes almost immediately (see below), but during next
>> boot the email comes again:
>>
>> #btrfs scrub status /mnt/b
>> scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
>>          scrub started at Sat Apr  6 10:42:15 2019 and finished after
>> 00:00:00
>>          total bytes scrubbed: 1.51MiB with 0 errors
>>
>> Should I be worried about it?
> 
> You could try btrfs check --readonly and see what's going on.
> If btrfs check --readonly is OK, then it should be mostly OK.

Then it seems to be ok, thank you!


> Thanks,
> Qu
> 
> 
>>
>> Kind regards,
>> Nik.
>> -- 
>>
>>> Thanks,
>>> Qu
>>>>
>>>> Thank you.
>>>> Nik.
>>>> -- 
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>>
>>>>>> Actually there was one warning during make, I don't know of it is
>>>>>> relevant:
>>>>>>        [CC]     check/main.o
>>>>>> check/main.c: In function ‘try_repair_inode’:
>>>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in this
>>>>>> function [-Wmaybe-uninitialized]
>>>>>>      if (!ret) {
>>>>>>         ^
>>>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>>>      int ret;
>>>>>>          ^~~
>>>>>>
>>>>>> The previous steps were as follow (output ommited, since nothing
>>>>>> unexpected happened):
>>>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>>>> https://github.com/adam900710/btrfs-progs.git
>>>>>> #cd btrfs-progs/
>>>>>> #./autogen.sh
>>>>>> #./configure --disable-documentation --disable-convert
>>>>>> #make
>>>>>>
>>>>>> Did I got the right branch? Or miss any step?
>>>>>>
>>>>>> Kind regards,
>>>>>> Nik.
>>>>>> -- 
>>>>>>
>>>>>>> If everything goes correctly, it should output something like:
>>>>>>>       Successfully repaired tree block at 1894009225216
>>>>>>> (And please ignore any grammar error in my code)
>>>>>>>
>>>>>>> After that, please run a "btrfs check --readonly" to ensure no other
>>>>>>> bit
>>>>>>> flip in your fs.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Hope this is ok.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nik.
>>>>>>>> -
>>>>>>>
>>>>>
>>>
> 


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06 13:20                                         ` Nik.
@ 2019-04-06 13:22                                           ` Qu Wenruo
  2019-04-06 13:28                                             ` Qu Wenruo
  2019-04-06 14:19                                             ` Nik.
  0 siblings, 2 replies; 51+ messages in thread
From: Qu Wenruo @ 2019-04-06 13:22 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 5879 bytes --]



On 2019/4/6 下午9:20, Nik. wrote:
> 
> 
> 2019-04-06 11:06, Qu Wenruo:
>>>>
>>>> Please try again, and sorry for the inconvenience. Hopes this is the
>>>> last try.
>>>
>>> #sudo ./btrfs-corrupt-block -X /dev/md0
>>> old offset=131072 len=0
>>> new offset=0 len=0
>>
>> My bad, the first fix is bad, leading the bad result.
>>
>> (And that's why we need to review patches)
>>
>> Fortunately we have everything we need to manually set the value, no
>> magic any more.
> 
> So I gues the next steps were git fetch, make and run again the above
> two commands:
> 
> #git fetch
> From https://github.com/adam900710/btrfs-progs
>  + c7bfe8cc...a8c26abd dirty_fix_for_nik -> origin/dirty_fix_for_nik
> (forced update)

It looks like you haven't checked out the correct branch.

You could use the command 'git checkout origin/dirty_fix_for_nik' to
switch to the latest branch.
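The fetch-vs-checkout point can be illustrated with a small local sketch (the repository and branch names here are stand-ins for adam900710/btrfs-progs and dirty_fix_for_nik, not the real ones): 'git fetch' updates the remote-tracking ref, but the working tree stays on the old commit until you check the new one out.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=n GIT_AUTHOR_EMAIL=n@example.com \
       GIT_COMMITTER_NAME=n GIT_COMMITTER_EMAIL=n@example.com

# Stand-in "upstream" repository with a fix branch (hypothetical names).
git init -q upstream
git -C upstream checkout -q -b dirty_fix
git -C upstream commit -q --allow-empty -m "fix v1"

# Clone only that branch, as was done earlier in the thread.
git clone -q --single-branch -b dirty_fix upstream work

# Upstream rewrites the branch (the "(forced update)" in the fetch output).
git -C upstream commit -q --amend --allow-empty -m "fix v2"

# 'git fetch' updates origin/dirty_fix but leaves the working tree alone:
git -C work fetch -q
git -C work log -1 --format="after fetch, worktree at: %s"     # fix v1

# Only checking out the remote-tracking branch picks up the new commit:
git -C work checkout -q origin/dirty_fix
git -C work log -1 --format="after checkout, worktree at: %s"  # fix v2
```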

Thanks,
Qu

> #make
>     [PY]     libbtrfsutil
> 
> #./btrfs-corrupt-block -X /dev/md0
> old offset=0 len=0
> new offset=0 len=0
> Successfully repair tree block at 1894009225216
> 
> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
> missing codepage or helper program, or other error.
> 
> # dmesg|tail
> ...
> [56146.672395] BTRFS info (device md0): disk space caching is enabled
> [56146.841632] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
> flush 0, corrupt 2181, gen 0
> [56148.097242] BTRFS critical (device md0): corrupt leaf: root=1
> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
> [56148.097583] BTRFS error (device md0): failed to read block groups: -5
> [56148.140137] BTRFS error (device md0): open_ctree failed
> 
> If the above steps were wrong - please, correct!
> 
>> The only uncertain part is the size.
>> If mount still fails, dmesg will tell me the size I need.
>>
>>
>>> Successfully repair tree block at 1894009225216
>>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>>> missing codepage or helper program, or other error.
>>> root@bach:~# dmesg|tail
>>> ...
>>> [39342.860715] BTRFS info (device md0): disk space caching is enabled
>>> [39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>>> flush 0, corrupt 2181, gen 0
>>> [39344.197411] BTRFS critical (device md0): corrupt leaf: root=1
>>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>>> [39344.197915] BTRFS error (device md0): failed to read block groups: -5
>>> [39344.248137] BTRFS error (device md0): open_ctree failed
>>>
>>> Sorry, I forgot to tell: this and previous attempt were with kernel
>>> 4.15.0-47-generic.
>>
>> As long as it can output above message, the kernel version doesn't make
>> much difference.
>>
>>
>>> My Ubuntu 18.04 LTS is having enormous problems with
>>> Kernel 5.0.2 - very long boot; network, login and other services cycling
>>> trough "start, timeout, fail, stop" again and again, etc. If kernel 5 is
>>> important I will need time to get it right (maybe even assistance from
>>> another(?) developer group).
>>> Actually with 5.0.2 each boot sends me an email about an empty and not
>>> automatically mounted btrfs filesystem with raid1 profile, consisting
>>> from two devices (sdb and sdi):
>>>
>>> kernel: [    9.625619] BTRFS: device fsid
>>> 05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi
>>>
>>> Scrubbing it finishes almost immediately (see below), but during next
>>> boot the email comes again:
>>>
>>> #btrfs scrub status /mnt/b
>>> scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
>>>          scrub started at Sat Apr  6 10:42:15 2019 and finished after
>>> 00:00:00
>>>          total bytes scrubbed: 1.51MiB with 0 errors
>>>
>>> Should I be worried about it?
>>
>> You could try btrfs check --readonly and see what's going on.
>> If btrfs check --readonly is OK, then it should be mostly OK.
> 
> Then it seems to be ok, thank you!
> 
> 
>> Thanks,
>> Qu
>>
>>
>>>
>>> Kind regards,
>>> Nik.
>>> -- 
>>>
>>>> Thanks,
>>>> Qu
>>>>>
>>>>> Thank you.
>>>>> Nik.
>>>>> -- 
>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>>
>>>>>>> Actually there was one warning during make, I don't know of it is
>>>>>>> relevant:
>>>>>>>        [CC]     check/main.o
>>>>>>> check/main.c: In function ‘try_repair_inode’:
>>>>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in
>>>>>>> this
>>>>>>> function [-Wmaybe-uninitialized]
>>>>>>>      if (!ret) {
>>>>>>>         ^
>>>>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>>>>      int ret;
>>>>>>>          ^~~
>>>>>>>
>>>>>>> The previous steps were as follow (output ommited, since nothing
>>>>>>> unexpected happened):
>>>>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>>>>> https://github.com/adam900710/btrfs-progs.git
>>>>>>> #cd btrfs-progs/
>>>>>>> #./autogen.sh
>>>>>>> #./configure --disable-documentation --disable-convert
>>>>>>> #make
>>>>>>>
>>>>>>> Did I got the right branch? Or miss any step?
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Nik.
>>>>>>> -- 
>>>>>>>
>>>>>>>> If everything goes correctly, it should output something like:
>>>>>>>>       Successfully repaired tree block at 1894009225216
>>>>>>>> (And please ignore any grammar error in my code)
>>>>>>>>
>>>>>>>> After that, please run a "btrfs check --readonly" to ensure no
>>>>>>>> other
>>>>>>>> bit
>>>>>>>> flip in your fs.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Qu
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hope this is ok.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nik.
>>>>>>>>> -
>>>>>>>>
>>>>>>
>>>>
>>




* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06 13:22                                           ` Qu Wenruo
@ 2019-04-06 13:28                                             ` Qu Wenruo
  2019-04-06 14:19                                             ` Nik.
  1 sibling, 0 replies; 51+ messages in thread
From: Qu Wenruo @ 2019-04-06 13:28 UTC (permalink / raw)
  To: Nik., linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 6204 bytes --]



On 2019/4/6 下午9:22, Qu Wenruo wrote:
> 
> 
> On 2019/4/6 下午9:20, Nik. wrote:
>>
>>
>> 2019-04-06 11:06, Qu Wenruo:
>>>>>
>>>>> Please try again, and sorry for the inconvenience. Hopes this is the
>>>>> last try.
>>>>
>>>> #sudo ./btrfs-corrupt-block -X /dev/md0
>>>> old offset=131072 len=0
>>>> new offset=0 len=0
>>>
>>> My bad, the first fix is bad, leading the bad result.
>>>
>>> (And that's why we need to review patches)
>>>
>>> Fortunately we have everything we need to manually set the value, no
>>> magic any more.
>>
>> So I gues the next steps were git fetch, make and run again the above
>> two commands:
>>
>> #git fetch
>> From https://github.com/adam900710/btrfs-progs
>>  + c7bfe8cc...a8c26abd dirty_fix_for_nik -> origin/dirty_fix_for_nik
>> (forced update)
> 
> It looks like you haven't checked out to the correct branch.
> 
> You could use command 'git checkout origin/dirty_fix_for_nik' to change
> to the latest branch.

BTW, you could combine the fetch + checkout into a single 'git pull'.

Thanks,
Qu

> 
> Thanks,
> Qu
> 
>> #make
>>     [PY]     libbtrfsutil
>>
>> #./btrfs-corrupt-block -X /dev/md0
>> old offset=0 len=0
>> new offset=0 len=0
>> Successfully repair tree block at 1894009225216
>>
>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>> missing codepage or helper program, or other error.
>>
>> # dmesg|tail
>> ...
>> [56146.672395] BTRFS info (device md0): disk space caching is enabled
>> [56146.841632] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>> flush 0, corrupt 2181, gen 0
>> [56148.097242] BTRFS critical (device md0): corrupt leaf: root=1
>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>> [56148.097583] BTRFS error (device md0): failed to read block groups: -5
>> [56148.140137] BTRFS error (device md0): open_ctree failed
>>
>> If the above steps were wrong - please, correct!
>>
>>> The only uncertain part is the size.
>>> If mount still fails, dmesg will tell me the size I need.
>>>
>>>
>>>> Successfully repair tree block at 1894009225216
>>>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>>>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>>>> missing codepage or helper program, or other error.
>>>> root@bach:~# dmesg|tail
>>>> ...
>>>> [39342.860715] BTRFS info (device md0): disk space caching is enabled
>>>> [39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>>>> flush 0, corrupt 2181, gen 0
>>>> [39344.197411] BTRFS critical (device md0): corrupt leaf: root=1
>>>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>>>> [39344.197915] BTRFS error (device md0): failed to read block groups: -5
>>>> [39344.248137] BTRFS error (device md0): open_ctree failed
>>>>
>>>> Sorry, I forgot to tell: this and previous attempt were with kernel
>>>> 4.15.0-47-generic.
>>>
>>> As long as it can output above message, the kernel version doesn't make
>>> much difference.
>>>
>>>
>>>> My Ubuntu 18.04 LTS is having enormous problems with
>>>> Kernel 5.0.2 - very long boot; network, login and other services cycling
>>>> trough "start, timeout, fail, stop" again and again, etc. If kernel 5 is
>>>> important I will need time to get it right (maybe even assistance from
>>>> another(?) developer group).
>>>> Actually with 5.0.2 each boot sends me an email about an empty and not
>>>> automatically mounted btrfs filesystem with raid1 profile, consisting
>>>> from two devices (sdb and sdi):
>>>>
>>>> kernel: [    9.625619] BTRFS: device fsid
>>>> 05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi
>>>>
>>>> Scrubbing it finishes almost immediately (see below), but during next
>>>> boot the email comes again:
>>>>
>>>> #btrfs scrub status /mnt/b
>>>> scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
>>>>          scrub started at Sat Apr  6 10:42:15 2019 and finished after
>>>> 00:00:00
>>>>          total bytes scrubbed: 1.51MiB with 0 errors
>>>>
>>>> Should I be worried about it?
>>>
>>> You could try btrfs check --readonly and see what's going on.
>>> If btrfs check --readonly is OK, then it should be mostly OK.
>>
>> Then it seems to be ok, thank you!
>>
>>
>>> Thanks,
>>> Qu
>>>
>>>
>>>>
>>>> Kind regards,
>>>> Nik.
>>>> -- 
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>>
>>>>>> Thank you.
>>>>>> Nik.
>>>>>> -- 
>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>
>>>>>>>>
>>>>>>>> Actually there was one warning during make, I don't know of it is
>>>>>>>> relevant:
>>>>>>>>        [CC]     check/main.o
>>>>>>>> check/main.c: In function ‘try_repair_inode’:
>>>>>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in
>>>>>>>> this
>>>>>>>> function [-Wmaybe-uninitialized]
>>>>>>>>      if (!ret) {
>>>>>>>>         ^
>>>>>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>>>>>      int ret;
>>>>>>>>          ^~~
>>>>>>>>
>>>>>>>> The previous steps were as follows (output omitted, since nothing
>>>>>>>> unexpected happened):
>>>>>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>>>>>> https://github.com/adam900710/btrfs-progs.git
>>>>>>>> #cd btrfs-progs/
>>>>>>>> #./autogen.sh
>>>>>>>> #./configure --disable-documentation --disable-convert
>>>>>>>> #make
>>>>>>>>
>>>>>>>> Did I get the right branch? Or miss any step?
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Nik.
>>>>>>>> -- 
>>>>>>>>
>>>>>>>>> If everything goes correctly, it should output something like:
>>>>>>>>>       Successfully repaired tree block at 1894009225216
>>>>>>>>> (And please ignore any grammar error in my code)
>>>>>>>>>
>>>>>>>>> After that, please run a "btrfs check --readonly" to ensure no
>>>>>>>>> other
>>>>>>>>> bit
>>>>>>>>> flip in your fs.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Qu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hope this is ok.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Nik.
>>>>>>>>>> -
>>>>>>>>>
>>>>>>>
>>>>>
>>>
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06 13:22                                           ` Qu Wenruo
  2019-04-06 13:28                                             ` Qu Wenruo
@ 2019-04-06 14:19                                             ` Nik.
  2019-04-06 23:18                                               ` Qu Wenruo
  1 sibling, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-06 14:19 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-06 15:22, Qu Wenruo:
> 
> 
> On 2019/4/6 9:20 PM, Nik. wrote:
>>
>>
>> 2019-04-06 11:06, Qu Wenruo:
>>>>>
>>>>> Please try again, and sorry for the inconvenience. Hope this is the
>>>>> last try.
>>>>
>>>> #sudo ./btrfs-corrupt-block -X /dev/md0
>>>> old offset=131072 len=0
>>>> new offset=0 len=0
>>>
>>> My bad, the first fix is bad, leading to the bad result.
>>>
>>> (And that's why we need to review patches)
>>>
>>> Fortunately we have everything we need to manually set the value, no
>>> magic any more.
>>
>> So I guess the next steps were git fetch, make and run again the above
>> two commands:
>>
>> #git fetch
>>  From https://github.com/adam900710/btrfs-progs
>>   + c7bfe8cc...a8c26abd dirty_fix_for_nik -> origin/dirty_fix_for_nik
>> (forced update)
> 
> It looks like you haven't checked out to the correct branch.
> 
> You could use command 'git checkout origin/dirty_fix_for_nik' to change
> to the latest branch.

Sorry about this. Once again:

#git checkout origin/dirty_fix_for_nik
HEAD is now at a8c26abd btrfs-progs: corrupt-block: Manually fix bit 
flip for Nik.
# make
     [PY]     libbtrfsutil

#./btrfs-corrupt-block -X /dev/md0
old offset=0 len=0
new offset=14966 len=37
Successfully repair tree block at 1894009225216

# mount -t btrfs -o ro /dev/md0 /mnt/md0/
mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0, 
missing codepage or helper program, or other error.

root@bach:~# dmesg|tail
...
[59138.540585] BTRFS info (device md0): disk space caching is enabled
[59138.697727] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0, 
flush 0, corrupt 2181, gen 0
[59139.944682] BTRFS critical (device md0): corrupt leaf: root=1 
block=1894009225216 slot=83, bad key order, prev (564984271564800 168 
962560) current (2034319192064 168 262144)
[59139.945109] BTRFS error (device md0): failed to read block groups: -5
[59139.984122] BTRFS error (device md0): open_ctree failed

Kind regards,
Nik.
--

> Thanks,
> Qu
> 
>> #make
>>      [PY]     libbtrfsutil
>>
>> #./btrfs-corrupt-block -X /dev/md0
>> old offset=0 len=0
>> new offset=0 len=0
>> Successfully repair tree block at 1894009225216
>>
>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>> missing codepage or helper program, or other error.
>>
>> # dmesg|tail
>> ...
>> [56146.672395] BTRFS info (device md0): disk space caching is enabled
>> [56146.841632] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>> flush 0, corrupt 2181, gen 0
>> [56148.097242] BTRFS critical (device md0): corrupt leaf: root=1
>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>> [56148.097583] BTRFS error (device md0): failed to read block groups: -5
>> [56148.140137] BTRFS error (device md0): open_ctree failed
>>
>> If the above steps were wrong - please correct me!
>>
>>> The only uncertain part is the size.
>>> If mount still fails, dmesg will tell me the size I need.
>>>
>>>
>>>> Successfully repair tree block at 1894009225216
>>>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>>>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>>>> missing codepage or helper program, or other error.
>>>> root@bach:~# dmesg|tail
>>>> ...
>>>> [39342.860715] BTRFS info (device md0): disk space caching is enabled
>>>> [39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>>>> flush 0, corrupt 2181, gen 0
>>>> [39344.197411] BTRFS critical (device md0): corrupt leaf: root=1
>>>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>>>> [39344.197915] BTRFS error (device md0): failed to read block groups: -5
>>>> [39344.248137] BTRFS error (device md0): open_ctree failed
>>>>
>>>> Sorry, I forgot to say: this and the previous attempt were with kernel
>>>> 4.15.0-47-generic.
>>>
>>> As long as it can output above message, the kernel version doesn't make
>>> much difference.
>>>
>>>
>>>> My Ubuntu 18.04 LTS is having enormous problems with
>>>> Kernel 5.0.2 - very long boot; network, login and other services cycling
>>>> through "start, timeout, fail, stop" again and again, etc. If kernel 5 is
>>>> important I will need time to get it right (maybe even assistance from
>>>> another(?) developer group).
>>>> Actually with 5.0.2 each boot sends me an email about an empty and not
>>>> automatically mounted btrfs filesystem with raid1 profile, consisting
>>>> of two devices (sdb and sdi):
>>>>
>>>> kernel: [    9.625619] BTRFS: device fsid
>>>> 05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi
>>>>
>>>> Scrubbing it finishes almost immediately (see below), but during next
>>>> boot the email comes again:
>>>>
>>>> #btrfs scrub status /mnt/b
>>>> scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
>>>>           scrub started at Sat Apr  6 10:42:15 2019 and finished after
>>>> 00:00:00
>>>>           total bytes scrubbed: 1.51MiB with 0 errors
>>>>
>>>> Should I be worried about it?
>>>
>>> You could try btrfs check --readonly and see what's going on.
>>> If btrfs check --readonly is OK, then it should be mostly OK.
>>
>> Then it seems to be ok, thank you!
>>
>>
>>> Thanks,
>>> Qu
>>>
>>>
>>>>
>>>> Kind regards,
>>>> Nik.
>>>> -- 
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>>
>>>>>> Thank you.
>>>>>> Nik.
>>>>>> -- 
>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>
>>>>>>>>
>>>>>>>> Actually there was one warning during make, I don't know if it is
>>>>>>>> relevant:
>>>>>>>>         [CC]     check/main.o
>>>>>>>> check/main.c: In function ‘try_repair_inode’:
>>>>>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in
>>>>>>>> this
>>>>>>>> function [-Wmaybe-uninitialized]
>>>>>>>>       if (!ret) {
>>>>>>>>          ^
>>>>>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>>>>>       int ret;
>>>>>>>>           ^~~
>>>>>>>>
>>>>>>>> The previous steps were as follows (output omitted, since nothing
>>>>>>>> unexpected happened):
>>>>>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>>>>>> https://github.com/adam900710/btrfs-progs.git
>>>>>>>> #cd btrfs-progs/
>>>>>>>> #./autogen.sh
>>>>>>>> #./configure --disable-documentation --disable-convert
>>>>>>>> #make
>>>>>>>>
>>>>>>>> Did I get the right branch? Or miss any step?
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Nik.
>>>>>>>> -- 
>>>>>>>>
>>>>>>>>> If everything goes correctly, it should output something like:
>>>>>>>>>        Successfully repaired tree block at 1894009225216
>>>>>>>>> (And please ignore any grammar error in my code)
>>>>>>>>>
>>>>>>>>> After that, please run a "btrfs check --readonly" to ensure no
>>>>>>>>> other
>>>>>>>>> bit
>>>>>>>>> flip in your fs.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Qu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hope this is ok.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Nik.
>>>>>>>>>> -
>>>>>>>>>
>>>>>>>
>>>>>
>>>
> 


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06 14:19                                             ` Nik.
@ 2019-04-06 23:18                                               ` Qu Wenruo
  2019-04-07  7:41                                                 ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-06 23:18 UTC (permalink / raw)
  To: Nik., linux-btrfs



On 2019/4/6 10:19 PM, Nik. wrote:
>
>
> 2019-04-06 15:22, Qu Wenruo:
>>
>>
>> On 2019/4/6 9:20 PM, Nik. wrote:
>>>
>>>
>>> 2019-04-06 11:06, Qu Wenruo:
>>>>>>
>>>>>> Please try again, and sorry for the inconvenience. Hope this is the
>>>>>> last try.
>>>>>
>>>>> #sudo ./btrfs-corrupt-block -X /dev/md0
>>>>> old offset=131072 len=0
>>>>> new offset=0 len=0
>>>>
>>>> My bad, the first fix is bad, leading to the bad result.
>>>>
>>>> (And that's why we need to review patches)
>>>>
>>>> Fortunately we have everything we need to manually set the value, no
>>>> magic any more.
>>>
>>> So I guess the next steps were git fetch, make and run again the above
>>> two commands:
>>>
>>> #git fetch
>>>  From https://github.com/adam900710/btrfs-progs
>>>   + c7bfe8cc...a8c26abd dirty_fix_for_nik -> origin/dirty_fix_for_nik
>>> (forced update)
>>
>> It looks like you haven't checked out to the correct branch.
>>
>> You could use command 'git checkout origin/dirty_fix_for_nik' to change
>> to the latest branch.
>
> Sorry about this. Once again:
>
> #git checkout origin/dirty_fix_for_nik
> HEAD is now at a8c26abd btrfs-progs: corrupt-block: Manually fix bit
> flip for Nik.
> # make
>     [PY]     libbtrfsutil
>
> #./btrfs-corrupt-block -X /dev/md0
> old offset=0 len=0
> new offset=14966 len=37
> Successfully repair tree block at 1894009225216
>
> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
> missing codepage or helper program, or other error.
>
> root@bach:~# dmesg|tail
> ...
> [59138.540585] BTRFS info (device md0): disk space caching is enabled
> [59138.697727] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
> flush 0, corrupt 2181, gen 0
> [59139.944682] BTRFS critical (device md0): corrupt leaf: root=1
> block=1894009225216 slot=83, bad key order, prev (564984271564800 168
> 962560) current (2034319192064 168 262144)

Now it's a different problem at a different slot.

slot 82 has key (0x201d9a6cf7000, 168, 962560)
slot 83 has key (0x001d9a6df7000, 168, 262144)

You have 2 bits flipped just in one tree block!
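As a sanity check (the editor's, not part of Qu's reply): the bad objectid in slot 82 differs from a value in the expected range by exactly one high bit. A small Python sketch, using the values from the dmesg line above; the bit-49 mask is inferred from the magnitude of the bad objectid, not something btrfs computes:

```python
corrupted = 564984271564800   # slot 82 objectid from the corrupt-leaf message
neighbor = 2034319192064      # slot 83 objectid, in the expected range

# A single-event upset typically flips one bit; clearing the stray high
# bit brings the objectid back into the same range as its neighbor.
bit = 1 << 49
repaired = corrupted ^ bit

print(hex(corrupted))   # 0x201d9a6cf7000
print(hex(repaired))    # 0x1d9a6cf7000 (= 2034318143488)
assert bin(corrupted ^ repaired).count("1") == 1   # exactly one flipped bit
assert repaired < neighbor                         # key order restored
```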

Anyway, I have updated the branch; please try it again.

Thanks,
Qu


> [59139.945109] BTRFS error (device md0): failed to read block groups: -5
> [59139.984122] BTRFS error (device md0): open_ctree failed
>
> Kind regards,
> Nik.
> --
>
>> Thanks,
>> Qu
>>
>>> #make
>>>      [PY]     libbtrfsutil
>>>
>>> #./btrfs-corrupt-block -X /dev/md0
>>> old offset=0 len=0
>>> new offset=0 len=0
>>> Successfully repair tree block at 1894009225216
>>>
>>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>>> missing codepage or helper program, or other error.
>>>
>>> # dmesg|tail
>>> ...
>>> [56146.672395] BTRFS info (device md0): disk space caching is enabled
>>> [56146.841632] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>>> flush 0, corrupt 2181, gen 0
>>> [56148.097242] BTRFS critical (device md0): corrupt leaf: root=1
>>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>>> [56148.097583] BTRFS error (device md0): failed to read block groups: -5
>>> [56148.140137] BTRFS error (device md0): open_ctree failed
>>>
>>> If the above steps were wrong - please correct me!
>>>
>>>> The only uncertain part is the size.
>>>> If mount still fails, dmesg will tell me the size I need.
>>>>
>>>>
>>>>> Successfully repair tree block at 1894009225216
>>>>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>>>>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on
>>>>> /dev/md0,
>>>>> missing codepage or helper program, or other error.
>>>>> root@bach:~# dmesg|tail
>>>>> ...
>>>>> [39342.860715] BTRFS info (device md0): disk space caching is enabled
>>>>> [39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0,
>>>>> rd 0,
>>>>> flush 0, corrupt 2181, gen 0
>>>>> [39344.197411] BTRFS critical (device md0): corrupt leaf: root=1
>>>>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>>>>> [39344.197915] BTRFS error (device md0): failed to read block
>>>>> groups: -5
>>>>> [39344.248137] BTRFS error (device md0): open_ctree failed
>>>>>
>>>>>> Sorry, I forgot to say: this and the previous attempt were with kernel
>>>>> 4.15.0-47-generic.
>>>>
>>>> As long as it can output above message, the kernel version doesn't make
>>>> much difference.
>>>>
>>>>
>>>>> My Ubuntu 18.04 LTS is having enormous problems with
>>>>> Kernel 5.0.2 - very long boot; network, login and other services
>>>>> cycling
>>>>> through "start, timeout, fail, stop" again and again, etc. If kernel
>>>>> 5 is
>>>>> important I will need time to get it right (maybe even assistance from
>>>>> another(?) developer group).
>>>>> Actually with 5.0.2 each boot sends me an email about an empty and not
>>>>> automatically mounted btrfs filesystem with raid1 profile, consisting
>>>>> of two devices (sdb and sdi):
>>>>>
>>>>> kernel: [    9.625619] BTRFS: device fsid
>>>>> 05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi
>>>>>
>>>>> Scrubbing it finishes almost immediately (see below), but during next
>>>>> boot the email comes again:
>>>>>
>>>>> #btrfs scrub status /mnt/b
>>>>> scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
>>>>>           scrub started at Sat Apr  6 10:42:15 2019 and finished after
>>>>> 00:00:00
>>>>>           total bytes scrubbed: 1.51MiB with 0 errors
>>>>>
>>>>> Should I be worried about it?
>>>>
>>>> You could try btrfs check --readonly and see what's going on.
>>>> If btrfs check --readonly is OK, then it should be mostly OK.
>>>
>>> Then it seems to be ok, thank you!
>>>
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>
>>>>>
>>>>> Kind regards,
>>>>> Nik.
>>>>> -- 
>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>>
>>>>>>> Thank you.
>>>>>>> Nik.
>>>>>>> -- 
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Qu
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Actually there was one warning during make, I don't know if it is
>>>>>>>>> relevant:
>>>>>>>>>         [CC]     check/main.o
>>>>>>>>> check/main.c: In function ‘try_repair_inode’:
>>>>>>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in
>>>>>>>>> this
>>>>>>>>> function [-Wmaybe-uninitialized]
>>>>>>>>>       if (!ret) {
>>>>>>>>>          ^
>>>>>>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>>>>>>       int ret;
>>>>>>>>>           ^~~
>>>>>>>>>
>>>>>>>>> The previous steps were as follows (output omitted, since nothing
>>>>>>>>> unexpected happened):
>>>>>>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>>>>>>> https://github.com/adam900710/btrfs-progs.git
>>>>>>>>> #cd btrfs-progs/
>>>>>>>>> #./autogen.sh
>>>>>>>>> #./configure --disable-documentation --disable-convert
>>>>>>>>> #make
>>>>>>>>>
>>>>>>>>> Did I get the right branch? Or miss any step?
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>> Nik.
>>>>>>>>> -- 
>>>>>>>>>
>>>>>>>>>> If everything goes correctly, it should output something like:
>>>>>>>>>>        Successfully repaired tree block at 1894009225216
>>>>>>>>>> (And please ignore any grammar error in my code)
>>>>>>>>>>
>>>>>>>>>> After that, please run a "btrfs check --readonly" to ensure no
>>>>>>>>>> other
>>>>>>>>>> bit
>>>>>>>>>> flip in your fs.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Qu
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hope this is ok.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Nik.
>>>>>>>>>>> -
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-06 23:18                                               ` Qu Wenruo
@ 2019-04-07  7:41                                                 ` Nik.
  2019-04-07 18:45                                                   ` Chris Murphy
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-07  7:41 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



2019-04-07 01:18, Qu Wenruo:
> 
> 
> On 2019/4/6 10:19 PM, Nik. wrote:
>>
>>
>> 2019-04-06 15:22, Qu Wenruo:
>>>
>>>
>>> On 2019/4/6 9:20 PM, Nik. wrote:
>>>>
>>>>
>>>> 2019-04-06 11:06, Qu Wenruo:
>>>>>>>
>>>>>>> Please try again, and sorry for the inconvenience. Hope this is the
>>>>>>> last try.
>>>>>>
>>>>>> #sudo ./btrfs-corrupt-block -X /dev/md0
>>>>>> old offset=131072 len=0
>>>>>> new offset=0 len=0
>>>>>
>>>>> My bad, the first fix is bad, leading to the bad result.
>>>>>
>>>>> (And that's why we need to review patches)
>>>>>
>>>>> Fortunately we have everything we need to manually set the value, no
>>>>> magic any more.
>>>>
>>>> So I guess the next steps were git fetch, make and run again the above
>>>> two commands:
>>>>
>>>> #git fetch
>>>>   From https://github.com/adam900710/btrfs-progs
>>>>    + c7bfe8cc...a8c26abd dirty_fix_for_nik -> origin/dirty_fix_for_nik
>>>> (forced update)
>>>
>>> It looks like you haven't checked out to the correct branch.
>>>
>>> You could use command 'git checkout origin/dirty_fix_for_nik' to change
>>> to the latest branch.
>>
>> Sorry about this. Once again:
>>
>> #git checkout origin/dirty_fix_for_nik
>> HEAD is now at a8c26abd btrfs-progs: corrupt-block: Manually fix bit
>> flip for Nik.
>> # make
>>      [PY]     libbtrfsutil
>>
>> #./btrfs-corrupt-block -X /dev/md0
>> old offset=0 len=0
>> new offset=14966 len=37
>> Successfully repair tree block at 1894009225216
>>
>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>> missing codepage or helper program, or other error.
>>
>> root@bach:~# dmesg|tail
>> ...
>> [59138.540585] BTRFS info (device md0): disk space caching is enabled
>> [59138.697727] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>> flush 0, corrupt 2181, gen 0
>> [59139.944682] BTRFS critical (device md0): corrupt leaf: root=1
>> block=1894009225216 slot=83, bad key order, prev (564984271564800 168
>> 962560) current (2034319192064 168 262144)
> 
> Now it's a different problem at different slot.
> 
> slot 82 has key (0x201d9a6cf7000, 168, 962560)
> slot 83 has key (0x001d9a6df7000, 168, 262144)
> 
> You have 2 bits flipped just in one tree block!
> 
> Anyway, I have updated the branch, and please try it again.
> 
> Thanks,
> Qu

#./btrfs-corrupt-block -X /dev/md0
old key = 564984271564800, 168, 962560
new key = 2034318143488, 168, 962560
Successfully repair tree block at 1894009225216

# mount -t btrfs -o ro /dev/md0 /mnt/md0/
mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0, 
missing codepage or helper program, or other error.

# dmesg|tail
...
[111221.376675] md: md0: data-check done.
[122291.559537] BTRFS info (device md0): disk space caching is enabled
[122291.704292] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0, 
flush 0, corrupt 2181, gen 0
[122293.101782] BTRFS critical (device md0): corrupt leaf: root=1 
block=1894009225216 slot=82, bad key order, prev (2034321682432 168 
262144) current (2034318143488 168 962560)
[122293.102334] BTRFS error (device md0): failed to read block groups: -5
[122293.156546] BTRFS error (device md0): open_ctree failed

If the data-tree structures alone have so many bits flipped, how many
flipped bits are to be expected in the data itself? What should a normal
btrfs user do in order to prevent such disasters?
And another thing: if I am getting it right, it would have been more
reliable/appropriate to let btrfs manage the five disks behind md0 with
a raid1 profile instead of binding them in a RAID5 and "giving" just a
single device to btrfs.

Kind regards,
Nik.
--

> 
> 
>> [59139.945109] BTRFS error (device md0): failed to read block groups: -5
>> [59139.984122] BTRFS error (device md0): open_ctree failed
>>
>> Kind regards,
>> Nik.
>> --
>>
>>> Thanks,
>>> Qu
>>>
>>>> #make
>>>>       [PY]     libbtrfsutil
>>>>
>>>> #./btrfs-corrupt-block -X /dev/md0
>>>> old offset=0 len=0
>>>> new offset=0 len=0
>>>> Successfully repair tree block at 1894009225216
>>>>
>>>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>>>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on /dev/md0,
>>>> missing codepage or helper program, or other error.
>>>>
>>>> # dmesg|tail
>>>> ...
>>>> [56146.672395] BTRFS info (device md0): disk space caching is enabled
>>>> [56146.841632] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0,
>>>> flush 0, corrupt 2181, gen 0
>>>> [56148.097242] BTRFS critical (device md0): corrupt leaf: root=1
>>>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>>>> [56148.097583] BTRFS error (device md0): failed to read block groups: -5
>>>> [56148.140137] BTRFS error (device md0): open_ctree failed
>>>>
>>>> If the above steps were wrong - please correct me!
>>>>
>>>>> The only uncertain part is the size.
>>>>> If mount still fails, dmesg will tell me the size I need.
>>>>>
>>>>>
>>>>>> Successfully repair tree block at 1894009225216
>>>>>> # mount -t btrfs -o ro /dev/md0 /mnt/md0/
>>>>>> mount: /mnt/md0: wrong fs type, bad option, bad superblock on
>>>>>> /dev/md0,
>>>>>> missing codepage or helper program, or other error.
>>>>>> root@bach:~# dmesg|tail
>>>>>> ...
>>>>>> [39342.860715] BTRFS info (device md0): disk space caching is enabled
>>>>>> [39342.933380] BTRFS info (device md0): bdev /dev/md0 errs: wr 0,
>>>>>> rd 0,
>>>>>> flush 0, corrupt 2181, gen 0
>>>>>> [39344.197411] BTRFS critical (device md0): corrupt leaf: root=1
>>>>>> block=1894009225216 slot=30, unexpected item end, have 0 expect 15003
>>>>>> [39344.197915] BTRFS error (device md0): failed to read block
>>>>>> groups: -5
>>>>>> [39344.248137] BTRFS error (device md0): open_ctree failed
>>>>>>
>>>>>> Sorry, I forgot to say: this and the previous attempt were with kernel
>>>>>> 4.15.0-47-generic.
>>>>>
>>>>> As long as it can output above message, the kernel version doesn't make
>>>>> much difference.
>>>>>
>>>>>
>>>>>> My Ubuntu 18.04 LTS is having enormous problems with
>>>>>> Kernel 5.0.2 - very long boot; network, login and other services
>>>>>> cycling
>>>>>> through "start, timeout, fail, stop" again and again, etc. If kernel
>>>>>> 5 is
>>>>>> important I will need time to get it right (maybe even assistance from
>>>>>> another(?) developer group).
>>>>>> Actually with 5.0.2 each boot sends me an email about an empty and not
>>>>>> automatically mounted btrfs filesystem with raid1 profile, consisting
>>>>>> of two devices (sdb and sdi):
>>>>>>
>>>>>> kernel: [    9.625619] BTRFS: device fsid
>>>>>> 05bd214a-8961-4165-9205-a5089a65b59b devid 2 transid 832 /dev/sdi
>>>>>>
>>>>>> Scrubbing it finishes almost immediately (see below), but during next
>>>>>> boot the email comes again:
>>>>>>
>>>>>> #btrfs scrub status /mnt/b
>>>>>> scrub status for 05bd214a-8961-4165-9205-a5089a65b59b
>>>>>>            scrub started at Sat Apr  6 10:42:15 2019 and finished after
>>>>>> 00:00:00
>>>>>>            total bytes scrubbed: 1.51MiB with 0 errors
>>>>>>
>>>>>> Should I be worried about it?
>>>>>
>>>>> You could try btrfs check --readonly and see what's going on.
>>>>> If btrfs check --readonly is OK, then it should be mostly OK.
>>>>
>>>> Then it seems to be ok, thank you!
>>>>
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>
>>>>>>
>>>>>> Kind regards,
>>>>>> Nik.
>>>>>> -- 
>>>>>>
>>>>>>> Thanks,
>>>>>>> Qu
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>> Nik.
>>>>>>>> -- 
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Qu
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Actually there was one warning during make, I don't know if it is
>>>>>>>>>> relevant:
>>>>>>>>>>          [CC]     check/main.o
>>>>>>>>>> check/main.c: In function ‘try_repair_inode’:
>>>>>>>>>> check/main.c:2688:5: warning: ‘ret’ may be used uninitialized in
>>>>>>>>>> this
>>>>>>>>>> function [-Wmaybe-uninitialized]
>>>>>>>>>>        if (!ret) {
>>>>>>>>>>           ^
>>>>>>>>>> check/main.c:2666:6: note: ‘ret’ was declared here
>>>>>>>>>>        int ret;
>>>>>>>>>>            ^~~
>>>>>>>>>>
>>>>>>>>>> The previous steps were as follows (output omitted, since nothing
>>>>>>>>>> unexpected happened):
>>>>>>>>>> #git clone --single-branch -v -b dirty_fix_for_nik
>>>>>>>>>> https://github.com/adam900710/btrfs-progs.git
>>>>>>>>>> #cd btrfs-progs/
>>>>>>>>>> #./autogen.sh
>>>>>>>>>> #./configure --disable-documentation --disable-convert
>>>>>>>>>> #make
>>>>>>>>>>
>>>>>>>>>> Did I get the right branch? Or miss any step?
>>>>>>>>>>
>>>>>>>>>> Kind regards,
>>>>>>>>>> Nik.
>>>>>>>>>> -- 
>>>>>>>>>>
>>>>>>>>>>> If everything goes correctly, it should output something like:
>>>>>>>>>>>         Successfully repaired tree block at 1894009225216
>>>>>>>>>>> (And please ignore any grammar error in my code)
>>>>>>>>>>>
>>>>>>>>>>> After that, please run a "btrfs check --readonly" to ensure no
>>>>>>>>>>> other
>>>>>>>>>>> bit
>>>>>>>>>>> flip in your fs.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Qu
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hope this is ok.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Nik.
>>>>>>>>>>>> -
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-07  7:41                                                 ` Nik.
@ 2019-04-07 18:45                                                   ` Chris Murphy
  2019-04-08 13:09                                                     ` Qu Wenruo
  2019-04-10 21:03                                                     ` Nik.
  0 siblings, 2 replies; 51+ messages in thread
From: Chris Murphy @ 2019-04-07 18:45 UTC (permalink / raw)
  To: Nik., Btrfs BTRFS, Qu Wenruo

On Sun, Apr 7, 2019 at 1:42 AM Nik. <btrfs@avgustinov.eu> wrote:
> 2019-04-07 01:18, Qu Wenruo:

> > You have 2 bits flipped just in one tree block!
> >
> If the data-tree structures alone have so many bits flipped, how many
> flipped bits are to be expected in the data itself? What should a normal
> btrfs user do in order to prevent such disasters?

I think the corruption in your case is inferred by Btrfs only from the
bad key ordering, not from a csum failure for the leaf? I can't tell for
sure from the error, but I don't see a csum complaint.

I'd expect that a RAM-caused corruption affects the metadata leaf data
before the csum is computed, so there is no csum failure on a subsequent
read. Whereas if the corruption is storage-stack related, we'd see a
csum error on a subsequent read.
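That ordering argument can be sketched with a toy model in Python; zlib.crc32 stands in for the crc32c checksum btrfs actually uses, and the block layout is simplified to data plus a detached csum:

```python
import zlib

def write_block(data: bytes) -> tuple[bytes, int]:
    # At write time the csum is computed over the (possibly already
    # corrupted) in-memory data and stored alongside it.
    return data, zlib.crc32(data)

good = b"btrfs leaf item data"

# Case 1: RAM flips a bit *before* the csum is computed.
in_memory = bytearray(good)
in_memory[3] ^= 0x04
data, csum = write_block(bytes(in_memory))
ram_detected = zlib.crc32(data) != csum             # False: csum matches bad data

# Case 2: the storage stack flips a bit *after* the csum was written.
data, csum = write_block(good)
on_disk = bytearray(data)
on_disk[3] ^= 0x04
disk_detected = zlib.crc32(bytes(on_disk)) != csum  # True: csum mismatch

print(ram_detected, disk_detected)   # False True
```

So a clean csum together with bad key ordering points at the memory side of the stack rather than the disks.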

Once there's corruption in a block address, the corruption can
propagate into anything else that depends on that block address even
if there isn't another corruption event. So one event, multiple
corruptions.


> And another thing: if I am getting it right, it should have been more
> reliable/appropriate to let btrfs manage the five disks behind the md0
> with a raid1 profile instead of binding them in a RAID5 and "giving" just a
> single device to btrfs.

Not necessarily. If corruption happens early enough, it gets baked
into all copies of the metadata.


-- 
Chris Murphy


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-07 18:45                                                   ` Chris Murphy
@ 2019-04-08 13:09                                                     ` Qu Wenruo
  2019-04-08 21:22                                                       ` Nik.
  2019-04-10 21:03                                                     ` Nik.
  1 sibling, 1 reply; 51+ messages in thread
From: Qu Wenruo @ 2019-04-08 13:09 UTC (permalink / raw)
  To: Chris Murphy, Nik., Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 2314 bytes --]

Unfortunately, I didn't receive the last mail from Nik.

So I'm using the content from lore.kernel.org.

[122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
block=1894009225216 slot=82, bad key order, prev (2034321682432 168
262144) current (2034318143488 168 962560)

Root=1 means it's the root tree; 168 means EXTENT_ITEM, which should
only occur in the extent tree, not in the root tree.

This means that either the leaf owner or some tree blocks got totally
screwed up.
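Decoded against the on-disk format, the mismatch looks like this (an editor's Python sketch; the key-type constants follow the standard btrfs on-disk format as in ctree.h):

```python
# A few btrfs key-type constants from the on-disk format (ctree.h).
ROOT_ITEM_KEY = 132      # what a root-tree (root=1) leaf should hold
EXTENT_ITEM_KEY = 168    # belongs in the extent tree
METADATA_ITEM_KEY = 169  # also extent-tree only

# The corrupt leaf claims owner root=1 but carries type-168 keys:
leaf_owner = 1           # root tree, from "root=1" in the dmesg line
key_type = 168           # middle field of the keys in the dmesg line
wrong_tree = leaf_owner == 1 and key_type == EXTENT_ITEM_KEY
print(wrong_tree)        # True: extent-tree items in a root-tree leaf
```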

This is not easy to fix, if it is fixable at all.

Would you please try this kernel branch and mount it with
"rescue=skip_bg,ro"?
https://github.com/adam900710/linux/tree/rescue_options

I think that's the last resort. Before that, you could try
btrfs-restore, which is purely user-space and should be easier to set up
than a custom kernel.

Thanks,
Qu

On 2019/4/8 2:45 AM, Chris Murphy wrote:
> On Sun, Apr 7, 2019 at 1:42 AM Nik. <btrfs@avgustinov.eu> wrote:
>> 2019-04-07 01:18, Qu Wenruo:
> 
>>> You have 2 bits flipped just in one tree block!
>>>
>> If the data-tree structures alone have so many bits flipped, how many
>> flipped bits are to be expected in the data itself? What should a normal
>> btrfs user do in order to prevent such disasters?
> 
> I think the corruption in your case is inferred by Btrfs only by bad
> key ordering, not csum failure for the leaf? I can't tell for sure
> from the error, but I don't see a csum complaint.
> 
> I'd expect a RAM-caused corruption could affect the metadata leaf data,
> followed by csum computation. Therefore no csum failure on subsequent
> read. Whereas if the corruption is storage stack related, we'd see a
> csum error on subsequent read.
> 
> Once there's corruption in a block address, the corruption can
> propagate into anything else that depends on that block address even
> if there isn't another corruption event. So one event, multiple
> corruptions.
> 
> 
>> And another thing: if I am getting it right, it should have been more
>> reliable/appropriate to let btrfs manage the five disks behind the md0
>> with a raid1 profile instead of binding them in a RAID5 and "giving" just a
>> single device to btrfs.
> 
> Not necessarily. If corruption happens early enough, it gets baked
> into all copies of the metadata.
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-08 13:09                                                     ` Qu Wenruo
@ 2019-04-08 21:22                                                       ` Nik.
  2019-04-12 10:44                                                         ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-08 21:22 UTC (permalink / raw)
  To: Qu Wenruo, Chris Murphy, Btrfs BTRFS



2019-04-08 15:09, Qu Wenruo:
> Unfortunately, I didn't receive the last mail from Nik.
> 
> So I'm using the content from lore.kernel.org.
> 
> [122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
> block=1894009225216 slot=82, bad key order, prev (2034321682432 168
> 262144) current (2034318143488 168 962560)
> 
> Root=1 means it's root tree, 168 means EXTENT_ITEM, which should only
> occur in extent tree, not in root tree.
> 
> This means either the leaf owner, or some tree blocks get totally
> screwed up.
> 
> This is not easy to fix, if possible.
> 
> Would you please try this kernel branch and mount it with
> "rescue=skip_bg,ro"?
> https://github.com/adam900710/linux/tree/rescue_options
> 
> I think that's the last method. Before that, you could try
> btrfs-restore, which is purely user-space and should be easier to setup
> than custom kernel.
> 
> Thanks,
> Qu

Tried "btrfs restore -vsxmi ..." (it did not work before my first 
email); it has now been processing for at least 6 hours. It seems that, 
despite many error messages, files are getting restored. As soon as it 
finishes I will check the result and give feedback. I will also test 
the kernel branch mentioned above.

Kind regards,
Nik.
--
> 
> On 2019/4/8 上午2:45, Chris Murphy wrote:
>> On Sun, Apr 7, 2019 at 1:42 AM Nik. <btrfs@avgustinov.eu> wrote:
>>> 2019-04-07 01:18, Qu Wenruo:
>>
>>>> You have 2 bits flipped just in one tree block!
>>>>
>>> If the data-tree structures alone have so many bits flipped, how much
>>> flipped bits are to be expected in the data itself? What should a normal
>>> btrfs user do in order to prevent such disasters?
>>
>> I think the corruption in your case is inferred by Btrfs only by bad
>> key ordering, not csum failure for the leaf? I can't tell for sure
>> from the error, but I don't see a csum complaint.
>>
>> I'd expect a RAM caused corruption could affect a metadata leaf data,
>> followed by csum computation. Therefore no csum failure on subsequent
>> read. Whereas if the corruption is storage stack related, we'd see a
>> csum error on subsequent read.
>>
>> Once there's corruption in a block address, the corruption can
>> propagate into anything else that depends on that block address even
>> if there isn't another corruption event. So one event, multiple
>> corruptions.
>>
>>
>>> And another thing: if I am getting it right, it should have been more
>>> reliable/appropriate to let btrfs manage the five disks behind the md0
>>> with a raid1 profile instead binding them in a RAID5 and "giving" just a
>>> single device to btrfs.
>>
>> Not necessarily. If corruption happens early enough, it gets baked
>> into all copies of the metadata.
>>
>>
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-07 18:45                                                   ` Chris Murphy
  2019-04-08 13:09                                                     ` Qu Wenruo
@ 2019-04-10 21:03                                                     ` Nik.
  2019-04-11  0:45                                                       ` Qu Wenruo
  1 sibling, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-10 21:03 UTC (permalink / raw)
  To: Chris Murphy, Btrfs BTRFS, Qu Wenruo



2019-04-07 20:45, Chris Murphy:
> On Sun, Apr 7, 2019 at 1:42 AM Nik. <btrfs@avgustinov.eu> wrote:
>> 2019-04-07 01:18, Qu Wenruo:
> 
>>> You have 2 bits flipped just in one tree block!
>>>
>> If the data-tree structures alone have so many bits flipped, how much
>> flipped bits are to be expected in the data itself? What should a normal
>> btrfs user do in order to prevent such disasters?
> 
> I think the corruption in your case is inferred by Btrfs only by bad
> key ordering, not csum failure for the leaf? I can't tell for sure
> from the error, but I don't see a csum complaint.

I do not quite understand where the "bad key ordering" came from, but my 
question is why (in my case) this keeps happening only to btrfs file 
systems. Is it relevant that all four failed systems were initially 
formatted as ext4 and converted to btrfs (with the btrfs-progs in use 
5-6 years ago)?

Another question: I am sure that many btrfs users are ready, in some 
cases, to trade performance for reliability; wouldn't it be interesting 
to introduce a switch/option like the "verify on" command, used many 
years ago on MS-DOS systems to ensure that write operations (especially 
on floppy disks) were successful? Just an idea...
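The "verify on" idea can be sketched in user space; this is a toy illustration of the concept, not an existing btrfs feature:

```python
import hashlib
import os

def verified_write(path: str, data: bytes) -> bool:
    """Write data, force it to stable storage, then read it back and
    compare checksums ("verify on" style). Caveat: the read-back may be
    served from the page cache, so this mainly catches in-flight
    corruption; a faithful verify would use O_DIRECT or drop caches."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ensure the data has left the OS buffers
    with open(path, "rb") as f:
        back = f.read()
    return hashlib.sha256(back).digest() == hashlib.sha256(data).digest()
```

The obvious cost is a second full pass over every write, which is exactly the reliability-versus-performance trade mentioned above.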

My btrfs-restore is still running (since Monday evening, until now about 
50% restored), and I am on a business trip. As soon as it finishes and I 
am back home I will compare with the backup and give more info, but it 
seems that this would need another day or two.

Kind regards,

Nik.
--

> I'd expect a RAM caused corruption could affect a metadata leaf data,
> followed by csum computation. Therefore no csum failure on subsequent
> read. Whereas if the corruption is storage stack related, we'd see a
> csum error on subsequent read.
> 
> Once there's corruption in a block address, the corruption can
> propagate into anything else that depends on that block address even
> if there isn't another corruption event. So one event, multiple
> corruptions.
> 
> 
>> And another thing: if I am getting it right, it should have been more
>> reliable/appropriate to let btrfs manage the five disks behind the md0
>> with a raid1 profile instead binding them in a RAID5 and "giving" just a
>> single device to btrfs.
> 
> Not necessarily. If corruption happens early enough, it gets baked
> into all copies of the metadata.
> 
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-10 21:03                                                     ` Nik.
@ 2019-04-11  0:45                                                       ` Qu Wenruo
  0 siblings, 0 replies; 51+ messages in thread
From: Qu Wenruo @ 2019-04-11  0:45 UTC (permalink / raw)
  To: Nik., Chris Murphy, Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 3658 bytes --]



On 2019/4/11 上午5:03, Nik. wrote:
> 
> 
> 2019-04-07 20:45, Chris Murphy:
>> On Sun, Apr 7, 2019 at 1:42 AM Nik. <btrfs@avgustinov.eu> wrote:
>>> 2019-04-07 01:18, Qu Wenruo:
>>
>>>> You have 2 bits flipped just in one tree block!
>>>>
>>> If the data-tree structures alone have so many bits flipped, how much
>>> flipped bits are to be expected in the data itself? What should a normal
>>> btrfs user do in order to prevent such disasters?
>>
>> I think the corruption in your case is inferred by Btrfs only by bad
>> key ordering, not csum failure for the leaf? I can't tell for sure
>> from the error, but I don't see a csum complaint.
> 
> I do not quite understand where the "bad key ordering" came from, but my
> question why (in my case) it keeps happening only to the btrfs file
> systems?

Because btrfs uses a more generic tree structure to keep everything in
order.

Unlike other filesystems (xfs/ext*), which have their own specialized
structures for inodes, regular files and directories, btrfs uses one
single but more complex structure to record everything.

This also means there is somewhat more redundancy in the tree
structure, and thus it is easier to corrupt.
E.g. if xfs only needs 3 blocks to record its data structures, btrfs may
need 7 blocks. Thus if one bit gets flipped in memory (either by hardware
or by the fs itself), it is more likely to hit btrfs than xfs.
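For context on the "bad key order" message itself: btrfs items in a leaf are indexed by the triple (objectid, type, offset) and must be sorted in strictly increasing order; the tree checker fires when a key goes backwards. A sketch of that invariant, using the two keys from the dmesg line quoted earlier in the thread:

```python
# A btrfs key is the triple (objectid, type, offset). Items in a leaf
# must appear in strictly increasing key order; the triples compare
# lexicographically, which Python tuples do natively.

def bad_key_order(prev: tuple, current: tuple) -> bool:
    """True if `current` violates the leaf ordering invariant."""
    return current <= prev

# The two keys from the dmesg report (type 168 = EXTENT_ITEM):
prev = (2034321682432, 168, 262144)
current = (2034318143488, 168, 962560)
print(bad_key_order(prev, current))  # the objectid went backwards
```

Here the objectid of `current` is smaller than that of `prev`, which is exactly the corruption the kernel rejected.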


> Is it relevant, that all four failed systems have had initially
> ext4 format and were converted to btrfs (with the btrfs-progs used 5-6
> years ago)?

Converting to btrfs has had some problems, especially 5-6 years ago.
The old convert code used (almost abused) a certain feature of btrfs,
creating a very strange chunk layout. It is valid but very tricky.
I'm not sure whether it is related, but it is possible.

> 
> Another question: I am sure that many btrfs users are ready in some
> cases to trade reliability for performance; wouldn't it be interesting
> to introduce a kind of switch/option like the "verify on", used many
> years ago on msdos-systems to ensure that write operations (especially
> on floppy disks) were successful? Just an idea...

My personal take is that reliability comes before everything else,
especially for an already somewhat unstable or easy-to-corrupt fs.

So in recent kernel releases we have more and more mandatory
verifications.
At least we're trying to make btrfs more and more robust.

Thanks,
Qu

> 
> My btrfs-restore is still running (since Monday evening, until now about
> 50% restored), and I am on a business trip. As soon as it finishes and I
> am back home I will compare with the backup and give more info, but it
> seems that this would need another day or two.
> 
> Kind regards,
> 
> Nik.
> -- 
> 
>> I'd expect a RAM caused corruption could affect a metadata leaf data,
>> followed by csum computation. Therefore no csum failure on subsequent
>> read. Whereas if the corruption is storage stack related, we'd see a
>> csum error on subsequent read.
>>
>> Once there's corruption in a block address, the corruption can
>> propagate into anything else that depends on that block address even
>> if there isn't another corruption event. So one event, multiple
>> corruptions.
>>
>>
>>> And another thing: if I am getting it right, it should have been more
>>> reliable/appropriate to let btrfs manage the five disks behind the md0
>>> with a raid1 profile instead binding them in a RAID5 and "giving" just a
>>> single device to btrfs.
>>
>> Not necessarily. If corruption happens early enough, it gets baked
>> into all copies of the metadata.
>>
>>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-08 21:22                                                       ` Nik.
@ 2019-04-12 10:44                                                         ` Nik.
  2019-04-12 10:50                                                           ` Qu Wenruo
  0 siblings, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-12 10:44 UTC (permalink / raw)
  To: Qu Wenruo, Chris Murphy, Btrfs BTRFS


2019-04-08 23:22, Nik.:
>
>
> 2019-04-08 15:09, Qu Wenruo:
>> Unfortunately, I didn't receive the last mail from Nik.
>>
>> So I'm using the content from lore.kernel.org.
>>
>> [122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
>> block=1894009225216 slot=82, bad key order, prev (2034321682432 168
>> 262144) current (2034318143488 168 962560)
>>
>> Root=1 means it's root tree, 168 means EXTENT_ITEM, which should only
>> occur in extent tree, not in root tree.
>>
>> This means either the leaf owner, or some tree blocks get totally
>> screwed up.
>>
>> This is not easy to fix, if possible.
>>
>> Would you please try this kernel branch and mount it with
>> "rescue=skip_bg,ro"?
>> https://github.com/adam900710/linux/tree/rescue_options
>>
>> I think that's the last method. Before that, you could try
>> btrfs-restore, which is purely user-space and should be easier to setup
>> than custom kernel.
>>
>> Thanks,
>> Qu
>
> Tried "btrfs restore -vsxmi ..." (it did not work before my first 
> email), it is processing for at least 6 hours until now. It seems that 
> despite many error messages files are getting restored. As soon as it 
> finishes will check what is the result and give feedback. Will also 
> test the mentioned kernel branch.
>
After almost four days, only about 120 GB of my initial 3.7 TB of free 
space remains free, and the restore is still working (how about 
introducing a "progress" switch?)... I guess that, due to the "-s" 
option and the lack of deduplication, the snapshots are going to fill 
all the space before the restore reaches the "end" of the file system.
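The space blow-up described here is easy to quantify: on the intact filesystem, snapshots share unchanged extents, but `btrfs restore -s` writes every snapshot out as a plain copy. Illustrative arithmetic with hypothetical numbers:

```python
def flat_restore_bytes(live: int, snapshots: int) -> int:
    """Space to restore the live data plus each snapshot as a full copy
    (no extent sharing), which is what btrfs restore -s produces."""
    return live * (1 + snapshots)

def shared_bytes(live: int, snapshots: int, churn: float) -> int:
    """Approximate space the same data occupied on the original fs,
    where each snapshot only adds its churned fraction of the data."""
    return int(live * (1 + snapshots * churn))

TiB = 2 ** 40
# Hypothetical example: 3 TiB live data, 10 snapshots, 5% churn each:
print(flat_restore_bytes(3 * TiB, 10) / TiB)      # 33.0 (flat copies)
print(shared_bytes(3 * TiB, 10, 0.05) / TiB)      # 4.5 (with sharing)
```

So a volume that fit comfortably in 3.7 TB can easily overflow the same capacity when restored flat, which matches the behaviour reported above.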

Until now I have not had the chance (or time) to compare the restored 
data with the backups, but at this point I would like to ask you: what 
else would you like to know/try/do? Should I try the kernel mentioned 
above and its rescue options? Anything else that is risky and should not 
be tried on a production system?

Kind regards,

Nik.

--

[snip]


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-12 10:44                                                         ` Nik.
@ 2019-04-12 10:50                                                           ` Qu Wenruo
  2019-04-12 11:38                                                             ` Nik.
  2019-05-07 17:17                                                             ` Nik.
  0 siblings, 2 replies; 51+ messages in thread
From: Qu Wenruo @ 2019-04-12 10:50 UTC (permalink / raw)
  To: Nik., Chris Murphy, Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 2344 bytes --]



On 2019/4/12 下午6:44, Nik. wrote:
> 
> 2019-04-08 23:22, Nik.:
>>
>>
>> 2019-04-08 15:09, Qu Wenruo:
>>> Unfortunately, I didn't receive the last mail from Nik.
>>>
>>> So I'm using the content from lore.kernel.org.
>>>
>>> [122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
>>> block=1894009225216 slot=82, bad key order, prev (2034321682432 168
>>> 262144) current (2034318143488 168 962560)
>>>
>>> Root=1 means it's root tree, 168 means EXTENT_ITEM, which should only
>>> occur in extent tree, not in root tree.
>>>
>>> This means either the leaf owner, or some tree blocks get totally
>>> screwed up.
>>>
>>> This is not easy to fix, if possible.
>>>
>>> Would you please try this kernel branch and mount it with
>>> "rescue=skip_bg,ro"?
>>> https://github.com/adam900710/linux/tree/rescue_options
>>>
>>> I think that's the last method. Before that, you could try
>>> btrfs-restore, which is purely user-space and should be easier to setup
>>> than custom kernel.
>>>
>>> Thanks,
>>> Qu
>>
>> Tried "btrfs restore -vsxmi ..." (it did not work before my first
>> email), it is processing for at least 6 hours until now. It seems that
>> despite many error messages files are getting restored. As soon as it
>> finishes will check what is the result and give feedback. Will also
>> test the mentioned kernel branch.
>>
> After almost four days only about 120 GB of my initially 3.7TB of free
> space remain free, and the restore is still working (how about
> introducing a "progress" switch?)... I guess that due to the option "-s"
> and the lack of deduplication the snapshots are going to fill all the
> space without reaching the "end" of the file restoring system.
> 
> Until now I still did not have chance (and time) to compare the restored
> with backups, but at this point I would like to ask you: what else would
> you like to know|try|do? Should I try the mentioned above kernel and its
> rescue options?

That's the only remaining thing you need.

In fact, I didn't consider the size of the fs, and for that large fs,
rescue mount option should be the first choice before btrfs-restore.

Thanks,
Qu

> Something else, which is risky and should not be tried
> on a production system?
> 
> Kind regards,
> 
> Nik.
> 
> -- 
> 
> [snip]
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-05  7:07           ` Chris Murphy
  2019-04-05 12:07             ` Nik.
@ 2019-04-12 10:52             ` Nik.
  1 sibling, 0 replies; 51+ messages in thread
From: Nik. @ 2019-04-12 10:52 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Jeff Mahoney, Btrfs BTRFS



2019-04-05 09:07, Chris Murphy:
> On Fri, Apr 5, 2019 at 12:45 AM Nik. <btrfs@avgustinov.eu> wrote:
>>
>> Sorry, I forgot this. Hier is the output:
>>
>> # btrfs-image -c 9 -ss /dev/sdj3 /mnt/b/sdj3.img
>> WARNING: cannot find a hash collision for '..', generating garbage, it
>> won't match indexes
>>
>> The new image is same size, and since it seems small to me I am
>> attaching it to this mail.
> 
> What do you get for `btrfs insp dump-t -d /dev/` ?
> 
> Once I restore it, I get
> 
> 
> $ sudo btrfs insp dump-t -d /dev/mapper/vg-nik
> btrfs-progs v4.20.2
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> bad tree block 90195087360, bytenr mismatch, want=90195087360,
> have=7681037117263365436
> Couldn't setup device tree
> ERROR: unable to open /dev/mapper/vg-nik
> $ sudo btrfs insp dump-t -r /dev/mapper/vg-nik
> btrfs-progs v4.20.2
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> checksum verify failed on 90195087360 found 6036BAAE wanted 7C05A75D
> bad tree block 90195087360, bytenr mismatch, want=90195087360,
> have=7681037117263365436
> Couldn't setup device tree
> ERROR: unable to open /dev/mapper/vg-nik
> $
> 
> There is a valid superblock however. So it restored something, just
> not everything, not sure. Might be related to create failed success!
> 

Does anybody need anything else from this filesystem? If not, I would be 
glad to reformat and reuse this SSD partition.

Kind regards,
Nik.
--

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-12 10:50                                                           ` Qu Wenruo
@ 2019-04-12 11:38                                                             ` Nik.
  2019-04-12 12:45                                                               ` Qu Wenruo
  2019-05-07 17:17                                                             ` Nik.
  1 sibling, 1 reply; 51+ messages in thread
From: Nik. @ 2019-04-12 11:38 UTC (permalink / raw)
  To: Qu Wenruo, Chris Murphy, Btrfs BTRFS



2019-04-12 12:50, Qu Wenruo:
> 
> 
> On 2019/4/12 下午6:44, Nik. wrote:
>>
>> 2019-04-08 23:22, Nik.:
>>>
>>>
>>> 2019-04-08 15:09, Qu Wenruo:
>>>> Unfortunately, I didn't receive the last mail from Nik.
>>>>
>>>> So I'm using the content from lore.kernel.org.
>>>>
>>>> [122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
>>>> block=1894009225216 slot=82, bad key order, prev (2034321682432 168
>>>> 262144) current (2034318143488 168 962560)
>>>>
>>>> Root=1 means it's root tree, 168 means EXTENT_ITEM, which should only
>>>> occur in extent tree, not in root tree.
>>>>
>>>> This means either the leaf owner, or some tree blocks get totally
>>>> screwed up.
>>>>
>>>> This is not easy to fix, if possible.
>>>>
>>>> Would you please try this kernel branch and mount it with
>>>> "rescue=skip_bg,ro"?
>>>> https://github.com/adam900710/linux/tree/rescue_options
>>>>
>>>> I think that's the last method. Before that, you could try
>>>> btrfs-restore, which is purely user-space and should be easier to setup
>>>> than custom kernel.
>>>>
>>>> Thanks,
>>>> Qu
>>>
>>> Tried "btrfs restore -vsxmi ..." (it did not work before my first
>>> email), it is processing for at least 6 hours until now. It seems that
>>> despite many error messages files are getting restored. As soon as it
>>> finishes will check what is the result and give feedback. Will also
>>> test the mentioned kernel branch.
>>>
>> After almost four days only about 120 GB of my initially 3.7TB of free
>> space remain free, and the restore is still working (how about
>> introducing a "progress" switch?)... I guess that due to the option "-s"
>> and the lack of deduplication the snapshots are going to fill all the
>> space without reaching the "end" of the file restoring system.
>>
>> Until now I still did not have chance (and time) to compare the restored
>> with backups, but at this point I would like to ask you: what else would
>> you like to know|try|do? Should I try the mentioned above kernel and its
>> rescue options?
> 
> That's the only remaining thing you need.

Ok, the "git clone ..." just finished, but in your earlier mail you 
spoke about kernel 5.1/5.2, whereas the readme of the above repository 
talks about "Linux kernel release 4.x"? Since building the kernel is 
not a minute's task (especially on an Atom processor), I would like to 
double-check the steps to be done.
Until now:
$ git clone https://github.com/adam900710/linux.git
$ git checkout --track origin/rescue_options

What next? No autogen, no configure... The readme refers to 
"Documentation/process/changes.rst", so should I follow its section 
"Configuring the kernel"?
If there is another description of the steps to be taken somewhere, 
please provide a link.

Best,
Nik.

> In fact, I didn't consider the size of the fs, and for that large fs,
> rescue mount option should be the first choice before btrfs-restore.
> 
> Thanks,
> Qu
> 
>> Something else, which is risky and should not be tried
>> on a production system?
>>
>> Kind regards,
>>
>> Nik.
>>
>> -- 
>>
>> [snip]
>>
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-12 11:38                                                             ` Nik.
@ 2019-04-12 12:45                                                               ` Qu Wenruo
  0 siblings, 0 replies; 51+ messages in thread
From: Qu Wenruo @ 2019-04-12 12:45 UTC (permalink / raw)
  To: Nik., Chris Murphy, Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 3766 bytes --]



On 2019/4/12 下午7:38, Nik. wrote:
> 
> 
> 2019-04-12 12:50, Qu Wenruo:
>>
>>
>> On 2019/4/12 下午6:44, Nik. wrote:
>>>
>>> 2019-04-08 23:22, Nik.:
>>>>
>>>>
>>>> 2019-04-08 15:09, Qu Wenruo:
>>>>> Unfortunately, I didn't receive the last mail from Nik.
>>>>>
>>>>> So I'm using the content from lore.kernel.org.
>>>>>
>>>>> [122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
>>>>> block=1894009225216 slot=82, bad key order, prev (2034321682432 168
>>>>> 262144) current (2034318143488 168 962560)
>>>>>
>>>>> Root=1 means it's root tree, 168 means EXTENT_ITEM, which should only
>>>>> occur in extent tree, not in root tree.
>>>>>
>>>>> This means either the leaf owner, or some tree blocks get totally
>>>>> screwed up.
>>>>>
>>>>> This is not easy to fix, if possible.
>>>>>
>>>>> Would you please try this kernel branch and mount it with
>>>>> "rescue=skip_bg,ro"?
>>>>> https://github.com/adam900710/linux/tree/rescue_options
>>>>>
>>>>> I think that's the last method. Before that, you could try
>>>>> btrfs-restore, which is purely user-space and should be easier to
>>>>> setup
>>>>> than custom kernel.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>
>>>> Tried "btrfs restore -vsxmi ..." (it did not work before my first
>>>> email), it is processing for at least 6 hours until now. It seems that
>>>> despite many error messages files are getting restored. As soon as it
>>>> finishes will check what is the result and give feedback. Will also
>>>> test the mentioned kernel branch.
>>>>
>>> After almost four days only about 120 GB of my initially 3.7TB of free
>>> space remain free, and the restore is still working (how about
>>> introducing a "progress" switch?)... I guess that due to the option "-s"
>>> and the lack of deduplication the snapshots are going to fill all the
>>> space without reaching the "end" of the file restoring system.
>>>
>>> Until now I still did not have chance (and time) to compare the restored
>>> with backups, but at this point I would like to ask you: what else would
>>> you like to know|try|do? Should I try the mentioned above kernel and its
>>> rescue options?
>>
>> That's the only remaining thing you need.
> 
> Ok, the "git clone ..." just finished, but in your earlier mail you
> spoke about kernel 5.1/5.2, and the readme of the above repository is
> talking about "Linux kernel release 4.x"? Since building the kernel is
> not a "minute" task (especially when building on atom processor), I
> would like to double check the steps to be done.

It's based on the v4.20 kernel, I think.

I should rebase the patchset onto the latest branch before re-sending it
to the mailing list.

> Until now:
> $ git clone https://github.com/adam900710/linux.git
> $ git checkout --track origin/rescue_options
> 
> What next? No autogen, no configure. ... The Readme refers to
> "Documentation/process/changes.rst", so am I going to follow its section
> "Configuring the kernel"?

Oh, that will be a problem.

Kernel configuration is done with "make menuconfig", but if you're not
familiar with it, you can easily end up with a kernel that can't even
boot.

So just forget this; it's not risky, but it is very time-consuming for
end users, with too many things to be learned.

Thanks,
Qu

> If there is another description of the steps to be taken somewhere -
> please provide a link.
> 
> Best,
> Nik.
> 
>> In fact, I didn't consider the size of the fs, and for that large fs,
>> rescue mount option should be the first choice before btrfs-restore.
>>
>> Thanks,
>> Qu
>>
>>> Something else, which is risky and should not be tried
>>> on a production system?
>>>
>>> Kind regards,
>>>
>>> Nik.
>>>
>>> -- 
>>>
>>> [snip]
>>>
>>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-04-12 10:50                                                           ` Qu Wenruo
  2019-04-12 11:38                                                             ` Nik.
@ 2019-05-07 17:17                                                             ` Nik.
  2019-05-07 17:30                                                               ` Chris Murphy
  1 sibling, 1 reply; 51+ messages in thread
From: Nik. @ 2019-05-07 17:17 UTC (permalink / raw)
  To: Qu Wenruo, Chris Murphy, Btrfs BTRFS

Finally found some time to finish compiling the requested kernel and 
run the required test; see below.

2019-04-12 12:50, Qu Wenruo:
> 
> 
> On 2019/4/12 下午6:44, Nik. wrote:
>>
>> 2019-04-08 23:22, Nik.:
>>>
>>>
>>> 2019-04-08 15:09, Qu Wenruo:
>>>> Unfortunately, I didn't receive the last mail from Nik.
>>>>
>>>> So I'm using the content from lore.kernel.org.
>>>>
>>>> [122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
>>>> block=1894009225216 slot=82, bad key order, prev (2034321682432 168
>>>> 262144) current (2034318143488 168 962560)
>>>>
>>>> Root=1 means it's root tree, 168 means EXTENT_ITEM, which should only
>>>> occur in extent tree, not in root tree.
>>>>
>>>> This means either the leaf owner, or some tree blocks get totally
>>>> screwed up.
>>>>
>>>> This is not easy to fix, if possible.
>>>>
>>>> Would you please try this kernel branch and mount it with
>>>> "rescue=skip_bg,ro"?
>>>> https://github.com/adam900710/linux/tree/rescue_options

# uname -a
Linux bach 5.1.0-rc4_Bach_+ #2 SMP Sun May 5 22:28:03 CEST 2019 i686 
i686 i686 GNU/Linux

# mount -t btrfs -o rescue=skip_bg,ro /dev/md0 /mnt/tmp/
# dmesg |tail
[  265.410408] BTRFS info (device md0): skip mount time block group 
searching
[  265.410417] BTRFS info (device md0): disk space caching is enabled
[  265.763877] BTRFS info (device md0): bdev /dev/md0 errs: wr 0, rd 0, 
flush 0, corrupt 2181, gen 0

It took about 18 hours to compare the mounted volume with the backup 
(I used rsync without the "--checksum" option, because with it rsync 
was too slow; I can run it again with the option if you wish). Only 
about 300 kB were not in my backup. Given that the backup is also on a 
btrfs system, is there a more "intelligent" way to compare this huge 
tree with the backup? Ideally the fs would keep the checksums and 
compare only them?
What gave me pause was the free space reported for the disk; I thought 
you should have a look at it, too:

# df -hT /mnt/*
Filesystem     Type   Size  Used Avail Use% Mounted on
/dev/sdb       btrfs  3.7T  3.2T  452G  88% /mnt/btraid	#backup
/dev/md0       btrfs  3.7T  3.1T   11P   1% /mnt/tmp	#original fs

I really wish I had the reported disk space :-D

Should I try something else (e.g., btrfs check) before reformatting?

Best,
Nik.
[snip]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-05-07 17:17                                                             ` Nik.
@ 2019-05-07 17:30                                                               ` Chris Murphy
  2019-05-13 12:19                                                                 ` Nik.
  0 siblings, 1 reply; 51+ messages in thread
From: Chris Murphy @ 2019-05-07 17:30 UTC (permalink / raw)
  To: Nik.; +Cc: Qu Wenruo, Chris Murphy, Btrfs BTRFS

On Tue, May 7, 2019 at 11:17 AM Nik. <btrfs@avgustinov.eu> wrote:
>
> It took about 18 hours to compare the mounted volume with the backup
> (used rsync, without the "--checksum" option, because it was too slow; I

If you're comparing without --checksum, it's just checking file size
and timestamp. It's not checking file contents.
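The difference between the two modes can be sketched directly: the quick check compares only metadata, so two files with matching size and mtime but different bytes pass it, while a content comparison does not. A minimal illustration (not rsync's actual rolling-checksum implementation):

```python
import hashlib
import os

def quick_check(a: str, b: str) -> bool:
    """rsync's default test: same size and same (whole-second) mtime."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)

def content_equal(a: str, b: str, chunk: int = 1 << 20) -> bool:
    """What --checksum adds: compare actual contents, hashed in chunks
    so huge files don't have to fit in memory."""
    ha, hb = hashlib.md5(), hashlib.md5()
    for path, h in ((a, ha), (b, hb)):
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                h.update(block)
    return ha.digest() == hb.digest()
```

This is also why the 18-hour run above proves less than it appears to: silent corruption that preserved size and timestamp would slip through the quick check.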


> can run it again with it, if you wish). Only about 300kB were not in my
> backup. Given the backup is also on a btrfs system, is there a more
> "intelligent" way to compare this huge tree with the backup?

Not if you have reason to distrust one of them. If you trust them
both, comparison isn't needed. So you're kinda stuck having to use a
separate tool to independently verify the files.


>Optimally
> the fs would keep the check-sums and compare only them?

No such tool exists. Btrfs doesn't checksum files, it checksums file
extents in 4KiB increments. And I don't even think there's an ioctl to
extract only checksums, in order to do a comparison in user space. The
checksums are, as far as I'm aware, only used internally within Btrfs
kernel space.
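The per-4KiB layout Chris describes can be illustrated in a few lines. Note the stand-in: btrfs at the time used CRC32C (Castagnoli polynomial), which the Python standard library lacks, so plain `zlib.crc32` is used here purely to show the shape of the data:

```python
import zlib

BLOCK = 4096  # btrfs checksums data in 4 KiB units

def block_csums(data: bytes) -> list:
    """One checksum per 4 KiB block, roughly how entries in the btrfs
    csum tree map to file extents. zlib.crc32 is plain CRC32, a
    stand-in for the CRC32C that btrfs actually uses."""
    return [zlib.crc32(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]

data = bytes(10000)            # blocks of 4096, 4096 and 1808 bytes
print(len(block_csums(data)))  # 3
```

A consequence visible in the sketch: flipping one byte changes only that block's checksum, so btrfs can localize corruption to 4 KiB, but there is no single whole-file checksum to hand to user space.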


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: interest in post-mortem examination of a BTRFS system and improving the btrfs-code?
  2019-05-07 17:30                                                               ` Chris Murphy
@ 2019-05-13 12:19                                                                 ` Nik.
  0 siblings, 0 replies; 51+ messages in thread
From: Nik. @ 2019-05-13 12:19 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Qu Wenruo, Btrfs BTRFS



2019-05-07 19:30, Chris Murphy:
<snip>

>> Optimally
>> the fs would keep the check-sums and compare only them?
> 
> No such tool exists. Btrfs doesn't checksum files, it checksums file
> extents in 4KiB increments. And I don't even think there's an ioctl to
> extract only checksums, in order to do a comparison in user space. The
> checksums are, as far as I'm aware, only used internally within Btrfs
> kernel space.

Just in case it is interesting for you: such a tool seems to exist and 
is not new; have a look at 
https://stackoverflow.com/questions/32761299/btrfs-ioctl-get-file-checksums-from-userspace. 
IMHO an rsync (or btrfs send/receive) capable of utilizing the 
checksums could be a great tool. Therefore, I believe it would be 
better if this project were merged into the main btrfs code.


=== Recapitulation ===

Since it seems there is no further need for experiments with the 
damaged RAID fs, I am going to reformat it at about 19:00 UTC today.

Many thanks to the developer team for the support and even more for the 
creation of this smart file system!

From my point of view this thread can be closed.

Best regards

Nik.
--

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2019-05-13 12:19 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <aa81a49a-d5ca-0f1c-fa75-9ed3656cff55@avgustinov.eu>
2019-03-31 18:44 ` interest in post-mortem examination of a BTRFS system and improving the btrfs-code? btrfs
2019-04-02  0:24   ` Qu Wenruo
2019-04-02 13:06     ` Nik.
2019-04-02 13:24       ` Qu Wenruo
2019-04-02 13:29         ` Hugo Mills
2019-04-02 14:05           ` Nik.
2019-04-02 13:59         ` Nik.
2019-04-02 14:12           ` Qu Wenruo
2019-04-02 14:19             ` Hans van Kranenburg
2019-04-02 15:04               ` Nik.
2019-04-02 15:07                 ` Hans van Kranenburg
2019-04-02 21:22             ` Nik.
2019-04-03  1:04               ` Qu Wenruo
2019-04-04 15:27                 ` Nik.
2019-04-05  0:47                   ` Qu Wenruo
2019-04-05  6:58                     ` Nik.
2019-04-05  7:08                       ` Qu Wenruo
     [not found]                         ` <e9720559-eff2-e88b-12b4-81defb8c29c5@avgustinov.eu>
2019-04-05  8:15                           ` Qu Wenruo
2019-04-05 19:38                             ` Nik.
2019-04-06  0:03                               ` Qu Wenruo
2019-04-06  7:16                                 ` Nik.
2019-04-06  7:45                                   ` Qu Wenruo
2019-04-06  8:44                                     ` Nik.
2019-04-06  9:06                                       ` Qu Wenruo
2019-04-06 13:20                                         ` Nik.
2019-04-06 13:22                                           ` Qu Wenruo
2019-04-06 13:28                                             ` Qu Wenruo
2019-04-06 14:19                                             ` Nik.
2019-04-06 23:18                                               ` Qu Wenruo
2019-04-07  7:41                                                 ` Nik.
2019-04-07 18:45                                                   ` Chris Murphy
2019-04-08 13:09                                                     ` Qu Wenruo
2019-04-08 21:22                                                       ` Nik.
2019-04-12 10:44                                                         ` Nik.
2019-04-12 10:50                                                           ` Qu Wenruo
2019-04-12 11:38                                                             ` Nik.
2019-04-12 12:45                                                               ` Qu Wenruo
2019-05-07 17:17                                                             ` Nik.
2019-05-07 17:30                                                               ` Chris Murphy
2019-05-13 12:19                                                                 ` Nik.
2019-04-10 21:03                                                     ` Nik.
2019-04-11  0:45                                                       ` Qu Wenruo
2019-04-02 18:28         ` Chris Murphy
2019-04-02 19:02           ` Hugo Mills
2019-04-04  2:48   ` Jeff Mahoney
2019-04-04 15:58     ` Nik.
2019-04-04 17:31       ` Chris Murphy
     [not found]         ` <beab578a-ccaf-1ec7-c7b6-1ba9cd3743ad@avgustinov.eu>
2019-04-05  7:07           ` Chris Murphy
2019-04-05 12:07             ` Nik.
2019-04-12 10:52             ` Nik.
2019-04-05  6:53     ` Chris Murphy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.