linux-btrfs.vger.kernel.org archive mirror
From: Timothy Pearson <tpearson@raptorengineering.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Unusual crash -- data rolled back ~2 weeks?
Date: Sun, 10 Nov 2019 01:18:13 -0600 (CST)
Message-ID: <1503948411.128656.1573370293214.JavaMail.zimbra@raptorengineeringinc.com>
In-Reply-To: <64be1293-5845-4054-8d5f-b9ff79168a17@gmx.com>



----- Original Message -----
> From: "Qu Wenruo" <quwenruo.btrfs@gmx.com>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linux-btrfs" <linux-btrfs@vger.kernel.org>
> Sent: Sunday, November 10, 2019 6:54:55 AM
> Subject: Re: Unusual crash -- data rolled back ~2 weeks?

> On 2019/11/10 2:47 PM, Timothy Pearson wrote:
>> 
>> 
>> ----- Original Message -----
>>> From: "Qu Wenruo" <quwenruo.btrfs@gmx.com>
>>> To: "Timothy Pearson" <tpearson@raptorengineering.com>, "linux-btrfs"
>>> <linux-btrfs@vger.kernel.org>
>>> Sent: Saturday, November 9, 2019 9:38:21 PM
>>> Subject: Re: Unusual crash -- data rolled back ~2 weeks?
>> 
>>> On 2019/11/10 6:33 AM, Timothy Pearson wrote:
>>>> We just experienced a very unusual crash on a Linux 5.3 file server using NFS to
>>>> serve a BTRFS filesystem.  NFS went into deadlock (D wait) with no apparent
>>>> underlying disk subsystem problems, and when the server was hard rebooted to
>>>> clear the D wait the BTRFS filesystem remounted itself in the state that it was
>>>> in approximately two weeks earlier (!).
>>>
>>> This means that no btrfs transaction was committed during those two weeks.
>> 
>> Is there any hope of getting the data from that interval back via btrfs-recover
>> or a similar tool, or does the lack of commit mean the data was stored in RAM
>> only and is therefore gone after the server reboot?
> 
> If a deadlock was preventing new transactions from being committed, then
> no metadata was written back to disk at all, so there is no way to
> recover the metadata. You may find some of the data that was written,
> but without the metadata it is of no use.

OK, I'll just assume the data written in that window is unrecoverable at this point then.

Would the commit deadlock affect only one btrfs filesystem or all of them on the machine?  I take it there is no automatic dmesg spew on extended deadlock?  dmesg was completely clean at the time of the fault / reboot.
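
As a rough illustration of how a stalled commit might be noticed before a reboot, here is a minimal Python sketch. It assumes the superblock generation reported by `btrfs inspect-internal dump-super` advances with each committed transaction, so a generation that stays flat on a filesystem that is actively being written to would be a warning sign (on an idle filesystem a flat generation is normal). The function names, the sampling scheme, and the threshold are hypothetical and not taken from this thread. The last helper writes `w` to `/proc/sysrq-trigger`, which asks the kernel to log blocked (D state) tasks, i.e. the sysrq+w output requested earlier; it needs root and sysrq to be enabled.

```python
import re
from typing import List, Optional, Tuple

# Matches the bare "generation" field of dump-super output, not fields such
# as "chunk_root_generation" that merely end with the same word.
_GENERATION_RE = re.compile(r"^generation[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def parse_generation(dump_super_text: str) -> Optional[int]:
    """Extract the superblock generation from dump-super text, if present."""
    match = _GENERATION_RE.search(dump_super_text)
    return int(match.group(1)) if match else None


def is_stalled(samples: List[Tuple[float, int]], min_span_seconds: float) -> bool:
    """Return True if the generation did not change across the sampled span.

    samples: (unix_time, generation) pairs, oldest first (hypothetical format).
    The generation never decreases, so comparing the first and last sample
    is enough.
    """
    if len(samples) < 2:
        return False
    first_time, first_gen = samples[0]
    last_time, last_gen = samples[-1]
    return (last_time - first_time) >= min_span_seconds and last_gen == first_gen


def request_blocked_task_dump(trigger_path: str = "/proc/sysrq-trigger") -> None:
    """Ask the kernel to log blocked tasks (sysrq+w); requires root."""
    with open(trigger_path, "w") as trigger:
        trigger.write("w")
```

A periodic job could feed `parse_generation` the output of the dump-super command for each device, keep a short history of samples, and call `request_blocked_task_dump` before anyone reboots the machine if `is_stalled` returns True while writes are known to be in flight.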

>> 
>> If the latter, I'm somewhat surprised given the I/O load on the disk array in
>> question, but it would also offer a clue as to why it hard locked the
>> filesystem eventually (presumably on memory exhaustion -- the server has
>> something like 128GB of RAM, so it could go quite a while before hitting the
>> physical RAM limits).
>> 
>>>
>>>>  There was also significant corruption of certain files (e.g. LDAP MDB and MySQL
>>>>  InnoDB) noted -- we restored from backup for those files, but are concerned
>>>>  about the status of the entire filesystem at this point.
>>>
>>> Btrfs check is needed to ensure no metadata corruption.
>>>
>>> Also, we need sysrq+w output to determine where we are deadlocking.
>>> Otherwise, it's really hard to find any clue from the report.
>> 
>> It would have been gathered if we'd known the filesystem was in this bad state.
>> At the time, the priority was on restoring service and we had assumed NFS had
>> just wedged itself (again).  It was only after reboot and remount that the
>> damage slowly came to light.
>> 
>> Do the described symptoms (what little we know of them at this point) line up
>> with the issues fixed by https://patchwork.kernel.org/patch/11141559/ ?  Right
>> now we're hoping that this particular issue was fixed by that series, but if
>> not we might consider increasing backup frequency to nightly for this
>> particular array and seeing if it happens again.
> 
> That fix is already in v5.3, thus I don't think that's the case.
> 
> Thanks,
> Qu

Looking more carefully, the server in question had somehow been booted on 5.3-rc3.  This may have been because earlier versions were showing driver problems with the other hardware, but either way the machine was running 5.3-rc3, and the patch was created *after* the rc3 release.

Thanks!
