linux-btrfs.vger.kernel.org archive mirror
From: Nikolay Borisov <nborisov@suse.com>
To: "Scott E. Blomquist" <sb@techsquare.com>
Cc: Jojo <jojo@automatix.de>, linux-btrfs@vger.kernel.org
Subject: Re: btrfs hang on nfs?
Date: Mon, 14 Jan 2019 15:28:45 +0200
Message-ID: <9644378c-d06c-9747-4f15-7f1d0804f54e@suse.com>
In-Reply-To: <23612.35592.599043.773332@techsquare.com>



On 14.01.19 at 15:13, Scott E. Blomquist wrote:
> 
> Nikolay Borisov writes:
>  > 
>  > On 14.01.19 at 13:42, Scott E. Blomquist wrote:
>  > > 
>  <snip>
>  > > 
>  > > The file system hung again; below is the sysrq output.
>  > > 
>  > > Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
>  > > 
>  > > btrfs-progs v4.19.1 
>  > > 
>  > > # btrfs fi df /export/
>  > > Data, single: total=79.61TiB, used=79.61TiB
>  > > System, single: total=36.00MiB, used=8.31MiB
>  > > Metadata, single: total=192.01GiB, used=190.19GiB
>  > > GlobalReserve, single: total=512.00MiB, used=0.00B
>  > 
>  > So this btrfs is hosted on your local machine but it is exported via
>  > NFS, correct?
> 
> Correct and via samba also
> 
>  > > 
>  > > #  btrfs fi show
>  > > Label: '/export'  uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
>  > >         Total devices 3 FS bytes used 79.79TiB
>  > >         devid    1 size 47.30TiB used 43.58TiB path /dev/sda1
>  > >         devid    2 size 21.83TiB used 18.11TiB path /dev/sdb1
>  > >         devid    3 size 21.83TiB used 18.11TiB path /dev/sdc1
>  > 
>  > What kind of disks are those, presumably spinning rust due to their size
>  > but what model/make?
>  > 
> 
> 3 x raid 6 on a LSI MegaRAID SAS 9271-8i

Has your controller been updated to the latest firmware? In my
experience LSI MegaRAID controllers are rubbish: in the past, in a
datacenter environment, we had a batch of bad controllers whose
resets caused all IO to die on tens of machines.

There was a way to query the controller's built-in log for firmware
errors. I can't remember the exact command but googling suggests using:

MegaCli -AdpEventLog -GetEvents -f events.log -aALL && cat events.log

Can you run that when a hang occurs and attach the resulting events.log?
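
To keep the log from one hang separate from the next, the capture could be
wrapped in a small script. This is only a sketch: the helper name
capture_controller_log and the MEGACLI variable are made up for
illustration, the binary name varies between installs (assumed here to be
MegaCli), and the MegaCli flags are simply the ones from the command above,
not verified against this controller.

```shell
#!/bin/sh
# Sketch only: snapshot the MegaRAID event log into a timestamped file.
# MEGACLI is an assumption; the binary is often installed as MegaCli64
# or outside $PATH, so override it as needed.
MEGACLI=${MEGACLI:-MegaCli}

# capture_controller_log DIR   (hypothetical helper name)
# Writes DIR/megaraid-events-YYYYmmdd-HHMMSS.log and prints its path.
capture_controller_log() {
    outdir=$1
    stamp=$(date +%Y%m%d-%H%M%S)
    logfile="$outdir/megaraid-events-$stamp.log"
    mkdir -p "$outdir" || return 1
    # Same invocation as suggested above, only with a unique file name.
    "$MEGACLI" -AdpEventLog -GetEvents -f "$logfile" -aALL >/dev/null || return 1
    echo "$logfile"
}
```

Running something like "capture_controller_log /root/hang-logs" right after
triggering the sysrq blocked-state dump would give one controller log per
hang, which makes it easier to line up firmware events with the kernel
timestamps.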

> 
>  > > [Mon Jan 14 06:24:26 2019] sysrq: SysRq : Show Blocked State
>  > 
>  > <snip>
>  > 
>  > > [Mon Jan 14 06:24:26 2019] btrfs-transacti D    0  6808      2 0x80000000
>  > > [Mon Jan 14 06:24:26 2019] Call Trace:
>  > > [Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
>  > > [Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_start_ordered_extent+0xca/0x100 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? btrfs_run_delayed_refs+0x8b/0x1d0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  commit_cowonly_roots+0x1ed/0x280 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_commit_transaction+0x36e/0x8d0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? start_transaction+0x9b/0x3f0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  transaction_kthread+0x14d/0x180 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
>  > > [Mon Jan 14 06:24:26 2019]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
>  > > [Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
>  > 
>  > So the transaction is being committed as a result of that
>  > btrfs_start_ordered_extent, which flushes data to disk. Since you've
>  > compiled your kernel can you run the following command from the kernel's
>  > source:
>  > 
>  > ./scripts/faddr2line  vmlinux  btrfs_start_ordered_extent+0xca/0x100
>  > 
>  > 'vmlinux' should be the kernel executable with debug info that results
>  > from compiling the kernel. I want to figure out which line exactly
>  > btrfs_start_ordered_extent+0xca/0x100 resolves to.
> 
>  <snip>
> 
> I'll have to rebuild the kernel with debug symbols.  Do I have to be
> booted into the kernel for that command to be useful?

Well, the running kernel needs to correspond to the vmlinux, since
otherwise the offsets might not match. In any case, try rebuilding the
kernel with debug info, booting it, and checking whether faddr2line
gives sane output for a trace taken on that kernel.
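
A rough way to check that a given vmlinux belongs to the running kernel
before trusting the offsets is to compare the release string embedded in
the image with "uname -r". The sketch below assumes binutils' strings is
available and that the image carries the usual "Linux version <release> ..."
banner; the helper names are made up for illustration. A matching release
is necessary but not sufficient, since two builds with the same release
string and different configs can still place code at different offsets.
Debug info here means a build with CONFIG_DEBUG_INFO=y. Also, the [btrfs]
tag in the trace suggests btrfs is built as a module on this machine, in
which case the symbol may have to be resolved against fs/btrfs/btrfs.ko
rather than vmlinux; that is a guess about this particular build.

```shell
#!/bin/sh
# Sketch only: check that a vmlinux was built for the running release
# before trusting symbol+offset lookups against it.

# banner_release: read text on stdin and print the release from the first
# "Linux version <release> ..." line (the format used by /proc/version).
banner_release() {
    sed -n 's/^Linux version \([^ ]*\) .*/\1/p' | head -n 1
}

# vmlinux_matches_running FILE   (hypothetical helper name)
# Succeeds only if FILE embeds a banner whose release equals `uname -r`.
vmlinux_matches_running() {
    built=$(strings "$1" | banner_release)
    [ -n "$built" ] && [ "$built" = "$(uname -r)" ]
}

# Usage, from the top of the kernel source tree:
#   vmlinux_matches_running ./vmlinux && \
#     ./scripts/faddr2line vmlinux btrfs_start_ordered_extent+0xca/0x100
```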

> 
> Cheers and Thanks,
> 
> sb. Scott Blomquist
> 
> 
> 
