linux-xfs.vger.kernel.org archive mirror
From: Carlos Maiolino <cmaiolino@redhat.com>
To: Sitsofe Wheeler <sitsofe@gmail.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Tasks blocking forever with XFS stack traces
Date: Tue, 5 Nov 2019 09:54:46 +0100
Message-ID: <20191105085446.abx27ahchg2k7d2w@orion>
In-Reply-To: <CALjAwxiuTYAVvGGUXLx6Bo-zNuW5+WXL=A8DqR5oD6D5tsKwng@mail.gmail.com>

Hi.

On Tue, Nov 05, 2019 at 07:27:16AM +0000, Sitsofe Wheeler wrote:
> Hi,
> 
> We have a system that has been seeing tasks with XFS calls in their
> stacks. Once these tasks start hanging in uninterruptible sleep, any
> write I/O to the directory they were doing I/O to will also hang
> forever. The I/O they are doing goes to a bind-mounted directory
> atop an XFS filesystem on top of an MD device (the MD device seems to
> be still functional and isn't offline). The kernel is fairly old but I
> thought I'd post a stack in case anyone can describe this or has seen
> it before:
> 
> kernel: [425684.110424] INFO: task kworker/u162:0:58843 blocked for more than 120 seconds.
> kernel: [425684.110800]       Tainted: G           OE 4.15.0-64-generic #73-Ubuntu
> kernel: [425684.111164] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kernel: [425684.111568] kworker/u162:0  D    0 58843      2 0x80000080
> kernel: [425684.111581] Workqueue: writeback wb_workfn (flush-9:126)
> kernel: [425684.111585] Call Trace:
> kernel: [425684.111595]  __schedule+0x24e/0x880
> kernel: [425684.111664]  ? xfs_map_blocks+0x82/0x250 [xfs]
> kernel: [425684.111668]  schedule+0x2c/0x80
> kernel: [425684.111671]  rwsem_down_read_failed+0xf0/0x160
> kernel: [425684.111675]  ? bitmap_startwrite+0x9f/0x1f0
> kernel: [425684.111679]  call_rwsem_down_read_failed+0x18/0x30
> kernel: [425684.111682]  ? call_rwsem_down_read_failed+0x18/0x30
> kernel: [425684.111685]  down_read+0x20/0x40
> kernel: [425684.111736]  xfs_ilock+0xd5/0x100 [xfs]
> kernel: [425684.111782]  xfs_map_blocks+0x82/0x250 [xfs]
> kernel: [425684.111823]  xfs_do_writepage+0x167/0x6a0 [xfs]
> kernel: [425684.111830]  ? clear_page_dirty_for_io+0x19f/0x1f0
> kernel: [425684.111834]  write_cache_pages+0x207/0x4e0
> kernel: [425684.111869]  ? xfs_vm_writepages+0xf0/0xf0 [xfs]
> kernel: [425684.111875]  ? submit_bio+0x73/0x140
> kernel: [425684.111878]  ? submit_bio+0x73/0x140
> kernel: [425684.111911]  ? xfs_setfilesize_trans_alloc.isra.13+0x3e/0x90 [xfs]
> kernel: [425684.111944]  xfs_vm_writepages+0xbe/0xf0 [xfs]
> kernel: [425684.111949]  do_writepages+0x4b/0xe0
> kernel: [425684.111954]  ? fprop_fraction_percpu+0x2f/0x80
> kernel: [425684.111958]  ? __wb_calc_thresh+0x3e/0x130
> kernel: [425684.111963]  __writeback_single_inode+0x45/0x350
> kernel: [425684.111966]  ? __writeback_single_inode+0x45/0x350
> kernel: [425684.111970]  writeback_sb_inodes+0x1e1/0x510
> kernel: [425684.111975]  __writeback_inodes_wb+0x67/0xb0
> kernel: [425684.111979]  wb_writeback+0x271/0x300
> kernel: [425684.111983]  wb_workfn+0x1bb/0x400
> kernel: [425684.111986]  ? wb_workfn+0x1bb/0x400
> kernel: [425684.111992]  process_one_work+0x1de/0x420
> kernel: [425684.111996]  worker_thread+0x32/0x410
> kernel: [425684.111999]  kthread+0x121/0x140
> kernel: [425684.112003]  ? process_one_work+0x420/0x420
> kernel: [425684.112005]  ? kthread_create_worker_on_cpu+0x70/0x70
> kernel: [425684.112009]  ret_from_fork+0x35/0x40
> kernel: [425684.112024] INFO: task kworker/74:0:9623 blocked for more than 120 seconds.
> kernel: [425684.112461]       Tainted: G           OE 4.15.0-64-generic #73-Ubuntu
> kernel: [425684.112925] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kernel: [425684.113438] kworker/74:0    D    0  9623      2 0x80000080
> kernel: [425684.113500] Workqueue: xfs-cil/md126 xlog_cil_push_work [xfs]
> kernel: [425684.113502] Call Trace:
> kernel: [425684.113508]  __schedule+0x24e/0x880
> kernel: [425684.113559]  ? xlog_bdstrat+0x2b/0x60 [xfs]
> kernel: [425684.113564]  schedule+0x2c/0x80
> kernel: [425684.113609]  xlog_state_get_iclog_space+0x105/0x2d0 [xfs]
> kernel: [425684.113614]  ? wake_up_q+0x80/0x80
> kernel: [425684.113656]  xlog_write+0x163/0x6e0 [xfs]
> kernel: [425684.113699]  xlog_cil_push+0x2a7/0x410 [xfs]
> kernel: [425684.113740]  xlog_cil_push_work+0x15/0x20 [xfs]
> kernel: [425684.113743]  process_one_work+0x1de/0x420
> kernel: [425684.113747]  worker_thread+0x32/0x410
> kernel: [425684.113750]  kthread+0x121/0x140
> kernel: [425684.113753]  ? process_one_work+0x420/0x420
> kernel: [425684.113756]  ? kthread_create_worker_on_cpu+0x70/0x70
> kernel: [425684.113759]  ret_from_fork+0x35/0x40
> 
> Other directories on the same filesystem seem fine as do other XFS
> filesystems on the same system.

Given that other directories seem to work, and judging by the first stack trace
you posted, it sounds like you've been keeping a single AG so busy that it is
almost unusable. But you didn't provide enough information for us to really make
any progress here, and to be honest I'm more inclined to point the finger at
your MD device.

Can you describe your MD device? Is it a RAID array? What kind? How many disks?
What's your filesystem configuration? (xfs_info <mount point>)
Do you have anything else in your dmesg other than these two stack traces? I'd
suggest posting the whole dmesg, not only what you think is relevant.

Better yet:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
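
The details asked for above could be gathered in one pass with something like
the sketch below. It is only an illustration, not an official tool: the
function names, the output directory and the mount point are hypothetical, and
/dev/md126 is assumed from the "xfs-cil/md126" workqueue name in the trace. It
shells out to uname, mdadm --detail, xfs_info and dmesg, reads /proc/mdstat,
and most likely needs root.

```python
import subprocess
from pathlib import Path


def diagnostic_commands(mount_point, md_device):
    """Return (label, argv) pairs for the details requested in the reply."""
    return [
        ("kernel", ["uname", "-a"]),                      # running kernel version
        ("mdstat", ["cat", "/proc/mdstat"]),              # state of all MD arrays
        ("md_detail", ["mdadm", "--detail", md_device]),  # RAID level, member disks
        ("xfs_info", ["xfs_info", mount_point]),          # filesystem geometry
        ("dmesg", ["dmesg"]),                             # the whole log, unfiltered
    ]


def collect(mount_point, md_device, out_dir):
    """Run each command and save its output; a failure is recorded, not fatal."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for label, argv in diagnostic_commands(mount_point, md_device):
        try:
            result = subprocess.run(
                argv,
                capture_output=True,
                encoding="utf-8",
                errors="replace",  # dmesg may contain bytes that are not UTF-8
                timeout=60,
            )
            text = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Missing tool, permission problem, or a command that took too long.
            text = f"could not run {' '.join(argv)}: {exc}\n"
        (out / f"{label}.txt").write_text(text, encoding="utf-8")
```

Calling collect("/mnt/data", "/dev/md126", "xfs-report") would leave one text
file per command in xfs-report/, ready to attach to a reply. Note that a
command touching the hung filesystem may itself block in uninterruptible
sleep, in which case the timeout may not be able to reap it.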

Cheers.

> 
> -- 
> Sitsofe | http://sucs.org/~sits/

-- 
Carlos


P.S. I'm removing Darrick and linux-fsdevel from CC to avoid spamming too many people.