linux-xfs.vger.kernel.org archive mirror
From: Sitsofe Wheeler <sitsofe@gmail.com>
To: Carlos Maiolino <cmaiolino@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Tasks blocking forever with XFS stack traces
Date: Tue, 5 Nov 2019 14:12:53 +0000
Message-ID: <CALjAwxiMqjfBX3tZJv3MqMQ776v1aNcwme0B-AuhmEgMNUqgMw@mail.gmail.com>
In-Reply-To: <20191105103652.n5zwf6ty3wvhti5f@orion>

On Tue, 5 Nov 2019 at 10:37, Carlos Maiolino <cmaiolino@redhat.com> wrote:
>
>
> Hi Sitsofe.
>
> ...
> > <snip>
> > > >
> > > > Other directories on the same filesystem seem fine as do other XFS
> > > > filesystems on the same system.
> > >
> > > The fact you mention other directories seems to work, and the first stack trace
> > > you posted, it sounds like you've been keeping a single AG too busy to almost
> > > make it unusable. But, you didn't provide enough information we can really make
> > > any progress here, and to be honest I'm more inclined to point the finger to
> > > your MD device.
> >
> > Let's see if we can pinpoint something :-)
> >
> > > Can you describe your MD device? RAID array? What kind? How many disks?
> >
> > RAID6 8 disks.
>
> >
> > > What's your filesystem configuration? (xfs_info <mount point>)
> >
> > meta-data=/dev/md126             isize=512    agcount=32, agsize=43954432 blks
> >          =                       sectsz=4096  attr=2, projid32bit=1
> >          =                       crc=1        finobt=1 spinodes=0 rmapbt=0
> >          =                       reflink=0
> > data     =                       bsize=4096   blocks=1406538240, imaxpct=5
> >          =                       sunit=128    swidth=768 blks
>
> > naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> > log      =internal               bsize=4096   blocks=521728, version=2
> >          =                       sectsz=4096  sunit=1 blks, lazy-count=1
>                                                 ^^^^^^  This should have been
>                                                         configured to 8 blocks, not 1
>
> > Yes there's more. See a slightly elided dmesg from a longer run on
> > https://sucs.org/~sits/test/kern-20191024.log.gz .
>
> At a first quick look, it looks like you are having lots of IO contention in the
> log, and this is slowing down the rest of the filesystem. What caught my

Should it become so slow that a task freezes entirely and never
finishes? Once the problem hits, nothing makes any further progress
on those directories, and there wasn't much generating dirty data
either.

If this were to happen again, though, what extra information would be
helpful (I'm guessing things like /proc/meminfo output)?
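
As a rough sketch of the kind of snapshot that could be taken next
time, assuming a Linux /proc layout: the Python below lists tasks in
uninterruptible sleep (state D) and pulls the Dirty and Writeback
counters out of /proc/meminfo. The helper names are made up for
illustration, and whether this is the information that would actually
help is only a guess.

```python
import os


def task_state(stat_line):
    """Return (comm, state) from the contents of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the last ')' rather than on whitespace.
    """
    left, _, right = stat_line.rpartition(")")
    comm = left.split("(", 1)[1]
    state = right.split()[0]
    return comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for processes in uninterruptible sleep ('D')."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                comm, state = task_state(f.read())
        except OSError:
            continue  # the task exited while we were scanning
        if state == "D":
            yield int(entry), comm


def dirty_kib(meminfo_text):
    """Extract the Dirty and Writeback values (in kB) from meminfo text."""
    wanted = {"Dirty", "Writeback"}
    out = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            out[key] = int(rest.split()[0])
    return out
```

Running something like `for pid, comm in blocked_tasks(): print(pid, comm)`
while the hang is in progress would give a list to compare against the
stack traces in dmesg. It only looks at the entries directly under
/proc; per-thread state lives under /proc/<pid>/task.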

> attention at first was the wrong configured log striping for the filesystem and
> I wonder if this isn't the responsible for the amount of IO contention you are
> having in the log. This might well be generating lots of RMW cycles while
> writing to the log generating the IO contention and slowing down the rest of the
> filesystem, I'll try to take a more careful look later on.

My understanding is that the md "chunk size" is 64k, so are you
saying the sectsz should have been manually set to be as big as
possible at mkfs time? I never realised this wasn't done by default
(the sunit seems to be correct given the block size of 4096, but
I'm unsure about swidth)...
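
For reference, the geometry arithmetic can be checked with a few lines
of Python. This is only a sketch: it assumes the sunit and swidth
values in the xfs_info data section are counted in 4096-byte
filesystem blocks (as the trailing "blks" suggests) and that an 8 disk
RAID6 has 6 data disks. The function name is hypothetical.

```python
BLOCK_SIZE = 4096    # bsize from the xfs_info data section
SUNIT_BLKS = 128     # sunit from the data section
SWIDTH_BLKS = 768    # swidth from the data section
RAID_DISKS = 8
RAID6_PARITY = 2     # RAID6 spends two disks' worth of space on parity


def stripe_geometry(sunit_blks, swidth_blks, block_size):
    """Return (stripe unit in bytes, stripe width in bytes, data disks)."""
    sunit_bytes = sunit_blks * block_size
    swidth_bytes = swidth_blks * block_size
    data_disks = swidth_blks // sunit_blks
    return sunit_bytes, swidth_bytes, data_disks


sunit_b, swidth_b, data_disks = stripe_geometry(SUNIT_BLKS, SWIDTH_BLKS, BLOCK_SIZE)
print(sunit_b // 1024, "KiB stripe unit")      # 512 KiB stripe unit
print(swidth_b // 1024, "KiB stripe width")    # 3072 KiB stripe width
print(data_disks, "data disks implied")        # 6 data disks implied
print(RAID_DISKS - RAID6_PARITY, "data disks expected")  # 6 data disks expected

# What a 64 KiB md chunk would be in filesystem blocks:
print(64 * 1024 // BLOCK_SIZE, "blocks")       # 16 blocks
```

Under that reading the swidth/sunit ratio of 6 matches the 6 data
disks, but the stripe unit works out to 512 KiB rather than 64 KiB
(128 units of 512 bytes would give exactly 64 KiB), so either the
chunk size or the unit assumption above deserves a second look.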

> I can't say anything if there is any bug related with the issue first because I
> honestly don't remember, second because you are using an old distro kernel which
> I have no idea to know which bug fixes have been backported or not. Maybe

Very true.

> somebody else can remember of any bug that might be related, but the amount of
> threads you have waiting for log IO, and that misconfigured striping for the log
> smells smoke to me.
>
> I let you know if I can identify anything else later.

Thanks.

--
Sitsofe | http://sucs.org/~sits/
