Fwd: XFS Memory allocation deadlock in kmem_alloc

linux-xfs.vger.kernel.org archive mirror

* Fwd: XFS Memory allocation deadlock in kmem_alloc
       [not found] <CAKQeeLMxJR-ToX5HG9Q-z0-AL9vZG-OMjHyM+rnEEBP6k6nxHw@mail.gmail.com>
@ 2019-11-15 19:11 ` Andrew Carr
  2019-11-15 19:52   ` Eric Sandeen
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Carr @ 2019-11-15 19:11 UTC
  To: linux-xfs

Hello,

This list has recommended enabling stack traces to determine the root
cause of issues with XFS deadlocks occurring in Centos 7.7
(3.10.0-1062).

Based on what was recommended by Eric Sandeen, we have tried updating
the following files to generate XFS stack traces:

# echo 11 > /proc/sys/fs/xfs/error_level

And

# echo 3 > /proc/sys/fs/xfs/error_level

But no stack traces are printed to dmesg.  I was thinking of
re-compiling the kernel with debug flags enabled.  Do you think this
is necessary?
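Both commands above write to the same file, so the second write replaces the first and leaves error_level at 3. A minimal sketch of setting the knob and reading it back could look like the following. The helper name set_xfs_error_level is hypothetical, and the 0-11 range is an assumption taken from the kernel's XFS sysctl documentation, not from this thread.

```shell
# Hypothetical helper: validate an XFS error_level value, write it, and
# confirm the kernel kept it.
# Assumption: valid values are the integers 0 through 11.
set_xfs_error_level() {
    level=$1
    knob=${2:-/proc/sys/fs/xfs/error_level}

    case $level in
        ''|*[!0-9]*)
            echo "error_level must be a non-negative integer" >&2
            return 2
            ;;
    esac
    if [ "$level" -gt 11 ]; then
        echo "error_level must be between 0 and 11" >&2
        return 2
    fi

    echo "$level" > "$knob" || return 1

    # Read the value back so an overwritten or ignored setting is noticed.
    [ "$(cat "$knob")" = "$level" ]
}

# Example (as root): set_xfs_error_level 11
```

Running the helper once with 11, and not following it with a second write, keeps the higher level in effect while the problem is reproduced.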

Thanks so much for your time and keep up the good work!

Sincerely,
--
Andrew Carr | Enterprise Architect
Rogue Wave Software, Inc.
Innovate with Confidence
P 720.295.8044
www.roguewave.com | andrew.carr@roguewave.com

--
With Regards,
Andrew Carr


* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-15 19:11 ` Fwd: XFS Memory allocation deadlock in kmem_alloc Andrew Carr
@ 2019-11-15 19:52   ` Eric Sandeen
  2019-11-15 23:43     ` Dave Chinner
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Sandeen @ 2019-11-15 19:52 UTC
  To: Andrew Carr, linux-xfs

On 11/15/19 1:11 PM, Andrew Carr wrote:
> Hello,
> 
> This list has recommended enabling stack traces to determine the root
> cause of issues with XFS deadlocks occurring in Centos 7.7
> (3.10.0-1062).
> 
> Based on what was recommended by Eric Sandeen, we have tried updating
> the following files to generate XFS stack traces:
> 
> # echo 11 > /proc/sys/fs/xfs/error_level
> 
> And
> 
> # echo 3 > /proc/sys/fs/xfs/error_level
> 
> But no stack traces are printed to dmesg.  I was thinking of
> re-compiling the kernel with debug flags enabled.  Do you think this
> is necessary?
> 
> Thanks so much for your time and keep up the good work!

I've looked over the way xfs_err() gets defined, and I cannot see how
we can call xfs_err with error_level == 11 and not get a stack trace.

Maybe other eyes can spot something...

-Eric


* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-15 19:52   ` Eric Sandeen
@ 2019-11-15 23:43     ` Dave Chinner
  2019-11-16 16:19       ` Andrew Carr
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2019-11-15 23:43 UTC
  To: Eric Sandeen; +Cc: Andrew Carr, linux-xfs

On Fri, Nov 15, 2019 at 01:52:57PM -0600, Eric Sandeen wrote:
> On 11/15/19 1:11 PM, Andrew Carr wrote:
> > Hello,
> > 
> > This list has recommended enabling stack traces to determine the root
> > cause of issues with XFS deadlocks occurring in Centos 7.7
> > (3.10.0-1062).
> > 
> > Based on what was recommended by Eric Sandeen, we have tried updating
> > the following files to generate XFS stack traces:
> > 
> > # echo 11 > /proc/sys/fs/xfs/error_level
> > 
> > And
> > 
> > # echo 3 > /proc/sys/fs/xfs/error_level
> > 
> > But no stack traces are printed to dmesg.  I was thinking of
> > re-compiling the kernel with debug flags enabled.  Do you think this
> > is necessary?

dmesg -n 7 will remove all filters on the console/dmesg output - if
you've turned this down in the past you may not be seeing messages
of the error level XFS is using...

Did you check syslog - that should have all the unfiltered messages
in it...
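One way to check whether the console log level has been lowered is to read the first field of /proc/sys/kernel/printk before changing anything. This is only a sketch: the helper name console_loglevel is hypothetical, and the log locations in the comments are assumptions about a typical CentOS 7 host.

```shell
# Print the console loglevel, which is the first field of
# /proc/sys/kernel/printk. The optional argument lets the function read
# a saved copy of that file instead.
console_loglevel() {
    printk_file=${1:-/proc/sys/kernel/printk}
    awk '{ print $1 }' "$printk_file"
}

if [ -r /proc/sys/kernel/printk ]; then
    echo "console loglevel: $(console_loglevel)"
fi

# Follow-up steps, run as root (paths and tools are assumptions):
#   dmesg -n 7                        # raise the console log level
#   grep -i xfs /var/log/messages     # syslog copy of kernel messages
#   journalctl -k | grep -i xfs       # kernel messages from the journal
```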

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-15 23:43     ` Dave Chinner
@ 2019-11-16 16:19       ` Andrew Carr
  2019-11-19 15:49         ` Andrew Carr
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Carr @ 2019-11-16 16:19 UTC
  To: Dave Chinner; +Cc: Eric Sandeen, linux-xfs

Thanks Dave,
Checking now.


-- 
With Regards,
Andrew Carr

e. andrewlanecarr@gmail.com
w. andrew.carr@openlogic.com
c. 4239489206
a. P.O. Box 1231, Greeneville, TN, 37744


* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-16 16:19       ` Andrew Carr
@ 2019-11-19 15:49         ` Andrew Carr
  2019-11-19 20:20           ` Dave Chinner
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Carr @ 2019-11-19 15:49 UTC
  To: Dave Chinner; +Cc: Eric Sandeen, linux-xfs

Dave / Eric / Others,

Syslog: https://pastebin.com/QYQYpPFY

Dmesg: https://pastebin.com/MdBCPmp9

-Andrew Carr



* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-19 15:49         ` Andrew Carr
@ 2019-11-19 20:20           ` Dave Chinner
  2019-11-20 15:43             ` Andrew Carr
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2019-11-19 20:20 UTC
  To: Andrew Carr; +Cc: Eric Sandeen, linux-xfs

On Tue, Nov 19, 2019 at 10:49:56AM -0500, Andrew Carr wrote:
> Dave / Eric / Others,
> 
> Syslog: https://pastebin.com/QYQYpPFY
> 
> Dmesg: https://pastebin.com/MdBCPmp9

which shows no stack traces, again.



Anyway, you've twiddled mkfs knobs on these filesystems, and that
is the likely cause of the issue: the filesystem is using 64k
directory blocks - the allocation size is larger than 64kB:

[Sun Nov 17 21:40:05 2019] XFS: nginx(31293) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x250)
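The size in that message can be compared against 64 KiB directly. The sketch below pulls the number out of the log line and prints how far it exceeds 65536 bytes. The function name alloc_size_from_log is hypothetical, and the sed pattern assumes the message wording shown above.

```shell
# Extract the byte count that follows "deadlock size" in an XFS
# "possible memory allocation deadlock" message read from stdin.
alloc_size_from_log() {
    sed -n 's/.*possible memory allocation deadlock size \([0-9][0-9]*\).*/\1/p'
}

line='[Sun Nov 17 21:40:05 2019] XFS: nginx(31293) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x250)'

size=$(printf '%s\n' "$line" | alloc_size_from_log)
over=$((size - 64 * 1024))

echo "requested $size bytes, $over bytes more than 64 KiB"
```

For the line quoted above this reports 65728 bytes, which is 192 bytes over 64 KiB, so the request needs physically contiguous memory larger than one 64 KiB chunk.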

Upstream fixed this some time ago:

$ ▶ gl -n 1 -p cb0a8d23024e
commit cb0a8d23024e7bd234dea4d0fc5c4902a8dda766
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Mar 6 17:03:28 2018 -0800

    xfs: fall back to vmalloc when allocation log vector buffers
    
    When using large directory blocks, we regularly see memory
    allocations of >64k being made for the shadow log vector buffer.
    When we are under memory pressure, kmalloc() may not be able to find
    contiguous memory chunks large enough to satisfy these allocations
    easily, and if memory is fragmented we can potentially stall here.
    
    TO avoid this problem, switch the log vector buffer allocation to
    use kmem_alloc_large(). This will allow failed allocations to fall
    back to vmalloc and so remove the dependency on large contiguous
    regions of memory being available. This should prevent slowdowns
    and potential stalls when memory is low and/or fragmented.
    
    Signed-Off-By: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-19 20:20           ` Dave Chinner
@ 2019-11-20 15:43             ` Andrew Carr
  2019-11-22 14:08               ` Andrew Carr
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Carr @ 2019-11-20 15:43 UTC
  To: Dave Chinner; +Cc: Eric Sandeen, linux-xfs

Genius Dave, Thanks so much!


* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-20 15:43             ` Andrew Carr
@ 2019-11-22 14:08               ` Andrew Carr
  2019-11-22 16:12                 ` Darrick J. Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Carr @ 2019-11-22 14:08 UTC
  To: Dave Chinner; +Cc: Eric Sandeen, linux-xfs

Hi Dave  / Others,

It appears upgrading to 4.17+ has indeed fixed the deadlock issue, or
at least no deadlocks are occurring now.
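A rough way to check whether a host is already on such a kernel is to compare the running version against 4.17. This is a sketch only: the 4.17 threshold is taken from the report above and has not been checked against the git history here, the helper name version_ge is hypothetical, and it relies on the -V option of GNU sort. Distribution kernels may backport the fix, so the version number is a hint and not proof.

```shell
# version_ge A B: succeed when version A sorts at or after version B.
# Uses GNU sort -V (version sort); not available in every sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if version_ge "$(uname -r)" 4.17; then
    echo "running kernel is 4.17 or newer"
else
    echo "running kernel is older than 4.17"
fi

# In a kernel git tree, the tags containing the fix can be listed with:
#   git tag --contains cb0a8d23024e
```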

There are segfaults in xfs_db appearing now though.  I am attempting
to get the full syslog, here is an example.... thoughts?

[Thu Nov 21 10:43:20 2019] xfs_db[13076]: segfault at 12ff6001 ip
0000000000407922 sp 00007ffe1a27b2e0 error 4 in xfs_db[400000+8a000]
[Thu Nov 21 10:43:20 2019] Code: 89 cc 55 48 89 d5 53 48 89 f3 48 83
ec 48 0f b6 57 01 44 0f b6 4f 02 64 48 8b 04 25 28 00 00 00 48 89 44
24 38 31 c0 0f b6 07 <44> 0f b6 57 0d 48 8d 74 24 10 c1 e2 10 41 c1 e1
08 c1 e0 18 41 c1
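The fields in that segfault line can be decoded a little before a core dump is available. The sketch below computes the offset of the faulting instruction inside xfs_db and interprets the error value using the x86 page-fault error code bits (bit 0 protection fault, bit 1 write, bit 2 user mode). The helper name decode_pf_error is hypothetical, and the addr2line hint assumes a binary with symbols that matches the one that crashed.

```shell
# Decode the low three bits of an x86 page-fault error code.
decode_pf_error() {
    err=$1
    if [ $((err & 4)) -ne 0 ]; then mode=user; else mode=kernel; fi
    if [ $((err & 2)) -ne 0 ]; then access=write; else access=read; fi
    if [ $((err & 1)) -ne 0 ]; then cause=protection-violation; else cause=page-not-present; fi
    echo "$mode $access $cause"
}

# Values copied from the log line above.
ip=0x407922      # instruction pointer
base=0x400000    # start of the xfs_db mapping
error=4

offset=$(( $ip - $base ))
printf 'offset into xfs_db: 0x%x\n' "$offset"
echo "fault type: $(decode_pf_error "$error")"

# With matching debug symbols, the address could be resolved with:
#   addr2line -f -e /usr/sbin/xfs_db 0x407922
```

Here the offset is 0x7922 and error 4 decodes to a user-mode read of a page that is not mapped, which is consistent with xfs_db dereferencing a bad pointer.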

Thanks so much in advance!
Andrew


* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
  2019-11-22 14:08               ` Andrew Carr
@ 2019-11-22 16:12                 ` Darrick J. Wong
       [not found]                   ` <CAC752A=7x+gh9Jr8-koQtuZDvMzrs6qRc+saj=TMC3js9EdHbg@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Darrick J. Wong @ 2019-11-22 16:12 UTC
  To: Andrew Carr; +Cc: Dave Chinner, Eric Sandeen, linux-xfs

On Fri, Nov 22, 2019 at 09:08:26AM -0500, Andrew Carr wrote:
> Hi Dave  / Others,
> 
> It appears upgrading to 4.17+ has indeed fixed the deadlock issue, or
> at least no deadlocks are occurring now.
> 
> There are segfaults in xfs_db appearing now though.  I am attempting
> to get the full syslog, here is an example.... thoughts?
> 
> [Thu Nov 21 10:43:20 2019] xfs_db[13076]: segfault at 12ff6001 ip
> 0000000000407922 sp 00007ffe1a27b2e0 error 4 in xfs_db[400000+8a000]
> [Thu Nov 21 10:43:20 2019] Code: 89 cc 55 48 89 d5 53 48 89 f3 48 83
> ec 48 0f b6 57 01 44 0f b6 4f 02 64 48 8b 04 25 28 00 00 00 48 89 44
> 24 38 31 c0 0f b6 07 <44> 0f b6 57 0d 48 8d 74 24 10 c1 e2 10 41 c1 e1
> 08 c1 e0 18 41 c1

Actual coredumps of the crashed xfs_db would help.

--D


[parent not found: <CAC752A=7x+gh9Jr8-koQtuZDvMzrs6qRc+saj=TMC3js9EdHbg@mail.gmail.com>]

* Re: Fwd: XFS Memory allocation deadlock in kmem_alloc
       [not found]                   ` <CAC752A=7x+gh9Jr8-koQtuZDvMzrs6qRc+saj=TMC3js9EdHbg@mail.gmail.com>
@ 2019-11-22 18:49                     ` Darrick J. Wong
  0 siblings, 0 replies; 10+ messages in thread
From: Darrick J. Wong @ 2019-11-22 18:49 UTC
  To: Blake Golliher; +Cc: Andrew Carr, Dave Chinner, Eric Sandeen, linux-xfs

On Fri, Nov 22, 2019 at 10:17:33AM -0800, Blake Golliher wrote:
> Where would those core dumps be?  Are they automatically dumped or do we
> have to set a flag, then trigger the condition?

ulimit -c 9999999999, then run whatever it was you were running that
invokes xfs_db.
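Where the core file ends up depends on /proc/sys/kernel/core_pattern, so it can help to check that value in the same shell before reproducing the crash. The following is a sketch under assumptions: the helper name core_destination is hypothetical, and "ulimit -c unlimited" is used here in place of the large number suggested above.

```shell
# Describe where the kernel will send a core dump for a given
# core_pattern value: a leading "|" means it is piped to a program.
core_destination() {
    case $1 in
        '|'*) echo "piped to handler: ${1#?}" ;;
        *)    echo "written to file pattern: $1" ;;
    esac
}

# Raise the core size limit for this shell and its children; this can
# fail when the hard limit is lower.
ulimit -c unlimited 2>/dev/null || echo "could not raise core size limit" >&2

if [ -r /proc/sys/kernel/core_pattern ]; then
    core_destination "$(cat /proc/sys/kernel/core_pattern)"
fi

# Then run, from this same shell, whatever invokes xfs_db.
```

If the pattern is piped to a handler, the dump is collected by that program and will not appear as a plain core file in the working directory.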

--D

