From: Brian Foster <bfoster@redhat.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: mhocko@kernel.org, david@fromorbit.com, dchinner@redhat.com,
	hch@lst.de, mgorman@suse.de, viro@ZenIV.linux.org.uk,
	linux-mm@kvack.org, hannes@cmpxchg.org,
	linux-kernel@vger.kernel.org, darrick.wong@oracle.com,
	linux-xfs@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] mm, vmscan: account the number of isolated pages per zone
Date: Mon, 6 Feb 2017 09:35:33 -0500
Message-ID: <20170206143533.GC57865@bfoster.bfoster>
In-Reply-To: <201702061529.ABC60444.FFFJOOHLVQSMtO@I-love.SAKURA.ne.jp>

On Mon, Feb 06, 2017 at 03:29:24PM +0900, Tetsuo Handa wrote:
> Brian Foster wrote:
> > On Fri, Feb 03, 2017 at 03:50:09PM +0100, Michal Hocko wrote:
> > > [Let's CC more xfs people]
> > > 
> > > On Fri 03-02-17 19:57:39, Tetsuo Handa wrote:
> > > [...]
> > > > (1) I got an assertion failure.
> > > 
> > > I suspect this is a result of
> > > http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
> > > I have no idea what the assert means though.
> > > 
> > > > 
> > > > [  969.626518] Killed process 6262 (oom-write) total-vm:2166856kB, anon-rss:1128732kB, file-rss:4kB, shmem-rss:0kB
> > > > [  969.958307] oom_reaper: reaped process 6262 (oom-write), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > > [  972.114644] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867
> > 
> > Indirect block reservation underrun on delayed allocation extent merge.
> > These extra blocks are used for the inode bmap btree when a delalloc
> > extent is converted to physical blocks. We're in a case where we expect
> > to only ever free excess blocks due to a merge of extents with
> > independent reservations, but a situation occurs where we actually need
> > blocks and hence the assert fails. This can occur if an extent is merged
> > with one that has a reservation less than the expected worst case
> > reservation for its size (due to previous extent splits due to hole
> > punches, for example). Therefore, I think the core expectation that
> > xfs_bmap_add_extent_hole_delay() will always have enough blocks
> > pre-reserved is invalid.
> > 
> > Can you describe the workload that reproduces this? FWIW, I think the
> > way xfs_bmap_add_extent_hole_delay() currently works is likely broken
> > and have a couple patches to fix up indlen reservation that I haven't
> > posted yet. The diff that deals with this particular bit is appended.
> > Care to give that a try?
> 
> The workload is to write to a single file on XFS from 10 processes demonstrated at
> http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp
> using "while :; do ./oom-write; done" loop on a VM with 4CPUs / 2048MB RAM.
> With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures.
> 

Thanks for testing. Well, that's an interesting workload. I couldn't
reproduce it in a few quick tries in a similarly configured VM.

Normally I'd expect to see this kind of thing on a hole punching
workload or dealing with large, sparse files that make use of
speculative preallocation (post-eof blocks allocated in anticipation of
file extending writes). I'm wondering if what is happening here is that
the appending writes and file closes due to oom kills are generating
speculative preallocs and prealloc truncates, respectively, and that
causes prealloc extents at the eof boundary to be split up and then
re-merged by surviving appending writers.

/tmp/file _is_ on an XFS filesystem in your test, correct? If so and if
you still have the output file from a test that reproduced, could you
get the 'xfs_io -c "fiemap -v" <file>' output?

I suppose another possibility is that prealloc occurs, write failure(s)
lead to extent splits via unmapping the target range of the write, and
then surviving writers generate the warning on a delalloc extent merge.

Brian

> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
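
[Illustrative note, not part of the original message.] The indlen underrun
described in the quoted text, and the effect of an XFS_FILBLKS_MIN()-style
clamp, can be sketched in a few lines of standalone C. Everything here is an
assumption made for illustration: toy_worst_indlen() is a made-up estimate and
not the kernel's worst-case calculation, the helper names are hypothetical,
and the check uses >= because a clamped value may equal the old reservation.
It is a sketch of the idea only, not the patch that was tested.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t filblks_t;

/* Same shape as a "min of two block counts" macro; the name is illustrative. */
#define FILBLKS_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Toy worst-case indirect-block estimate for a delalloc extent of 'len'
 * blocks. ASSUMPTION: an invented formula, only so the numbers work out.
 */
static filblks_t toy_worst_indlen(filblks_t len)
{
	return (len + 7) / 8 + 1;
}

/*
 * Reservation wanted after merging extents whose existing reservations sum
 * to 'oldlen' into one extent of 'merged_len' blocks. Clamping to 'oldlen'
 * means the merge never asks for more blocks than were already reserved.
 */
static filblks_t merged_indlen_clamped(filblks_t oldlen, filblks_t merged_len)
{
	return FILBLKS_MIN(toy_worst_indlen(merged_len), oldlen);
}

/* Blocks handed back by the merge; valid only if nothing extra is needed. */
static filblks_t blocks_freed(filblks_t oldlen, filblks_t newlen)
{
	assert(oldlen >= newlen);
	return oldlen - newlen;
}
```

With these toy numbers, two 64-block extents that each carry a full
reservation of 9 blocks merge into a 128-block extent needing 17, so one block
is freed. If one of them was left with only 2 blocks after an earlier split,
oldlen is 11 while the unclamped worst case is still 17; that is the underrun.
The clamp yields 11, so nothing is freed and nothing extra is requested.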
