* implicit AOP_FLAG_NOFS for grab_cache_page_write_begin
From: Michal Hocko @ 2020-04-15 7:02 UTC
To: Christoph Hellwig, Darrick J. Wong; +Cc: linux-mm, linux-fsdevel, LKML
Hi,
I have just received a bug report about memcg OOM [1]. The underlying
issue is memcg specific but the stack trace made me look at the write(2)
path and I have noticed that iomap_write_begin enforces AOP_FLAG_NOFS
which means that all the page cache that has to be allocated is
GFP_NOFS. What is the reason for this? Do all filesystems really need
the reclaim protection? I was hoping that those filesystems which really
need NOFS context would be using the scope API
(memalloc_nofs_{save,restore}).
Could you clarify please?
[1] http://lkml.kernel.org/r/20200414212558.58eaab4de2ecf864eaa87e5d@linux-foundation.org
--
Michal Hocko
SUSE Labs
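
For readers following along, the per-call behaviour questioned above can be modelled in user space roughly as follows. This is only a sketch and not kernel source: the bit values are made up, and page_cache_gfp() and write_begin_gfp() are hypothetical helper names. Only AOP_FLAG_NOFS, FGP_NOFS, GFP_KERNEL and GFP_NOFS, and the idea that the NOFS flag clears the __GFP_FS bit from the mapping's allocation mask, are taken from the kernel.

```c
/* User-space model, not kernel code. All bit values are illustrative. */
#include <assert.h>

#define GFP_BIT_IO      0x1u	/* stands in for __GFP_IO */
#define GFP_BIT_FS      0x2u	/* stands in for __GFP_FS */
#define GFP_BIT_RECLAIM 0x4u	/* stands in for __GFP_RECLAIM */
#define GFP_KERNEL (GFP_BIT_RECLAIM | GFP_BIT_IO | GFP_BIT_FS)
#define GFP_NOFS   (GFP_BIT_RECLAIM | GFP_BIT_IO)

#define AOP_FLAG_NOFS 0x1u
#define FGP_CREAT     0x1u
#define FGP_NOFS      0x2u

/* Hypothetical helper: mask a page cache allocation would end up using. */
static unsigned int page_cache_gfp(unsigned int mapping_gfp,
				   unsigned int fgp_flags)
{
	if (fgp_flags & FGP_NOFS)
		mapping_gfp &= ~GFP_BIT_FS;	/* reclaim may not enter the fs */
	return mapping_gfp;
}

/* Hypothetical helper: translate the AOP flag into the FGP flag. */
static unsigned int write_begin_gfp(unsigned int mapping_gfp,
				    unsigned int aop_flags)
{
	unsigned int fgp_flags = FGP_CREAT;

	if (aop_flags & AOP_FLAG_NOFS)
		fgp_flags |= FGP_NOFS;
	return page_cache_gfp(mapping_gfp, fgp_flags);
}

/* A caller that always passes AOP_FLAG_NOFS never allocates with the FS bit. */
static void self_check(void)
{
	assert(write_begin_gfp(GFP_KERNEL, AOP_FLAG_NOFS) == GFP_NOFS);
	assert(write_begin_gfp(GFP_KERNEL, 0) == GFP_KERNEL);
}
```

In this model the restriction is decided at the call site, so every filesystem going through such a caller gets GFP_NOFS page cache allocations whether it needs them or not.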
* Re: implicit AOP_FLAG_NOFS for grab_cache_page_write_begin
From: Christoph Hellwig @ 2020-04-17 7:29 UTC
To: Michal Hocko; +Cc: Darrick J. Wong, linux-mm, linux-fsdevel, LKML, Dave Chinner
On Wed, Apr 15, 2020 at 09:02:28AM +0200, Michal Hocko wrote:
> Hi,
> I have just received a bug report about memcg OOM [1]. The underlying
> issue is memcg specific but the stack trace made me look at the write(2)
> path and I have noticed that iomap_write_begin enforces AOP_FLAG_NOFS
> which means that all the page cache that has to be allocated is
> GFP_NOFS. What is the reason for this? Do all filesystems really need
> the reclaim protection? I was hoping that those filesystems which really
> need NOFS context would be using the scope API
> (memalloc_nofs_{save,restore}).
This comes from the historic XFS code, and this commit from Dave
in particular:
commit aea1b9532143218f8599ecedbbd6bfbf812385e1
Author: Dave Chinner <dchinner@redhat.com>
Date: Tue Jul 20 17:54:12 2010 +1000
xfs: use GFP_NOFS for page cache allocation
Avoid a lockdep warning by preventing page cache allocation from
recursing back into the filesystem during memory reclaim.
* Re: implicit AOP_FLAG_NOFS for grab_cache_page_write_begin
From: Michal Hocko @ 2020-04-17 8:00 UTC
To: Christoph Hellwig
Cc: Darrick J. Wong, linux-mm, linux-fsdevel, LKML, Dave Chinner
On Fri 17-04-20 00:29:31, Christoph Hellwig wrote:
> On Wed, Apr 15, 2020 at 09:02:28AM +0200, Michal Hocko wrote:
> > Hi,
> > I have just received a bug report about memcg OOM [1]. The underlying
> > issue is memcg specific but the stack trace made me look at the write(2)
> > path and I have noticed that iomap_write_begin enforces AOP_FLAG_NOFS
> > which means that all the page cache that has to be allocated is
> > GFP_NOFS. What is the reason for this? Do all filesystems really need
> > the reclaim protection? I was hoping that those filesystems which really
> > need NOFS context would be using the scope API
> > (memalloc_nofs_{save,restore}).
>
> This comes from the historic XFS code, and this commit from Dave
> in particular:
>
> commit aea1b9532143218f8599ecedbbd6bfbf812385e1
> Author: Dave Chinner <dchinner@redhat.com>
> Date: Tue Jul 20 17:54:12 2010 +1000
>
> xfs: use GFP_NOFS for page cache allocation
>
> Avoid a lockdep warning by preventing page cache allocation from
> recursing back into the filesystem during memory reclaim.
Thanks for digging this up! The changelog is not really clear whether
NOFS is to avoid false positive lockdep warnings or real ones. If the
former, then we have grown the __GFP_NOLOCKDEP flag to work around the
problem; if the latter, then can we use memalloc_nofs_{save,restore} in
the xfs specific code please?
--
Michal Hocko
SUSE Labs
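
As an illustration of the scope API mentioned above, here is a small user-space model of how memalloc_nofs_{save,restore} behaves. It is a sketch under stated assumptions, not kernel code: task_flags stands in for current->flags, the bit values are invented, and the three functions only imitate the semantics of their kernel namesakes (save sets a per-task flag and returns its previous state, restore puts that state back, and the allocator drops the FS bit while the flag is set).

```c
/* User-space model, not kernel code. All bit values are illustrative. */
#include <assert.h>

#define GFP_BIT_IO      0x1u	/* stands in for __GFP_IO */
#define GFP_BIT_FS      0x2u	/* stands in for __GFP_FS */
#define GFP_BIT_RECLAIM 0x4u	/* stands in for __GFP_RECLAIM */
#define GFP_KERNEL (GFP_BIT_RECLAIM | GFP_BIT_IO | GFP_BIT_FS)
#define GFP_NOFS   (GFP_BIT_RECLAIM | GFP_BIT_IO)

#define PF_MEMALLOC_NOFS 0x1u

static unsigned int task_flags;	/* assumption: models current->flags */

/* Enter a NOFS scope and return the previous state of the flag. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Leave the scope by putting back the state returned by the save call. */
static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Mask the allocator would actually use for a request made right now. */
static unsigned int current_gfp_context(unsigned int gfp)
{
	if (task_flags & PF_MEMALLOC_NOFS)
		gfp &= ~GFP_BIT_FS;
	return gfp;
}

/* Scopes nest: only the outermost restore re-enables the FS bit. */
static void self_check(void)
{
	unsigned int outer, inner;

	assert(current_gfp_context(GFP_KERNEL) == GFP_KERNEL);
	outer = memalloc_nofs_save();
	inner = memalloc_nofs_save();
	memalloc_nofs_restore(inner);
	assert(current_gfp_context(GFP_KERNEL) == GFP_NOFS);
	memalloc_nofs_restore(outer);
	assert(current_gfp_context(GFP_KERNEL) == GFP_KERNEL);
}
```

The point of the scoped form is that the restriction follows the code section that actually holds the problematic lock or transaction, so callers outside that section keep their full GFP_KERNEL mask.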
* Re: implicit AOP_FLAG_NOFS for grab_cache_page_write_begin
From: Christoph Hellwig @ 2020-04-17 8:06 UTC
To: Michal Hocko
Cc: Christoph Hellwig, Darrick J. Wong, linux-mm, linux-fsdevel,
LKML, Dave Chinner
On Fri, Apr 17, 2020 at 10:00:03AM +0200, Michal Hocko wrote:
> > commit aea1b9532143218f8599ecedbbd6bfbf812385e1
> > Author: Dave Chinner <dchinner@redhat.com>
> > Date: Tue Jul 20 17:54:12 2010 +1000
> >
> > xfs: use GFP_NOFS for page cache allocation
> >
> > Avoid a lockdep warning by preventing page cache allocation from
> > recursing back into the filesystem during memory reclaim.
>
> Thanks for digging this up! The changelog is not really clear whether
> NOFS is to avoid false positive lockdep warnings or real ones. If the
> former, then we have grown the __GFP_NOLOCKDEP flag to work around the
> problem; if the latter, then can we use memalloc_nofs_{save,restore} in
> the xfs specific code please?
As far as I can tell, we are never in a file system transaction in XFS
when allocating page cache pages. We do, however, usually have i_rwsem
locked (or, back in the day, the XFS-specific predecessor). I'm not
sure what the current issues are, but maybe Dave remembers. When in doubt,
we should try removing the flag and running heavy stress testing with
lockdep enabled to see if it screams.