From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>,
	LKML <linux-kernel@vger.kernel.org>,
	Artem Bityutskiy <dedekind1@gmail.com>,
	Richard Weinberger <richard@nod.at>,
	David Woodhouse <dwmw2@infradead.org>,
	Brian Norris <computersforpeace@gmail.com>,
	Boris Brezillon <boris.brezillon@free-electrons.com>,
	Marek Vasut <marek.vasut@gmail.com>,
	Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Bob Peterson <rpeterso@redhat.com>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Philippe Ombredanne <pombredanne@nexb.com>,
	Kate Stewart <kstewart@linuxfoundation.org>,
	Mikulas Patocka <mpatocka@redhat.com>,
	linux-mtd@lists.infradead.org, linux-ext4@vger.kernel.org,
	cluster-devel@redhat.com, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: vmalloc with GFP_NOFS
Date: Wed, 9 May 2018 08:13:51 -0700
Message-ID: <20180509151351.GA4111@magnolia>
In-Reply-To: <20180509134222.GU32366@dhcp22.suse.cz>

On Wed, May 09, 2018 at 03:42:22PM +0200, Michal Hocko wrote:
> On Tue 24-04-18 13:25:42, Michal Hocko wrote:
> [...]
> > > As a suggestion, could you take
> > > documentation about how to convert to the memalloc_nofs_{save,restore}
> > > scope api (which I think you've written about e-mails at length
> > > before), and put that into a file in Documentation/core-api?
> > 
> > I can.
> 
> Does something like the below sound reasonable/helpful?
> ---
> =================================
> GFP masks used from FS/IO context
> =================================
> 
> :Date: May, 2018
> :Author: Michal Hocko <mhocko@kernel.org>
> 
> Introduction
> ============
> 
> FS resp. IO submitting code paths have to be careful when allocating

Not sure what 'FS resp. IO' means here -- 'FS and IO' ?

(Or is this one of those things where this looks like plain English text
but in reality it's some sort of markup that I'm not so familiar with?)

Confused because I've seen 'resp.' used as shorthand for
'responsible'...

> memory to prevent from potential recursion deadlocks caused by direct
> memory reclaim calling back into the FS/IO path and block on already
> held resources (e.g. locks). Traditional way to avoid this problem

'The traditional way to avoid this deadlock problem...'

> is to clear __GFP_FS resp. __GFP_IO (note the latter implies clearing
> the former as well) in the gfp mask when calling an allocator. GFP_NOFS
> resp. GFP_NOIO can be used as shortcut.
> 
> This has been the traditional way to avoid deadlocks since ages. It

I think this sentence is a little redundant with the previous sentence,
you could chop it out and join this paragraph to the one before it.

> turned out though that above approach has led to abuses when the restricted
> gfp mask is used "just in case" without a deeper consideration which leads
> to problems because an excessive use of GFP_NOFS/GFP_NOIO can lead to
> memory over-reclaim or other memory reclaim issues.
> 
> New API
> =======
> 
> Since 4.12 we do have a generic scope API for both NOFS and NOIO context
> ``memalloc_nofs_save``, ``memalloc_nofs_restore`` resp. ``memalloc_noio_save``,
> ``memalloc_noio_restore`` which allow to mark a scope to be a critical
> section from the memory reclaim recursion into FS/IO POV. Any allocation
> from that scope will inherently drop __GFP_FS resp. __GFP_IO from the given
> mask so no memory allocation can recurse back in the FS/IO.
> 
> FS/IO code then simply calls the appropriate save function right at
> the layer where a lock taken from the reclaim context (e.g. shrinker)
> is taken and the corresponding restore function when the lock is
> released. All that ideally along with an explanation what is the reclaim
> context for easier maintenance.
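
To make the scope semantics above concrete, here is a minimal userspace
sketch of how the save/restore pair behaves. It is a model, not kernel
code: the bit values, the task_flags variable (standing in for
current->flags), effective_gfp() and nesting_demo() are all assumptions
made for illustration. Only the memalloc_*_save/restore names and the
flag-dropping behaviour come from the text above.

```c
/* Userspace model of the NOFS/NOIO scope API. NOT kernel code. */
#include <assert.h>

/* Illustrative bit values; the real kernel values differ. */
#define GFP_IO_BIT        0x1u  /* models __GFP_IO */
#define GFP_FS_BIT        0x2u  /* models __GFP_FS */
#define GFP_KERNEL        (GFP_IO_BIT | GFP_FS_BIT)
#define GFP_NOFS          (GFP_IO_BIT)
#define GFP_NOIO          0x0u

#define PF_MEMALLOC_NOFS  0x1u
#define PF_MEMALLOC_NOIO  0x2u

static unsigned int task_flags;  /* stands in for current->flags */

/* Enter a NOFS scope; return the previous state so scopes can nest. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope by putting back whatever save() returned. */
static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

static unsigned int memalloc_noio_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOIO;

	task_flags |= PF_MEMALLOC_NOIO;
	return old;
}

static void memalloc_noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOIO) | old;
}

/* The mask an allocation really uses inside the current scope. */
static unsigned int effective_gfp(unsigned int gfp)
{
	if (task_flags & PF_MEMALLOC_NOIO)
		return gfp & ~(GFP_IO_BIT | GFP_FS_BIT); /* NOIO implies NOFS */
	if (task_flags & PF_MEMALLOC_NOFS)
		return gfp & ~GFP_FS_BIT;
	return gfp;
}

/* Nested scopes: the inner restore must not end the outer scope. */
static int nesting_demo(void)
{
	unsigned int outer = memalloc_nofs_save();
	unsigned int inner = memalloc_nofs_save();

	assert(effective_gfp(GFP_KERNEL) == GFP_NOFS);
	memalloc_nofs_restore(inner);
	assert(effective_gfp(GFP_KERNEL) == GFP_NOFS); /* still in outer */
	memalloc_nofs_restore(outer);
	return effective_gfp(GFP_KERNEL) == GFP_KERNEL;
}
```

Returning the old state from save() is what lets a helper open its own
scope without knowing whether a caller already opened one.
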
> 
> What about __vmalloc(GFP_NOFS)
> ==============================
> 
> vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
> GFP_KERNEL allocations deep inside the allocator which are quit non-trivial

...which are quite non-trivial...

> to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
> almost always a bug. The good news is that the NOFS/NOIO semantic can be
> achieved by the scope api.
> 
> In the ideal world, upper layers should already mark dangerous contexts
> and so no special care is required and vmalloc should be called without
> any problems. Sometimes if the context is not really clear or there are
> layering violations then the recommended way around that is to wrap ``vmalloc``
> by the scope API with a comment explaining the problem.
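
The wrapping described above can be sketched in a userspace model like
the one below. Everything in it is an assumption for illustration:
model_alloc(), model_vmalloc(), vmalloc_nofs(), the fs_reclaim_allowed
counter and the bit values are hypothetical and are not the kernel's
implementation. The only points taken from the text are that the
allocator contains a hardcoded GFP_KERNEL allocation which the caller's
mask never reaches, and that a surrounding NOFS scope covers it.

```c
/* Userspace model of wrapping vmalloc in a NOFS scope. NOT kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative bit values; the real kernel values differ. */
#define GFP_IO_BIT        0x1u  /* models __GFP_IO */
#define GFP_FS_BIT        0x2u  /* models __GFP_FS */
#define GFP_KERNEL        (GFP_IO_BIT | GFP_FS_BIT)
#define GFP_NOFS          (GFP_IO_BIT)
#define PF_MEMALLOC_NOFS  0x1u

static unsigned int task_flags;   /* stands in for current->flags */
static int fs_reclaim_allowed;    /* allocations that could recurse into FS */

static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Lowest-level allocator: applies the scope, then records the result. */
static void *model_alloc(size_t size, unsigned int gfp)
{
	if (task_flags & PF_MEMALLOC_NOFS)
		gfp &= ~GFP_FS_BIT;
	if (gfp & GFP_FS_BIT)
		fs_reclaim_allowed++;
	return malloc(size);
}

/* The caller's mask reaches the data, but not the internal allocation. */
static void *model_vmalloc(size_t size, unsigned int gfp)
{
	void *internal = model_alloc(64, GFP_KERNEL); /* hardcoded mask */
	void *data = model_alloc(size, gfp);

	free(internal);
	return data;
}

/*
 * Hypothetical caller that holds a lock which reclaim may also take,
 * so no allocation below it may recurse into the filesystem.
 */
static void *vmalloc_nofs(size_t size)
{
	unsigned int flags = memalloc_nofs_save();
	void *p = model_vmalloc(size, GFP_KERNEL);

	memalloc_nofs_restore(flags);
	return p;
}
```

In this model, model_vmalloc(size, GFP_NOFS) still performs one
allocation that is allowed to enter the filesystem, while
vmalloc_nofs(size) performs none.
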

Otherwise looks ok to me based on my understanding of how all this is
supposed to work...

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> -- 
> Michal Hocko
> SUSE Labs
