* [PATCH 0/4] extend vmalloc support for constrained allocations
From: Michal Hocko @ 2021-10-25 15:02 UTC
To: linux-mm
Cc: Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig,
    Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton

Hi,
this was previously posted as an RFC [1]. There seemed to be no
fundamental disagreement about the approach, so I am dropping the RFC
tag, and I have also integrated some feedback from that discussion.

Based on a recent discussion with Dave and Neil [2] I have tried to
implement NOFS, NOIO and NOFAIL support for vmalloc to make life easier
for kvmalloc users.

A requirement for NOFAIL support for kvmalloc was new to me but it
seems to be really needed by the xfs code.

NOFS/NOIO was a known, long-term problem which was hoped to be handled
by the scope API. Those scopes should have been used at the reclaim
recursion boundaries, both to document them and to remove the need for
NOFS/NOIO constraints on all allocations within that scope. Instead,
workarounds were developed to wrap a single allocation (like
ceph_kvmalloc).

The first patch implements NOFS/NOIO support for vmalloc. The second one
adds NOFAIL support. The third one is more explicit about the supported
gfp flags, and the fourth one bundles everything together into kvmalloc
and drops ceph_kvmalloc, which can use kvmalloc directly now.

Please note that this is RFC and I haven't done any testing on this yet.
I hope I haven't missed anything in the vmalloc allocator. It would be
really great if Christoph and Uladzislau could have a look.

Thanks!

[1] http://lkml.kernel.org/r/20211018114712.9802-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/163184741778.29351.16920832234899124642.stgit@noble.brown

* [PATCH 1/4] mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc
From: Michal Hocko @ 2021-10-25 15:02 UTC
To: linux-mm
Cc: Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig,
    Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton,
    Michal Hocko

From: Michal Hocko <mhocko@suse.com>

vmalloc historically hasn't supported GFP_NO{FS,IO} requests because
page table allocations do not support an externally provided gfp mask
and perform GFP_KERNEL-like allocations.

For a few years now we have had the scope
(memalloc_no{fs,io}_{save,restore}) APIs to enforce NOFS and NOIO
constraints implicitly on all allocators within the scope. There was a
hope that those scopes would be defined at a higher level, where the
reclaim recursion boundary starts/stops (e.g. when a lock required
during the memory reclaim is taken etc.). It seems that not all
NOFS/NOIO users have adopted this approach; instead they have worked
around it by wrapping a single [k]vmalloc allocation in a scope API
call.

These workarounds serve neither the purpose of better documenting the
reclaim recursion nor that of reducing explicit GFP_NO{FS,IO} usage, so
let's just provide those users with the semantic they are asking for,
without a need for workarounds.

Add support for GFP_NOFS and GFP_NOIO to vmalloc directly. All internal
allocations already comply with the given gfp_mask. The only current
exception is vmap_pages_range, which maps kernel page tables. Infer the
proper scope API based on the given gfp mask.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/vmalloc.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d77830ff604c..c6cc77d2f366 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2889,6 +2889,8 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	unsigned long array_size;
 	unsigned int nr_small_pages = size >> PAGE_SHIFT;
 	unsigned int page_order;
+	unsigned int flags;
+	int ret;
 
 	array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
 	gfp_mask |= __GFP_NOWARN;
@@ -2930,8 +2932,24 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		goto fail;
 	}
 
-	if (vmap_pages_range(addr, addr + size, prot, area->pages,
-			page_shift) < 0) {
+	/*
+	 * page tables allocations ignore external gfp mask, enforce it
+	 * by the scope API
+	 */
+	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+		flags = memalloc_nofs_save();
+	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+		flags = memalloc_noio_save();
+
+	ret = vmap_pages_range(addr, addr + size, prot, area->pages,
+			page_shift);
+
+	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+		memalloc_nofs_restore(flags);
+	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+		memalloc_noio_restore(flags);
+
+	if (ret < 0) {
 		warn_alloc(gfp_mask, NULL,
 			"vmalloc error: size %lu, failed to map pages",
 			area->nr_pages * PAGE_SIZE);
-- 
2.30.2
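
The mask-to-scope mapping in the patch above can be modelled outside the kernel. What follows is a minimal userspace sketch, not kernel code: the bit values GFP_IO_BIT and GFP_FS_BIT, the enum reclaim_scope and the function scope_for_gfp() are hypothetical names introduced only for illustration. Only the two comparisons mirror what the patch does before calling vmap_pages_range.

```c
/*
 * Userspace model of the scope selection done in the patch.
 * GFP_IO_BIT and GFP_FS_BIT are hypothetical stand-ins for __GFP_IO and
 * __GFP_FS; the real values come from the kernel's gfp headers.
 */
#define GFP_IO_BIT 0x1u
#define GFP_FS_BIT 0x2u

enum reclaim_scope {
	SCOPE_NONE,	/* no scope needed, e.g. a GFP_KERNEL-like mask */
	SCOPE_NOFS,	/* would call memalloc_nofs_save()/restore() */
	SCOPE_NOIO	/* would call memalloc_noio_save()/restore() */
};

/* Pick the scope implied by the FS/IO bits of the mask. */
static enum reclaim_scope scope_for_gfp(unsigned int gfp_mask)
{
	unsigned int bits = gfp_mask & (GFP_FS_BIT | GFP_IO_BIT);

	if (bits == GFP_IO_BIT)		/* IO allowed, FS not: NOFS request */
		return SCOPE_NOFS;
	if (bits == 0)			/* neither allowed: NOIO request */
		return SCOPE_NOIO;
	return SCOPE_NONE;		/* FS allowed: nothing to enforce */
}
```

In this model a mask with both bits set selects no scope, a mask with only the IO bit selects the NOFS scope, and a mask with neither bit selects the NOIO scope. A mask with only the FS bit set also selects no scope, which matches the two branches in the patch.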

* [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL
From: Michal Hocko @ 2021-10-25 15:02 UTC
To: linux-mm
Cc: Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig,
    Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton,
    Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Dave Chinner has mentioned that some of the xfs code would benefit from
kvmalloc support for __GFP_NOFAIL because they have allocations that
cannot fail and they do not fit into a single page.

A large part of the vmalloc implementation already complies with the
given gfp flags, so there is no work to be done for it. The area
and page table allocations are an exception to that. Implement a retry
loop for those.

Add a short sleep before retrying. 1 jiffy is a completely random
timeout. Ideally the retry would wait for an explicit event - e.g.
a change to the vmalloc space if the failure was caused by
the space fragmentation or depletion. But there are multiple different
reasons to retry and this could become much more complex. Keep the retry
simple for now and just sleep to avoid hogging CPUs.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/vmalloc.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c6cc77d2f366..602649919a9d 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
 		flags = memalloc_noio_save();
 
-	ret = vmap_pages_range(addr, addr + size, prot, area->pages,
+	do {
+		ret = vmap_pages_range(addr, addr + size, prot, area->pages,
 			page_shift);
+		if (ret < 0)
+			schedule_timeout_uninterruptible(1);
+	} while ((gfp_mask & __GFP_NOFAIL) && (ret < 0));
 
 	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
 		memalloc_nofs_restore(flags);
@@ -3032,6 +3036,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
 		warn_alloc(gfp_mask, NULL,
 			"vmalloc error: size %lu, vm_struct allocation failed",
 			real_size);
+		if (gfp_mask & __GFP_NOFAIL) {
+			schedule_timeout_uninterruptible(1);
+			goto again;
+		}
 		goto fail;
 	}
 
-- 
2.30.2
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-25 15:02 ` [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL Michal Hocko @ 2021-10-25 22:59 ` NeilBrown 2021-10-26 7:03 ` Michal Hocko 2021-10-26 15:48 ` Uladzislau Rezki 1 sibling, 1 reply; 26+ messages in thread From: NeilBrown @ 2021-10-25 22:59 UTC (permalink / raw) To: Michal Hocko Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton, Michal Hocko On Tue, 26 Oct 2021, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > Dave Chinner has mentioned that some of the xfs code would benefit from > kvmalloc support for __GFP_NOFAIL because they have allocations that > cannot fail and they do not fit into a single page. > > The larg part of the vmalloc implementation already complies with the *large* > given gfp flags so there is no work for those to be done. The area > and page table allocations are an exception to that. Implement a retry > loop for those. > > Add a short sleep before retrying. 1 jiffy is a completely random > timeout. Ideally the retry would wait for an explicit event - e.g. > a change to the vmalloc space change if the failure was caused by > the space fragmentation or depletion. But there are multiple different > reasons to retry and this could become much more complex. Keep the retry > simple for now and just sleep to prevent from hogging CPUs. 
> > Signed-off-by: Michal Hocko <mhocko@suse.com> > --- > mm/vmalloc.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index c6cc77d2f366..602649919a9d 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) > flags = memalloc_noio_save(); > > - ret = vmap_pages_range(addr, addr + size, prot, area->pages, > + do { > + ret = vmap_pages_range(addr, addr + size, prot, area->pages, > page_shift); > + if (ret < 0) > + schedule_timeout_uninterruptible(1); > + } while ((gfp_mask & __GFP_NOFAIL) && (ret < 0)); > > if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) > memalloc_nofs_restore(flags); > @@ -3032,6 +3036,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > warn_alloc(gfp_mask, NULL, > "vmalloc error: size %lu, vm_struct allocation failed", > real_size); > + if (gfp_mask & __GFP_NOFAIL) { > + schedule_timeout_uninterruptible(1); > + goto again; > + } Shouldn't the retry happen *before* the warning? NeilBrown > goto fail; > } > > -- > 2.30.2 > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-25 22:59 ` NeilBrown @ 2021-10-26 7:03 ` Michal Hocko 2021-10-26 10:30 ` NeilBrown 0 siblings, 1 reply; 26+ messages in thread From: Michal Hocko @ 2021-10-26 7:03 UTC (permalink / raw) To: NeilBrown Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue 26-10-21 09:59:36, Neil Brown wrote: > On Tue, 26 Oct 2021, Michal Hocko wrote: [...] > > @@ -3032,6 +3036,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > > warn_alloc(gfp_mask, NULL, > > "vmalloc error: size %lu, vm_struct allocation failed", > > real_size); > > + if (gfp_mask & __GFP_NOFAIL) { > > + schedule_timeout_uninterruptible(1); > > + goto again; > > + } > > Shouldn't the retry happen *before* the warning? I've done it after to catch the "depleted or fragmented" vmalloc space. This is not related to the memory available and therefore it won't be handled by the oom killer. The error message shouldn't imply the vmalloc allocation failure IMHO but I am open to suggestions. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-26 7:03 ` Michal Hocko @ 2021-10-26 10:30 ` NeilBrown 2021-10-26 11:29 ` Michal Hocko 0 siblings, 1 reply; 26+ messages in thread From: NeilBrown @ 2021-10-26 10:30 UTC (permalink / raw) To: Michal Hocko Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue, 26 Oct 2021, Michal Hocko wrote: > On Tue 26-10-21 09:59:36, Neil Brown wrote: > > On Tue, 26 Oct 2021, Michal Hocko wrote: > [...] > > > @@ -3032,6 +3036,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > > > warn_alloc(gfp_mask, NULL, > > > "vmalloc error: size %lu, vm_struct allocation failed", > > > real_size); > > > + if (gfp_mask & __GFP_NOFAIL) { > > > + schedule_timeout_uninterruptible(1); > > > + goto again; > > > + } > > > > Shouldn't the retry happen *before* the warning? > > I've done it after to catch the "depleted or fragmented" vmalloc space. > This is not related to the memory available and therefore it won't be > handled by the oom killer. The error message shouldn't imply the vmalloc > allocation failure IMHO but I am open to suggestions. The word "failed" does seem to imply what you don't want it to imply... I guess it is reasonable to have this warning, but maybe add " -- retrying" if __GFP_NOFAIL. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-26 10:30 ` NeilBrown @ 2021-10-26 11:29 ` Michal Hocko 0 siblings, 0 replies; 26+ messages in thread From: Michal Hocko @ 2021-10-26 11:29 UTC (permalink / raw) To: NeilBrown Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue 26-10-21 21:30:52, Neil Brown wrote: > On Tue, 26 Oct 2021, Michal Hocko wrote: > > On Tue 26-10-21 09:59:36, Neil Brown wrote: > > > On Tue, 26 Oct 2021, Michal Hocko wrote: > > [...] > > > > @@ -3032,6 +3036,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > > > > warn_alloc(gfp_mask, NULL, > > > > "vmalloc error: size %lu, vm_struct allocation failed", > > > > real_size); > > > > + if (gfp_mask & __GFP_NOFAIL) { > > > > + schedule_timeout_uninterruptible(1); > > > > + goto again; > > > > + } > > > > > > Shouldn't the retry happen *before* the warning? > > > > I've done it after to catch the "depleted or fragmented" vmalloc space. > > This is not related to the memory available and therefore it won't be > > handled by the oom killer. The error message shouldn't imply the vmalloc > > allocation failure IMHO but I am open to suggestions. > > The word "failed" does seem to imply what you don't want it to imply... > > I guess it is reasonable to have this warning, but maybe add " -- retrying" > if __GFP_NOFAIL. I do not have a strong opinion on that. 
I can surely do

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 602649919a9d..3489928fafa2 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3033,10 +3033,11 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
 				  VM_UNINITIALIZED | vm_flags, start, end, node,
 				  gfp_mask, caller);
 	if (!area) {
+		bool nofail = gfp_mask & __GFP_NOFAIL;
 		warn_alloc(gfp_mask, NULL,
-			"vmalloc error: size %lu, vm_struct allocation failed",
-			real_size);
-		if (gfp_mask & __GFP_NOFAIL) {
+			"vmalloc error: size %lu, vm_struct allocation failed%s",
+			real_size, (nofail) ? ". Retrying." : "");
+		if (nofail) {
 			schedule_timeout_uninterruptible(1);
 			goto again;
 		}
-- 
Michal Hocko
SUSE Labs
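
The effect of the proposed message change can be checked in isolation. The sketch below is a userspace model only: GFP_NOFAIL_BIT and format_area_failure() are hypothetical names, and snprintf stands in for warn_alloc. The format string and the ". Retrying." suffix are copied from the diff above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for __GFP_NOFAIL; not the kernel's value. */
#define GFP_NOFAIL_BIT 0x4u

/*
 * Build the warning text the way the proposed hunk does: the same format
 * string for every caller, with a suffix only when the caller asked for
 * a no-fail allocation. Returns what snprintf returns.
 */
static int format_area_failure(char *buf, size_t len,
			       unsigned int gfp_mask, unsigned long real_size)
{
	/* Any non-zero result of the mask test converts to true. */
	bool nofail = gfp_mask & GFP_NOFAIL_BIT;

	return snprintf(buf, len,
			"vmalloc error: size %lu, vm_struct allocation failed%s",
			real_size, nofail ? ". Retrying." : "");
}
```

With the no-fail bit set, the message for a 4096-byte request ends in "failed. Retrying."; without it, the message is unchanged from the original patch.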
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-25 15:02 ` [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL Michal Hocko 2021-10-25 22:59 ` NeilBrown @ 2021-10-26 15:48 ` Uladzislau Rezki 2021-10-26 16:28 ` Michal Hocko 1 sibling, 1 reply; 26+ messages in thread From: Uladzislau Rezki @ 2021-10-26 15:48 UTC (permalink / raw) To: Michal Hocko Cc: Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton, Michal Hocko > From: Michal Hocko <mhocko@suse.com> > > Dave Chinner has mentioned that some of the xfs code would benefit from > kvmalloc support for __GFP_NOFAIL because they have allocations that > cannot fail and they do not fit into a single page. > > The larg part of the vmalloc implementation already complies with the > given gfp flags so there is no work for those to be done. The area > and page table allocations are an exception to that. Implement a retry > loop for those. > > Add a short sleep before retrying. 1 jiffy is a completely random > timeout. Ideally the retry would wait for an explicit event - e.g. > a change to the vmalloc space change if the failure was caused by > the space fragmentation or depletion. But there are multiple different > reasons to retry and this could become much more complex. Keep the retry > simple for now and just sleep to prevent from hogging CPUs. 
> > Signed-off-by: Michal Hocko <mhocko@suse.com> > --- > mm/vmalloc.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index c6cc77d2f366..602649919a9d 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) > flags = memalloc_noio_save(); > > - ret = vmap_pages_range(addr, addr + size, prot, area->pages, > + do { > + ret = vmap_pages_range(addr, addr + size, prot, area->pages, > page_shift); > + if (ret < 0) > + schedule_timeout_uninterruptible(1); > + } while ((gfp_mask & __GFP_NOFAIL) && (ret < 0)); > 1. After that change a below code: <snip> if (ret < 0) { warn_alloc(orig_gfp_mask, NULL, "vmalloc error: size %lu, failed to map pages", area->nr_pages * PAGE_SIZE); goto fail; } <snip> does not make any sense anymore. 2. Can we combine two places where we handle __GFP_NOFAIL into one place? That would look like as more sorted out. -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-26 15:48 ` Uladzislau Rezki @ 2021-10-26 16:28 ` Michal Hocko 2021-10-26 19:33 ` Uladzislau Rezki 0 siblings, 1 reply; 26+ messages in thread From: Michal Hocko @ 2021-10-26 16:28 UTC (permalink / raw) To: Uladzislau Rezki Cc: Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue 26-10-21 17:48:32, Uladzislau Rezki wrote: > > From: Michal Hocko <mhocko@suse.com> > > > > Dave Chinner has mentioned that some of the xfs code would benefit from > > kvmalloc support for __GFP_NOFAIL because they have allocations that > > cannot fail and they do not fit into a single page. > > > > The larg part of the vmalloc implementation already complies with the > > given gfp flags so there is no work for those to be done. The area > > and page table allocations are an exception to that. Implement a retry > > loop for those. > > > > Add a short sleep before retrying. 1 jiffy is a completely random > > timeout. Ideally the retry would wait for an explicit event - e.g. > > a change to the vmalloc space change if the failure was caused by > > the space fragmentation or depletion. But there are multiple different > > reasons to retry and this could become much more complex. Keep the retry > > simple for now and just sleep to prevent from hogging CPUs. 
> > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > --- > > mm/vmalloc.c | 10 +++++++++- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index c6cc77d2f366..602649919a9d 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > > else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) > > flags = memalloc_noio_save(); > > > > - ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > + do { > > + ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > page_shift); > > + if (ret < 0) > > + schedule_timeout_uninterruptible(1); > > + } while ((gfp_mask & __GFP_NOFAIL) && (ret < 0)); > > > > 1. > After that change a below code: > > <snip> > if (ret < 0) { > warn_alloc(orig_gfp_mask, NULL, > "vmalloc error: size %lu, failed to map pages", > area->nr_pages * PAGE_SIZE); > goto fail; > } > <snip> > > does not make any sense anymore. Why? Allocations without __GFP_NOFAIL can still fail, no? > 2. > Can we combine two places where we handle __GFP_NOFAIL into one place? > That would look like as more sorted out. I have to admit I am not really fluent at vmalloc code so I wanted to make the code as simple as possible. How would I unwind all the allocated memory (already allocated as GFP_NOFAIL) before retrying at __vmalloc_node_range (if that is what you suggest). And isn't that a bit wasteful? Or did you have anything else in mind? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-26 16:28 ` Michal Hocko @ 2021-10-26 19:33 ` Uladzislau Rezki 2021-10-27 6:46 ` Michal Hocko 2021-10-27 17:55 ` Uladzislau Rezki 0 siblings, 2 replies; 26+ messages in thread From: Uladzislau Rezki @ 2021-10-26 19:33 UTC (permalink / raw) To: Michal Hocko Cc: Uladzislau Rezki, Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue, Oct 26, 2021 at 06:28:52PM +0200, Michal Hocko wrote: > On Tue 26-10-21 17:48:32, Uladzislau Rezki wrote: > > > From: Michal Hocko <mhocko@suse.com> > > > > > > Dave Chinner has mentioned that some of the xfs code would benefit from > > > kvmalloc support for __GFP_NOFAIL because they have allocations that > > > cannot fail and they do not fit into a single page. > > > > > > The larg part of the vmalloc implementation already complies with the > > > given gfp flags so there is no work for those to be done. The area > > > and page table allocations are an exception to that. Implement a retry > > > loop for those. > > > > > > Add a short sleep before retrying. 1 jiffy is a completely random > > > timeout. Ideally the retry would wait for an explicit event - e.g. > > > a change to the vmalloc space change if the failure was caused by > > > the space fragmentation or depletion. But there are multiple different > > > reasons to retry and this could become much more complex. Keep the retry > > > simple for now and just sleep to prevent from hogging CPUs. 
> > > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > > --- > > > mm/vmalloc.c | 10 +++++++++- > > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > index c6cc77d2f366..602649919a9d 100644 > > > --- a/mm/vmalloc.c > > > +++ b/mm/vmalloc.c > > > @@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > > > else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) > > > flags = memalloc_noio_save(); > > > > > > - ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > + do { > > > + ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > page_shift); > > > + if (ret < 0) > > > + schedule_timeout_uninterruptible(1); > > > + } while ((gfp_mask & __GFP_NOFAIL) && (ret < 0)); > > > > > > > 1. > > After that change a below code: > > > > <snip> > > if (ret < 0) { > > warn_alloc(orig_gfp_mask, NULL, > > "vmalloc error: size %lu, failed to map pages", > > area->nr_pages * PAGE_SIZE); > > goto fail; > > } > > <snip> > > > > does not make any sense anymore. > > Why? Allocations without __GFP_NOFAIL can still fail, no? > Right. I meant one thing but wrote slightly differently. In case of vmap_pages_range() fails(if __GFP_NOFAIL is set) should we emit any warning message? Because either we can recover on a future iteration or it stuck there infinitely so a user does not understand what happened. From the other hand this is how __GFP_NOFAIL works, hm.. Another thing, i see that schedule_timeout_uninterruptible(1) is invoked for all cases even when __GFP_NOFAIL is not set, in that scenario we do not want to wait, instead we should return back to a caller asap. Or am i missing something here? > > 2. > > Can we combine two places where we handle __GFP_NOFAIL into one place? > > That would look like as more sorted out. > > I have to admit I am not really fluent at vmalloc code so I wanted to > make the code as simple as possible. 
How would I unwind all the allocated > memory (already allocated as GFP_NOFAIL) before retrying at > __vmalloc_node_range (if that is what you suggest). And isn't that a > bit wasteful? > > Or did you have anything else in mind? > It depends on how often all this can fail. But let me double check if such combining is easy. -- Vlad Rezki ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-26 19:33 ` Uladzislau Rezki @ 2021-10-27 6:46 ` Michal Hocko 2021-10-27 17:55 ` Uladzislau Rezki 1 sibling, 0 replies; 26+ messages in thread From: Michal Hocko @ 2021-10-27 6:46 UTC (permalink / raw) To: Uladzislau Rezki Cc: Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue 26-10-21 21:33:15, Uladzislau Rezki wrote: > On Tue, Oct 26, 2021 at 06:28:52PM +0200, Michal Hocko wrote: > > On Tue 26-10-21 17:48:32, Uladzislau Rezki wrote: > > > > From: Michal Hocko <mhocko@suse.com> > > > > > > > > Dave Chinner has mentioned that some of the xfs code would benefit from > > > > kvmalloc support for __GFP_NOFAIL because they have allocations that > > > > cannot fail and they do not fit into a single page. > > > > > > > > The larg part of the vmalloc implementation already complies with the > > > > given gfp flags so there is no work for those to be done. The area > > > > and page table allocations are an exception to that. Implement a retry > > > > loop for those. > > > > > > > > Add a short sleep before retrying. 1 jiffy is a completely random > > > > timeout. Ideally the retry would wait for an explicit event - e.g. > > > > a change to the vmalloc space change if the failure was caused by > > > > the space fragmentation or depletion. But there are multiple different > > > > reasons to retry and this could become much more complex. Keep the retry > > > > simple for now and just sleep to prevent from hogging CPUs. 
> > > > > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > > > --- > > > > mm/vmalloc.c | 10 +++++++++- > > > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > > index c6cc77d2f366..602649919a9d 100644 > > > > --- a/mm/vmalloc.c > > > > +++ b/mm/vmalloc.c > > > > @@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > > > > else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) > > > > flags = memalloc_noio_save(); > > > > > > > > - ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > > + do { > > > > + ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > > page_shift); > > > > + if (ret < 0) > > > > + schedule_timeout_uninterruptible(1); > > > > + } while ((gfp_mask & __GFP_NOFAIL) && (ret < 0)); > > > > > > > > > > 1. > > > After that change a below code: > > > > > > <snip> > > > if (ret < 0) { > > > warn_alloc(orig_gfp_mask, NULL, > > > "vmalloc error: size %lu, failed to map pages", > > > area->nr_pages * PAGE_SIZE); > > > goto fail; > > > } > > > <snip> > > > > > > does not make any sense anymore. > > > > Why? Allocations without __GFP_NOFAIL can still fail, no? > > > Right. I meant one thing but wrote slightly differently. In case of > vmap_pages_range() fails(if __GFP_NOFAIL is set) should we emit any > warning message? Because either we can recover on a future iteration > or it stuck there infinitely so a user does not understand what happened. > From the other hand this is how __GFP_NOFAIL works, hm.. Yes, the page allocator doesn't warn either and I would like to keep this in sync. > Another thing, i see that schedule_timeout_uninterruptible(1) is invoked > for all cases even when __GFP_NOFAIL is not set, in that scenario we do > not want to wait, instead we should return back to a caller asap. Or am > i missing something here? OK, I will change that. 
-- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 26+ messages in thread
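
One possible shape of the change agreed on above (sleep only when the caller asked for __GFP_NOFAIL) is sketched below as a userspace model. This is an assumption about how the fix could look, not the follow-up patch itself, which is not part of this portion of the thread. GFP_NOFAIL_BIT, struct fake_mapper, fake_map(), fake_sleep() and map_with_retry() are all hypothetical; fake_map() stands in for vmap_pages_range and fake_sleep() for schedule_timeout_uninterruptible(1).

```c
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_NOFAIL; not the kernel's value. */
#define GFP_NOFAIL_BIT 0x4u

/* Test double: fails a fixed number of times, then succeeds. */
struct fake_mapper {
	int failures_left;	/* how many more calls will fail */
	int attempts;		/* calls to fake_map() so far */
	int sleeps;		/* calls to fake_sleep() so far */
};

/* Stand-in for vmap_pages_range(): 0 on success, negative on failure. */
static int fake_map(struct fake_mapper *m)
{
	m->attempts++;
	if (m->failures_left > 0) {
		m->failures_left--;
		return -1;
	}
	return 0;
}

/* Stand-in for schedule_timeout_uninterruptible(1): just counts calls. */
static void fake_sleep(struct fake_mapper *m)
{
	m->sleeps++;
}

/*
 * Retry only for no-fail requests, and sleep only when a retry will
 * actually follow, so ordinary callers see the failure immediately.
 */
static int map_with_retry(struct fake_mapper *m, unsigned int gfp_mask)
{
	bool nofail = gfp_mask & GFP_NOFAIL_BIT;
	int ret;

	do {
		ret = fake_map(m);
		if (ret < 0 && nofail)
			fake_sleep(m);
	} while (nofail && ret < 0);

	return ret;
}
```

In this model a no-fail request against a mapper that fails three times makes four attempts and sleeps three times before succeeding, while a request without the bit makes one attempt, never sleeps, and returns the error to its caller.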
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-26 19:33 ` Uladzislau Rezki 2021-10-27 6:46 ` Michal Hocko @ 2021-10-27 17:55 ` Uladzislau Rezki 2021-10-29 7:57 ` Michal Hocko 1 sibling, 1 reply; 26+ messages in thread From: Uladzislau Rezki @ 2021-10-27 17:55 UTC (permalink / raw) To: Michal Hocko Cc: Michal Hocko, Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue, Oct 26, 2021 at 09:33:15PM +0200, Uladzislau Rezki wrote: > On Tue, Oct 26, 2021 at 06:28:52PM +0200, Michal Hocko wrote: > > On Tue 26-10-21 17:48:32, Uladzislau Rezki wrote: > > > > From: Michal Hocko <mhocko@suse.com> > > > > > > > > Dave Chinner has mentioned that some of the xfs code would benefit from > > > > kvmalloc support for __GFP_NOFAIL because they have allocations that > > > > cannot fail and they do not fit into a single page. > > > > > > > > The larg part of the vmalloc implementation already complies with the > > > > given gfp flags so there is no work for those to be done. The area > > > > and page table allocations are an exception to that. Implement a retry > > > > loop for those. > > > > > > > > Add a short sleep before retrying. 1 jiffy is a completely random > > > > timeout. Ideally the retry would wait for an explicit event - e.g. > > > > a change to the vmalloc space change if the failure was caused by > > > > the space fragmentation or depletion. But there are multiple different > > > > reasons to retry and this could become much more complex. Keep the retry > > > > simple for now and just sleep to prevent from hogging CPUs. 
> > > > > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > > > --- > > > > mm/vmalloc.c | 10 +++++++++- > > > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > > index c6cc77d2f366..602649919a9d 100644 > > > > --- a/mm/vmalloc.c > > > > +++ b/mm/vmalloc.c > > > > @@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > > > > else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) > > > > flags = memalloc_noio_save(); > > > > > > > > - ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > > + do { > > > > + ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > > page_shift); > > > > + if (ret < 0) > > > > + schedule_timeout_uninterruptible(1); > > > > + } while ((gfp_mask & __GFP_NOFAIL) && (ret < 0)); > > > > > > > > > > 1. > > > After that change a below code: > > > > > > <snip> > > > if (ret < 0) { > > > warn_alloc(orig_gfp_mask, NULL, > > > "vmalloc error: size %lu, failed to map pages", > > > area->nr_pages * PAGE_SIZE); > > > goto fail; > > > } > > > <snip> > > > > > > does not make any sense anymore. > > > > Why? Allocations without __GFP_NOFAIL can still fail, no? > > > Right. I meant one thing but wrote slightly differently. In case of > vmap_pages_range() fails(if __GFP_NOFAIL is set) should we emit any > warning message? Because either we can recover on a future iteration > or it stuck there infinitely so a user does not understand what happened. > From the other hand this is how __GFP_NOFAIL works, hm.. > > Another thing, i see that schedule_timeout_uninterruptible(1) is invoked > for all cases even when __GFP_NOFAIL is not set, in that scenario we do > not want to wait, instead we should return back to a caller asap. Or am > i missing something here? > > > > 2. > > > Can we combine two places where we handle __GFP_NOFAIL into one place? > > > That would look like as more sorted out. 
> > > > I have to admit I am not really fluent at vmalloc code so I wanted to > > make the code as simple as possible. How would I unwind all the allocated > > memory (already allocated as GFP_NOFAIL) before retrying at > > __vmalloc_node_range (if that is what you suggest). And isn't that a > > bit wasteful? > > > > Or did you have anything else in mind? > > > It depends on how often all this can fail. But let me double check if > such combining is easy. > I mean something like below. The idea is to not spread the __GFP_NOFAIL across the vmalloc file keeping it in one solid place: <snip> diff --git a/mm/vmalloc.c b/mm/vmalloc.c index d77830ff604c..f4b7927e217e 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2889,8 +2889,14 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, unsigned long array_size; unsigned int nr_small_pages = size >> PAGE_SHIFT; unsigned int page_order; + unsigned long flags; + int ret; array_size = (unsigned long)nr_small_pages * sizeof(struct page *); + + /* + * This is i do not understand why we do not want to see warning messages. 
+ */ gfp_mask |= __GFP_NOWARN; if (!(gfp_mask & (GFP_DMA | GFP_DMA32))) gfp_mask |= __GFP_HIGHMEM; @@ -2930,8 +2936,23 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, goto fail; } - if (vmap_pages_range(addr, addr + size, prot, area->pages, - page_shift) < 0) { + /* + * page tables allocations ignore external gfp mask, enforce it + * by the scope API + */ + if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) + flags = memalloc_nofs_save(); + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) + flags = memalloc_noio_save(); + + ret = vmap_pages_range(addr, addr + size, prot, area->pages, page_shift); + + if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO) + memalloc_nofs_restore(flags); + else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) + memalloc_noio_restore(flags); + + if (ret < 0) { warn_alloc(gfp_mask, NULL, "vmalloc error: size %lu, failed to map pages", area->nr_pages * PAGE_SIZE); @@ -2984,6 +3005,12 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, return NULL; } + /* + * Suppress all warnings for __GFP_NOFAIL allocation. 
+ */ + if (gfp_mask & __GFP_NOFAIL) + gfp_mask |= __GFP_NOWARN; + if (vmap_allow_huge && !(vm_flags & VM_NO_HUGE_VMAP)) { unsigned long size_per_node; @@ -3010,16 +3037,22 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, area = __get_vm_area_node(real_size, align, shift, VM_ALLOC | VM_UNINITIALIZED | vm_flags, start, end, node, gfp_mask, caller); - if (!area) { - warn_alloc(gfp_mask, NULL, - "vmalloc error: size %lu, vm_struct allocation failed", - real_size); - goto fail; - } + if (area) + addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node); + + if (!area || !addr) { + if (gfp_mask & __GFP_NOFAIL) { + schedule_timeout_uninterruptible(1); + goto again; + } + + if (!area) + warn_alloc(gfp_mask, NULL, + "vmalloc error: size %lu, vm_struct allocation failed", + real_size); - addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node); - if (!addr) goto fail; + } /* * In this function, newly allocated vm_struct has VM_UNINITIALIZED <snip> -- Vlad Rezki ^ permalink raw reply related [flat|nested] 26+ messages in thread
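The concern raised above about the posted retry loop (schedule_timeout_uninterruptible(1) also runs when __GFP_NOFAIL is not set) can be illustrated with a small userspace model. This is only a sketch, not kernel code: MODEL_GFP_NOFAIL, model_map_pages() and model_sleep_one_jiffy() are hypothetical stand-ins for __GFP_NOFAIL, vmap_pages_range() and the one-jiffy sleep, and the bit value is arbitrary.

```c
/* Illustrative bit value for this model; not the kernel's definition. */
#define MODEL_GFP_NOFAIL 0x1u

/* Counts how often the model "slept" instead of really sleeping. */
static int model_sleeps;

/* Hypothetical stand-in for vmap_pages_range(): fails while *fail_budget > 0. */
static int model_map_pages(int *fail_budget)
{
	if (*fail_budget > 0) {
		(*fail_budget)--;
		return -1;	/* models a mapping failure such as -ENOMEM */
	}
	return 0;
}

/* Hypothetical stand-in for schedule_timeout_uninterruptible(1). */
static void model_sleep_one_jiffy(void)
{
	model_sleeps++;
}

/*
 * Retry only for NOFAIL requests, and sleep only when a retry will
 * actually follow. A request without NOFAIL gets its error back
 * immediately, without the extra delay.
 */
static int map_with_retry(unsigned int gfp_mask, int *fail_budget)
{
	int ret;

	for (;;) {
		ret = model_map_pages(fail_budget);
		if (ret == 0 || !(gfp_mask & MODEL_GFP_NOFAIL))
			return ret;
		model_sleep_one_jiffy();
	}
}
```

In this model a failing call without MODEL_GFP_NOFAIL returns the error after a single attempt and never sleeps, while a NOFAIL call sleeps once per failed attempt and returns only on success.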
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-27 17:55 ` Uladzislau Rezki @ 2021-10-29 7:57 ` Michal Hocko 2021-10-29 14:05 ` Uladzislau Rezki 0 siblings, 1 reply; 26+ messages in thread From: Michal Hocko @ 2021-10-29 7:57 UTC (permalink / raw) To: Uladzislau Rezki Cc: Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Wed 27-10-21 19:55:50, Uladzislau Rezki wrote: > On Tue, Oct 26, 2021 at 09:33:15PM +0200, Uladzislau Rezki wrote: > > On Tue, Oct 26, 2021 at 06:28:52PM +0200, Michal Hocko wrote: > > > On Tue 26-10-21 17:48:32, Uladzislau Rezki wrote: > > > > > From: Michal Hocko <mhocko@suse.com> > > > > > > > > > > Dave Chinner has mentioned that some of the xfs code would benefit from > > > > > kvmalloc support for __GFP_NOFAIL because they have allocations that > > > > > cannot fail and they do not fit into a single page. > > > > > > > > > > The larg part of the vmalloc implementation already complies with the > > > > > given gfp flags so there is no work for those to be done. The area > > > > > and page table allocations are an exception to that. Implement a retry > > > > > loop for those. > > > > > > > > > > Add a short sleep before retrying. 1 jiffy is a completely random > > > > > timeout. Ideally the retry would wait for an explicit event - e.g. > > > > > a change to the vmalloc space change if the failure was caused by > > > > > the space fragmentation or depletion. But there are multiple different > > > > > reasons to retry and this could become much more complex. Keep the retry > > > > > simple for now and just sleep to prevent from hogging CPUs. 
> > > > > > > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > > > > --- > > > > > mm/vmalloc.c | 10 +++++++++- > > > > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > > > index c6cc77d2f366..602649919a9d 100644 > > > > > --- a/mm/vmalloc.c > > > > > +++ b/mm/vmalloc.c > > > > > @@ -2941,8 +2941,12 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > > > > > else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0) > > > > > flags = memalloc_noio_save(); > > > > > > > > > > - ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > > > + do { > > > > > + ret = vmap_pages_range(addr, addr + size, prot, area->pages, > > > > > page_shift); > > > > > + if (ret < 0) > > > > > + schedule_timeout_uninterruptible(1); > > > > > + } while ((gfp_mask & __GFP_NOFAIL) && (ret < 0)); > > > > > > > > > > > > > 1. > > > > After that change a below code: > > > > > > > > <snip> > > > > if (ret < 0) { > > > > warn_alloc(orig_gfp_mask, NULL, > > > > "vmalloc error: size %lu, failed to map pages", > > > > area->nr_pages * PAGE_SIZE); > > > > goto fail; > > > > } > > > > <snip> > > > > > > > > does not make any sense anymore. > > > > > > Why? Allocations without __GFP_NOFAIL can still fail, no? > > > > > Right. I meant one thing but wrote slightly differently. In case of > > vmap_pages_range() fails(if __GFP_NOFAIL is set) should we emit any > > warning message? Because either we can recover on a future iteration > > or it stuck there infinitely so a user does not understand what happened. > > From the other hand this is how __GFP_NOFAIL works, hm.. > > > > Another thing, i see that schedule_timeout_uninterruptible(1) is invoked > > for all cases even when __GFP_NOFAIL is not set, in that scenario we do > > not want to wait, instead we should return back to a caller asap. Or am > > i missing something here? > > > > > > 2. 
> > > > Can we combine two places where we handle __GFP_NOFAIL into one place? > > > > That would look like as more sorted out. > > > > > > I have to admit I am not really fluent at vmalloc code so I wanted to > > > make the code as simple as possible. How would I unwind all the allocated > > > memory (already allocated as GFP_NOFAIL) before retrying at > > > __vmalloc_node_range (if that is what you suggest). And isn't that a > > > bit wasteful? > > > > > > Or did you have anything else in mind? > > > > > It depends on how often all this can fail. But let me double check if > > such combining is easy. > > > I mean something like below. The idea is to not spread the __GFP_NOFAIL > across the vmalloc file keeping it in one solid place: > > <snip> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index d77830ff604c..f4b7927e217e 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2889,8 +2889,14 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > unsigned long array_size; > unsigned int nr_small_pages = size >> PAGE_SHIFT; > unsigned int page_order; > + unsigned long flags; > + int ret; > > array_size = (unsigned long)nr_small_pages * sizeof(struct page *); > + > + /* > + * This is i do not understand why we do not want to see warning messages. > + */ > gfp_mask |= __GFP_NOWARN; I suspect this is becauser vmalloc wants to have its own failure reporting. [...] 
> @@ -3010,16 +3037,22 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > area = __get_vm_area_node(real_size, align, shift, VM_ALLOC | > VM_UNINITIALIZED | vm_flags, start, end, node, > gfp_mask, caller); > - if (!area) { > - warn_alloc(gfp_mask, NULL, > - "vmalloc error: size %lu, vm_struct allocation failed", > - real_size); > - goto fail; > - } > + if (area) > + addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node); > + > + if (!area || !addr) { > + if (gfp_mask & __GFP_NOFAIL) { > + schedule_timeout_uninterruptible(1); > + goto again; > + } > + > + if (!area) > + warn_alloc(gfp_mask, NULL, > + "vmalloc error: size %lu, vm_struct allocation failed", > + real_size); > > - addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node); > - if (!addr) > goto fail; > + } > > /* > * In this function, newly allocated vm_struct has VM_UNINITIALIZED > <snip> OK, this looks easier from the code reading but isn't it quite wasteful to throw all the pages backing the area (all of them allocated as __GFP_NOFAIL) just to then fail to allocate few page tables pages and drop all of that on the floor (this will happen in __vunmap AFAICS). I mean I do not care all that strongly but it seems to me that more changes would need to be done here and optimizations can be done on top. Is this something you feel strongly about? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL
  2021-10-29  7:57 ` Michal Hocko
@ 2021-10-29 14:05 ` Uladzislau Rezki
  2021-10-29 14:45 ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Uladzislau Rezki @ 2021-10-29 14:05 UTC
  To: Michal Hocko
  Cc: Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton

> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index d77830ff604c..f4b7927e217e 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2889,8 +2889,14 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> >  	unsigned long array_size;
> >  	unsigned int nr_small_pages = size >> PAGE_SHIFT;
> >  	unsigned int page_order;
> > +	unsigned long flags;
> > +	int ret;
> >
> >  	array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
> > +
> > +	/*
> > +	 * This is i do not understand why we do not want to see warning messages.
> > +	 */
> >  	gfp_mask |= __GFP_NOWARN;
>
> I suspect this is because vmalloc wants to have its own failure
> reporting.
>
But as I see it, this is broken. All three warn_alloc() reports in
__vmalloc_area_node() are useless because __GFP_NOWARN is added on top
of gfp_mask:

void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
{
	struct va_format vaf;
	va_list args;
	static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);

	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
		return;
...

so every report made with __GFP_NOWARN set is simply dropped.

> [...]
> > @@ -3010,16 +3037,22 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > > area = __get_vm_area_node(real_size, align, shift, VM_ALLOC | > > VM_UNINITIALIZED | vm_flags, start, end, node, > > gfp_mask, caller); > > - if (!area) { > > - warn_alloc(gfp_mask, NULL, > > - "vmalloc error: size %lu, vm_struct allocation failed", > > - real_size); > > - goto fail; > > - } > > + if (area) > > + addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node); > > + > > + if (!area || !addr) { > > + if (gfp_mask & __GFP_NOFAIL) { > > + schedule_timeout_uninterruptible(1); > > + goto again; > > + } > > + > > + if (!area) > > + warn_alloc(gfp_mask, NULL, > > + "vmalloc error: size %lu, vm_struct allocation failed", > > + real_size); > > > > - addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node); > > - if (!addr) > > goto fail; > > + } > > > > /* > > * In this function, newly allocated vm_struct has VM_UNINITIALIZED > > <snip> > > OK, this looks easier from the code reading but isn't it quite wasteful > to throw all the pages backing the area (all of them allocated as > __GFP_NOFAIL) just to then fail to allocate few page tables pages and > drop all of that on the floor (this will happen in __vunmap AFAICS). > > I mean I do not care all that strongly but it seems to me that more > changes would need to be done here and optimizations can be done on top. > > Is this something you feel strongly about? > Will try to provide some motivations :) It depends on how to look at it. My view is as follows a more simple code is preferred. It is not considered as a hot path and it is rather a corner case to me. I think "unwinding" has some advantage. At least one motivation is to release a memory(on failure) before a delay that will prevent holding of extra memory in case of __GFP_NOFAIL infinitelly does not succeed, i.e. if a process stuck due to __GFP_NOFAIL it does not "hold" an extra memory forever. 
--
Uladzislau Rezki
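The warn_alloc() observation above can be reproduced in a small userspace model. This is a sketch under stated assumptions: model_warn_alloc() copies only the __GFP_NOWARN early return quoted above (rate limiting is left out), MODEL_GFP_NOWARN has an arbitrary bit value, and the two report_* helpers are hypothetical. The second variant follows the orig_gfp_mask naming that appears in the snippet quoted earlier in the thread.

```c
/* Illustrative bit value for this model; not the kernel's definition. */
#define MODEL_GFP_NOWARN 0x1u

/* Number of warnings that would have been printed. */
static int model_warnings;

/* Models only the early-return check of warn_alloc(); no rate limiting. */
static void model_warn_alloc(unsigned int gfp_mask)
{
	if (gfp_mask & MODEL_GFP_NOWARN)
		return;
	model_warnings++;
}

/*
 * Variant 1: the mask that was modified for the page allocator is
 * reused for reporting, so the report is always dropped.
 */
static void report_with_modified_mask(unsigned int gfp_mask)
{
	gfp_mask |= MODEL_GFP_NOWARN;	/* meant to quiet the page allocator only */
	model_warn_alloc(gfp_mask);
}

/*
 * Variant 2: remember the caller's mask and use it for reporting, so
 * only callers that passed NOWARN themselves are silenced.
 */
static void report_with_original_mask(unsigned int gfp_mask)
{
	const unsigned int orig_gfp_mask = gfp_mask;

	gfp_mask |= MODEL_GFP_NOWARN;
	(void)gfp_mask;			/* would be handed to the page allocator */
	model_warn_alloc(orig_gfp_mask);
}
```

With variant 1 no failure is ever reported. With variant 2 a failure is reported unless the caller asked for NOWARN, which matches the "own failure reporting" intent suggested earlier in the thread.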
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-29 14:05 ` Uladzislau Rezki @ 2021-10-29 14:45 ` Michal Hocko 2021-10-29 17:23 ` Uladzislau Rezki 0 siblings, 1 reply; 26+ messages in thread From: Michal Hocko @ 2021-10-29 14:45 UTC (permalink / raw) To: Uladzislau Rezki Cc: Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Fri 29-10-21 16:05:32, Uladzislau Rezki wrote: [...] > > OK, this looks easier from the code reading but isn't it quite wasteful > > to throw all the pages backing the area (all of them allocated as > > __GFP_NOFAIL) just to then fail to allocate few page tables pages and > > drop all of that on the floor (this will happen in __vunmap AFAICS). > > > > I mean I do not care all that strongly but it seems to me that more > > changes would need to be done here and optimizations can be done on top. > > > > Is this something you feel strongly about? > > > Will try to provide some motivations :) > > It depends on how to look at it. My view is as follows a more simple code > is preferred. It is not considered as a hot path and it is rather a corner > case to me. Yes, we are definitely talking about corner cases here. Even GFP_KERNEL allocations usually do not fail. > I think "unwinding" has some advantage. At least one motivation > is to release a memory(on failure) before a delay that will prevent holding > of extra memory in case of __GFP_NOFAIL infinitelly does not succeed, i.e. > if a process stuck due to __GFP_NOFAIL it does not "hold" an extra memory > forever. Well, I suspect this is something that we can disagree on and both of us would be kinda right. I would see it as throwing baby out with the bathwater. The vast majority of the memory will be in the area pages and sacrificing that just to allocate few page tables or whatever that might fail in that code path is just a lot of cycles wasted. 
So unless you really feel strongly about this then I would stick with
this approach.
--
Michal Hocko
SUSE Labs
* Re: [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL 2021-10-29 14:45 ` Michal Hocko @ 2021-10-29 17:23 ` Uladzislau Rezki 0 siblings, 0 replies; 26+ messages in thread From: Uladzislau Rezki @ 2021-10-29 17:23 UTC (permalink / raw) To: Michal Hocko Cc: Uladzislau Rezki, Linux Memory Management List, Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton > On Fri 29-10-21 16:05:32, Uladzislau Rezki wrote: > [...] > > > OK, this looks easier from the code reading but isn't it quite wasteful > > > to throw all the pages backing the area (all of them allocated as > > > __GFP_NOFAIL) just to then fail to allocate few page tables pages and > > > drop all of that on the floor (this will happen in __vunmap AFAICS). > > > > > > I mean I do not care all that strongly but it seems to me that more > > > changes would need to be done here and optimizations can be done on top. > > > > > > Is this something you feel strongly about? > > > > > Will try to provide some motivations :) > > > > It depends on how to look at it. My view is as follows a more simple code > > is preferred. It is not considered as a hot path and it is rather a corner > > case to me. > > Yes, we are definitely talking about corner cases here. Even GFP_KERNEL > allocations usually do not fail. > > > I think "unwinding" has some advantage. At least one motivation > > is to release a memory(on failure) before a delay that will prevent holding > > of extra memory in case of __GFP_NOFAIL infinitelly does not succeed, i.e. > > if a process stuck due to __GFP_NOFAIL it does not "hold" an extra memory > > forever. > > Well, I suspect this is something that we can disagree on and both of us > would be kinda right. I would see it as throwing baby out with the > bathwater. 
The vast majority of the memory will be in the area pages and > sacrificing that just to allocate few page tables or whatever that might > fail in that code path is just a lot of cycles wasted. > We are not talking about performance, no sense to measure cycles here :) > > So unless you really feel strongly about this then I would stick with > this approach. > I have raised one concern. The memory resource is shared between all process in case of __GFP_NOFAIL it might be that we never return back to user in that scenario i prefer to release hold memory for other needs instead of keeping it for nothing. If you think it is not a problem, then i do not have much to say. -- Vlad Rezki ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags. 2021-10-25 15:02 [PATCH 0/4] extend vmalloc support for constrained allocations Michal Hocko 2021-10-25 15:02 ` [PATCH 1/4] mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc Michal Hocko 2021-10-25 15:02 ` [PATCH 2/4] mm/vmalloc: add support for __GFP_NOFAIL Michal Hocko @ 2021-10-25 15:02 ` Michal Hocko 2021-10-25 23:26 ` NeilBrown 2021-10-25 15:02 ` [PATCH 4/4] mm: allow !GFP_KERNEL allocations for kvmalloc Michal Hocko 3 siblings, 1 reply; 26+ messages in thread From: Michal Hocko @ 2021-10-25 15:02 UTC (permalink / raw) To: linux-mm Cc: Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton, Michal Hocko From: Michal Hocko <mhocko@suse.com> The core of the vmalloc allocator __vmalloc_area_node doesn't say anything about gfp mask argument. Not all gfp flags are supported though. Be more explicit about constrains. Signed-off-by: Michal Hocko <mhocko@suse.com> --- mm/vmalloc.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 602649919a9d..2199d821c981 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, * @caller: caller's return address * * Allocate enough pages to cover @size from the page level - * allocator with @gfp_mask flags. Map them into contiguous - * kernel virtual space, using a pagetable protection of @prot. + * allocator with @gfp_mask flags. Please note that the full set of gfp + * flags are not supported. GFP_KERNEL would be a preferred allocation mode + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported). 
+ * __GFP_NOWARN can be used to suppress error messages about failures.
+ *
+ * Map them into contiguous kernel virtual space, using a pagetable
+ * protection of @prot.
  *
  * Return: the address of the area or %NULL on failure
  */
--
2.30.2
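The constraints listed in the kernel-doc text of this patch can be written down as a predicate. The following is a userspace sketch only: model_vmalloc_gfp_supported() is a hypothetical helper (the patch adds documentation, not a check), the M_* bit values are arbitrary, and only the way the composite masks are built from individual bits is meant to mirror the kernel.

```c
#include <stdbool.h>

/* Illustrative bit values for this model only; they are not the kernel's. */
#define M_DMA            (1u << 0)
#define M_HIGHMEM        (1u << 1)
#define M_DMA32          (1u << 2)
#define M_MOVABLE        (1u << 3)
#define M_IO             (1u << 4)
#define M_FS             (1u << 5)
#define M_DIRECT_RECLAIM (1u << 6)
#define M_KSWAPD_RECLAIM (1u << 7)
#define M_RETRY_MAYFAIL  (1u << 8)
#define M_NOFAIL         (1u << 9)
#define M_NORETRY        (1u << 10)

#define M_ZONEMASK   (M_DMA | M_HIGHMEM | M_DMA32 | M_MOVABLE)
#define M_RECLAIM    (M_DIRECT_RECLAIM | M_KSWAPD_RECLAIM)
#define M_GFP_KERNEL (M_RECLAIM | M_IO | M_FS)
#define M_GFP_NOFS   (M_RECLAIM | M_IO)
#define M_GFP_NOIO   (M_RECLAIM)
#define M_GFP_NOWAIT (M_KSWAPD_RECLAIM)

/*
 * Hypothetical predicate for the documented rules:
 *  - zone modifiers are not supported
 *  - direct reclaim is required (so GFP_NOWAIT is rejected)
 *  - of the retry modifiers only NOFAIL is accepted
 */
static bool model_vmalloc_gfp_supported(unsigned int gfp_mask)
{
	if (gfp_mask & M_ZONEMASK)
		return false;
	if (!(gfp_mask & M_DIRECT_RECLAIM))
		return false;
	if (gfp_mask & (M_NORETRY | M_RETRY_MAYFAIL))
		return false;
	return true;
}
```

Under these assumptions GFP_KERNEL, GFP_NOFS and GFP_NOIO pass, with or without NOFAIL, while GFP_NOWAIT and any mask carrying NORETRY, RETRY_MAYFAIL or a zone modifier are rejected.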
* Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags. 2021-10-25 15:02 ` [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags Michal Hocko @ 2021-10-25 23:26 ` NeilBrown 2021-10-26 7:10 ` Michal Hocko 0 siblings, 1 reply; 26+ messages in thread From: NeilBrown @ 2021-10-25 23:26 UTC (permalink / raw) To: Michal Hocko Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton, Michal Hocko On Tue, 26 Oct 2021, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > The core of the vmalloc allocator __vmalloc_area_node doesn't say > anything about gfp mask argument. Not all gfp flags are supported > though. Be more explicit about constrains. > > Signed-off-by: Michal Hocko <mhocko@suse.com> > --- > mm/vmalloc.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 602649919a9d..2199d821c981 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > * @caller: caller's return address > * > * Allocate enough pages to cover @size from the page level > - * allocator with @gfp_mask flags. Map them into contiguous > - * kernel virtual space, using a pagetable protection of @prot. > + * allocator with @gfp_mask flags. Please note that the full set of gfp > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not In what sense is GFP_KERNEL "preferred"?? The choice of GFP_NOFS, when necessary, isn't based on preference but on need. I understand that you would prefer no one ever used GFP_NOFs ever - just use the scope API. I even agree. But this is not the place to make that case. > + * supported. 
From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka I don't think "aka" is the right thing to use here. It is short for "also known as" and there is nothing that is being known as something else. It would be appropriate to say (i.e. GFP_NOWAIT is not supported). "i.e." is short for the Latin "id est" which means "that is" and normally introduces an alternate description (whereas aka introduces an alternate name). > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported). Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported. > + * __GFP_NOWARN can be used to suppress error messages about failures. Surely "NOWARN" suppresses warning messages, not error messages .... Thanks, NeilBrown > + * > + * Map them into contiguous kernel virtual space, using a pagetable > + * protection of @prot. > * > * Return: the address of the area or %NULL on failure > */ > -- > 2.30.2 > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags. 2021-10-25 23:26 ` NeilBrown @ 2021-10-26 7:10 ` Michal Hocko 2021-10-26 10:43 ` NeilBrown 0 siblings, 1 reply; 26+ messages in thread From: Michal Hocko @ 2021-10-26 7:10 UTC (permalink / raw) To: NeilBrown Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue 26-10-21 10:26:06, Neil Brown wrote: > On Tue, 26 Oct 2021, Michal Hocko wrote: > > From: Michal Hocko <mhocko@suse.com> > > > > The core of the vmalloc allocator __vmalloc_area_node doesn't say > > anything about gfp mask argument. Not all gfp flags are supported > > though. Be more explicit about constrains. > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > --- > > mm/vmalloc.c | 12 ++++++++++-- > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index 602649919a9d..2199d821c981 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > > * @caller: caller's return address > > * > > * Allocate enough pages to cover @size from the page level > > - * allocator with @gfp_mask flags. Map them into contiguous > > - * kernel virtual space, using a pagetable protection of @prot. > > + * allocator with @gfp_mask flags. Please note that the full set of gfp > > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode > > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not > > In what sense is GFP_KERNEL "preferred"?? > The choice of GFP_NOFS, when necessary, isn't based on preference but > on need. > > I understand that you would prefer no one ever used GFP_NOFs ever - just > use the scope API. I even agree. But this is not the place to make > that case. Any suggestion for a better wording? > > + * supported. 
From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka > > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka > > I don't think "aka" is the right thing to use here. It is short for > "also known as" and there is nothing that is being known as something > else. > It would be appropriate to say (i.e. GFP_NOWAIT is not supported). > "i.e." is short for the Latin "id est" which means "that is" and > normally introduces an alternate description (whereas aka introduces an > alternate name). OK > > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported). > > Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported. Because they cannot be passed to the page table allocator. In both cases the allocation would fail when system is short on memory. GFP_KERNEL used for ptes implicitly doesn't behave that way. > > > + * __GFP_NOWARN can be used to suppress error messages about failures. > > Surely "NOWARN" suppresses warning messages, not error messages .... I am not sure I follow. NOWARN means "do not warn" independently on the log level chosen for the message. Is an allocation failure an error message? Is the "vmalloc error: size %lu, failed to map pages" an error message? Anyway I will go with "__GFP_NOWARN can be used to suppress failure messages" Is that better? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags. 2021-10-26 7:10 ` Michal Hocko @ 2021-10-26 10:43 ` NeilBrown 2021-10-26 12:20 ` Michal Hocko 0 siblings, 1 reply; 26+ messages in thread From: NeilBrown @ 2021-10-26 10:43 UTC (permalink / raw) To: Michal Hocko Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig, Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton On Tue, 26 Oct 2021, Michal Hocko wrote: > On Tue 26-10-21 10:26:06, Neil Brown wrote: > > On Tue, 26 Oct 2021, Michal Hocko wrote: > > > From: Michal Hocko <mhocko@suse.com> > > > > > > The core of the vmalloc allocator __vmalloc_area_node doesn't say > > > anything about gfp mask argument. Not all gfp flags are supported > > > though. Be more explicit about constrains. > > > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > > --- > > > mm/vmalloc.c | 12 ++++++++++-- > > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > index 602649919a9d..2199d821c981 100644 > > > --- a/mm/vmalloc.c > > > +++ b/mm/vmalloc.c > > > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, > > > * @caller: caller's return address > > > * > > > * Allocate enough pages to cover @size from the page level > > > - * allocator with @gfp_mask flags. Map them into contiguous > > > - * kernel virtual space, using a pagetable protection of @prot. > > > + * allocator with @gfp_mask flags. Please note that the full set of gfp > > > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode > > > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not > > > > In what sense is GFP_KERNEL "preferred"?? > > The choice of GFP_NOFS, when necessary, isn't based on preference but > > on need. > > > > I understand that you would prefer no one ever used GFP_NOFs ever - just > > use the scope API. I even agree. But this is not the place to make > > that case. 
> > Any suggestion for a better wording? "GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all supported". > > > > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka > > > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka > > > > I don't think "aka" is the right thing to use here. It is short for > > "also known as" and there is nothing that is being known as something > > else. > > It would be appropriate to say (i.e. GFP_NOWAIT is not supported). > > "i.e." is short for the Latin "id est" which means "that is" and > > normally introduces an alternate description (whereas aka introduces an > > alternate name). > > OK > > > > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported). > > > > Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported. > > Because they cannot be passed to the page table allocator. In both cases > the allocation would fail when system is short on memory. GFP_KERNEL > used for ptes implicitly doesn't behave that way. Could you please point me to the particular allocation which uses GFP_KERNEL rather than the flags passed to __vmalloc_node()? I cannot find it. > > > > > > + * __GFP_NOWARN can be used to suppress error messages about failures. > > > > Surely "NOWARN" suppresses warning messages, not error messages .... > > I am not sure I follow. NOWARN means "do not warn" independently on the > log level chosen for the message. Is an allocation failure an error > message? Is the "vmalloc error: size %lu, failed to map pages" an error > message? If guess working with a C compiler has trained me to think that "warnings" are different from "errors". > > Anyway I will go with "__GFP_NOWARN can be used to suppress failure messages" > > Is that better? Yes, that's an excellent solution! Thanks. NeilBrown > -- > Michal Hocko > SUSE Labs > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags.
  2021-10-26 10:43 ` NeilBrown
@ 2021-10-26 12:20 ` Michal Hocko
  0 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2021-10-26 12:20 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig,
	Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton

On Tue 26-10-21 21:43:17, Neil Brown wrote:
> On Tue, 26 Oct 2021, Michal Hocko wrote:
> > On Tue 26-10-21 10:26:06, Neil Brown wrote:
> > > On Tue, 26 Oct 2021, Michal Hocko wrote:
> > > > From: Michal Hocko <mhocko@suse.com>
> > > > 
> > > > The core of the vmalloc allocator __vmalloc_area_node doesn't say
> > > > anything about gfp mask argument. Not all gfp flags are supported
> > > > though. Be more explicit about constrains.
> > > > 
> > > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > > ---
> > > >  mm/vmalloc.c | 12 ++++++++++--
> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > > index 602649919a9d..2199d821c981 100644
> > > > --- a/mm/vmalloc.c
> > > > +++ b/mm/vmalloc.c
> > > > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > > >   * @caller: caller's return address
> > > >   *
> > > >   * Allocate enough pages to cover @size from the page level
> > > > - * allocator with @gfp_mask flags. Map them into contiguous
> > > > - * kernel virtual space, using a pagetable protection of @prot.
> > > > + * allocator with @gfp_mask flags. Please note that the full set of gfp
> > > > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> > > > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
> > > 
> > > In what sense is GFP_KERNEL "preferred"??
> > > The choice of GFP_NOFS, when necessary, isn't based on preference but
> > > on need.
> > > 
> > > I understand that you would prefer no one ever used GFP_NOFs ever - just
> > > use the scope API. I even agree. But this is not the place to make
> > > that case.
> > 
> > Any suggestion for a better wording?
> 
> "GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all supported".

OK. Check the incremental update at the end of the email

> > > > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> > > > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
> > > 
> > > I don't think "aka" is the right thing to use here. It is short for
> > > "also known as" and there is nothing that is being known as something
> > > else.
> > > It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
> > > "i.e." is short for the Latin "id est" which means "that is" and
> > > normally introduces an alternate description (whereas aka introduces an
> > > alternate name).
> > 
> > OK
> > 
> > > > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
> > > 
> > > Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.
> > 
> > Because they cannot be passed to the page table allocator. In both cases
> > the allocation would fail when system is short on memory. GFP_KERNEL
> > used for ptes implicitly doesn't behave that way.
> 
> Could you please point me to the particular allocation which uses
> GFP_KERNEL rather than the flags passed to __vmalloc_node()? I cannot
> find it.
> 

It is dug
__vmalloc_area_node
  vmap_pages_range
    vmap_pages_range_noflush
      vmap_range_noflush || vmap_small_pages_range_noflush
        vmap_p4d_range
          p4d_alloc_track
            __p4d_alloc
              p4d_alloc_one
                get_zeroed_page(GFP_KERNEL_ACCOUNT)

the same applies for all other levels of page tables.

This is what I have currently
commit ae7fc6c2ef6949a76d697fc61bb350197dfca330
Author: Michal Hocko <mhocko@suse.com>
Date:   Tue Oct 26 14:16:32 2021 +0200

    fold me "mm/vmalloc: be more explicit about supported gfp flags."

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 2ddaa9410aee..82a07b04317e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2981,12 +2981,14 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
  *
  * Allocate enough pages to cover @size from the page level
  * allocator with @gfp_mask flags. Please note that the full set of gfp
- * flags are not supported. GFP_KERNEL would be a preferred allocation mode
- * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
- * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
- * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
- * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
- * __GFP_NOWARN can be used to suppress error messages about failures.
+ * flags are not supported. GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all
+ * supported.
+ * Zone modifiers are not supported. From the reclaim modifiers
+ * __GFP_DIRECT_RECLAIM is required (aka GFP_NOWAIT is not supported)
+ * and only __GFP_NOFAIL is supported (i.e. __GFP_NORETRY and
+ * __GFP_RETRY_MAYFAIL are not supported).
+ *
+ * __GFP_NOWARN can be used to suppress failures messages.
  *
  * Map them into contiguous kernel virtual space, using a pagetable
  * protection of @prot.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related [flat|nested] 26+ messages in thread
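The rules stated in the revised kernel-doc comment can be modelled as a small predicate. What follows is only a userspace sketch, not kernel code: the flag bit values are made up for illustration (the real definitions live in include/linux/gfp.h), only two zone modifiers are represented, and vmalloc_gfp_supported() is a hypothetical helper name that does not exist in the kernel. The composition of GFP_KERNEL, GFP_NOFS, GFP_NOIO and GFP_NOWAIT from the low-level bits follows the kernel headers of that era as far as I am aware.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; they do NOT match the kernel's. */
typedef unsigned int gfp_t;

#define __GFP_DMA            0x001u /* zone modifier */
#define __GFP_HIGHMEM        0x002u /* zone modifier */
#define __GFP_IO             0x004u
#define __GFP_FS             0x008u
#define __GFP_DIRECT_RECLAIM 0x010u
#define __GFP_KSWAPD_RECLAIM 0x020u
#define __GFP_NORETRY        0x040u
#define __GFP_RETRY_MAYFAIL  0x080u
#define __GFP_NOFAIL         0x100u
#define __GFP_NOWARN         0x200u

#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_NOFS      (__GFP_RECLAIM | __GFP_IO)
#define GFP_NOIO      (__GFP_RECLAIM)
#define GFP_NOWAIT    (__GFP_KSWAPD_RECLAIM)

/*
 * Hypothetical helper (not a kernel function): returns true when @gfp
 * satisfies the constraints listed in the __vmalloc_area_node comment.
 */
bool vmalloc_gfp_supported(gfp_t gfp)
{
	/* Zone modifiers are not supported. */
	if (gfp & (__GFP_DMA | __GFP_HIGHMEM))
		return false;

	/* __GFP_DIRECT_RECLAIM is required, which rules out GFP_NOWAIT. */
	if (!(gfp & __GFP_DIRECT_RECLAIM))
		return false;

	/* Of the retry modifiers, only __GFP_NOFAIL is supported. */
	if (gfp & (__GFP_NORETRY | __GFP_RETRY_MAYFAIL))
		return false;

	return true;
}
```

Under these assumptions GFP_KERNEL, GFP_NOFS and GFP_NOIO all pass, with or without __GFP_NOFAIL or __GFP_NOWARN, while GFP_NOWAIT and anything carrying __GFP_NORETRY or __GFP_RETRY_MAYFAIL is rejected. The last rule mirrors the explanation in the message above: the page table allocations at the bottom of the call chain use GFP_KERNEL_ACCOUNT regardless of the caller's mask.
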
* [PATCH 4/4] mm: allow !GFP_KERNEL allocations for kvmalloc
  2021-10-25 15:02 [PATCH 0/4] extend vmalloc support for constrained allocations Michal Hocko
                   ` (2 preceding siblings ...)
  2021-10-25 15:02 ` [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags Michal Hocko
@ 2021-10-25 15:02 ` Michal Hocko
  2021-10-25 23:34   ` NeilBrown
  3 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2021-10-25 15:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Dave Chinner, Neil Brown, Andrew Morton, Christoph Hellwig,
	Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton,
	Michal Hocko

From: Michal Hocko <mhocko@suse.com>

A support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented
by previous patches so we can allow the support for kvmalloc. This
will allow some external users to simplify or completely remove
their helpers.

GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
explicitly documented so let's add a note about that.

ceph_kvmalloc is the first helper to be dropped and changed to
kvmalloc.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/ceph/libceph.h |  1 -
 mm/util.c                    | 15 ++++-----------
 net/ceph/buffer.c            |  4 ++--
 net/ceph/ceph_common.c       | 27 ---------------------------
 net/ceph/crypto.c            |  2 +-
 net/ceph/messenger.c         |  2 +-
 net/ceph/messenger_v2.c      |  2 +-
 net/ceph/osdmap.c            | 12 ++++++------
 8 files changed, 15 insertions(+), 50 deletions(-)

diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
index 409d8c29bc4f..309acbcb5a8a 100644
--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -295,7 +295,6 @@ extern bool libceph_compatible(void *data);
 
 extern const char *ceph_msg_type_name(int type);
 extern int ceph_check_fsid(struct ceph_client *client, struct ceph_fsid *fsid);
-extern void *ceph_kvmalloc(size_t size, gfp_t flags);
 
 struct fs_parameter;
 struct fc_log;
diff --git a/mm/util.c b/mm/util.c
index bacabe446906..fdec6b4b1267 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -549,13 +549,10 @@ EXPORT_SYMBOL(vm_mmap);
  * Uses kmalloc to get the memory but if the allocation fails then falls back
  * to the vmalloc allocator. Use kvfree for freeing the memory.
  *
- * Reclaim modifiers - __GFP_NORETRY and __GFP_NOFAIL are not supported.
+ * Reclaim modifiers - __GFP_NORETRY and GFP_NOWAIT are not supported.
  * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
  * preferable to the vmalloc fallback, due to visible performance drawbacks.
  *
- * Please note that any use of gfp flags outside of GFP_KERNEL is careful to not
- * fall back to vmalloc.
- *
  * Return: pointer to the allocated memory of %NULL in case of failure
  */
 void *kvmalloc_node(size_t size, gfp_t flags, int node)
@@ -563,13 +560,6 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
 	gfp_t kmalloc_flags = flags;
 	void *ret;
 
-	/*
-	 * vmalloc uses GFP_KERNEL for some internal allocations (e.g page tables)
-	 * so the given set of flags has to be compatible.
-	 */
-	if ((flags & GFP_KERNEL) != GFP_KERNEL)
-		return kmalloc_node(size, flags, node);
-
 	/*
 	 * We want to attempt a large physically contiguous block first because
 	 * it is less likely to fragment multiple larger blocks and therefore
@@ -582,6 +572,9 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
 
 		if (!(kmalloc_flags & __GFP_RETRY_MAYFAIL))
 			kmalloc_flags |= __GFP_NORETRY;
+
+		/* nofail semantic is implemented by the vmalloc fallback */
+		kmalloc_flags &= ~__GFP_NOFAIL;
 	}
 
 	ret = kmalloc_node(size, kmalloc_flags, node);
diff --git a/net/ceph/buffer.c b/net/ceph/buffer.c
index 5622763ad402..7e51f128045d 100644
--- a/net/ceph/buffer.c
+++ b/net/ceph/buffer.c
@@ -7,7 +7,7 @@
 
 #include <linux/ceph/buffer.h>
 #include <linux/ceph/decode.h>
-#include <linux/ceph/libceph.h> /* for ceph_kvmalloc */
+#include <linux/ceph/libceph.h> /* for kvmalloc */
 
 struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp)
 {
@@ -17,7 +17,7 @@ struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp)
 	if (!b)
 		return NULL;
 
-	b->vec.iov_base = ceph_kvmalloc(len, gfp);
+	b->vec.iov_base = kvmalloc(len, gfp);
 	if (!b->vec.iov_base) {
 		kfree(b);
 		return NULL;
diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
index 97d6ea763e32..9441b4a4912b 100644
--- a/net/ceph/ceph_common.c
+++ b/net/ceph/ceph_common.c
@@ -190,33 +190,6 @@ int ceph_compare_options(struct ceph_options *new_opt,
 }
 EXPORT_SYMBOL(ceph_compare_options);
 
-/*
- * kvmalloc() doesn't fall back to the vmalloc allocator unless flags are
- * compatible with (a superset of) GFP_KERNEL. This is because while the
- * actual pages are allocated with the specified flags, the page table pages
- * are always allocated with GFP_KERNEL.
- *
- * ceph_kvmalloc() may be called with GFP_KERNEL, GFP_NOFS or GFP_NOIO.
- */
-void *ceph_kvmalloc(size_t size, gfp_t flags)
-{
-	void *p;
-
-	if ((flags & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS)) {
-		p = kvmalloc(size, flags);
-	} else if ((flags & (__GFP_IO | __GFP_FS)) == __GFP_IO) {
-		unsigned int nofs_flag = memalloc_nofs_save();
-		p = kvmalloc(size, GFP_KERNEL);
-		memalloc_nofs_restore(nofs_flag);
-	} else {
-		unsigned int noio_flag = memalloc_noio_save();
-		p = kvmalloc(size, GFP_KERNEL);
-		memalloc_noio_restore(noio_flag);
-	}
-
-	return p;
-}
-
 static int parse_fsid(const char *str, struct ceph_fsid *fsid)
 {
 	int i = 0;
diff --git a/net/ceph/crypto.c b/net/ceph/crypto.c
index 92d89b331645..051d22c0e4ad 100644
--- a/net/ceph/crypto.c
+++ b/net/ceph/crypto.c
@@ -147,7 +147,7 @@ void ceph_crypto_key_destroy(struct ceph_crypto_key *key)
 static const u8 *aes_iv = (u8 *)CEPH_AES_IV;
 
 /*
- * Should be used for buffers allocated with ceph_kvmalloc().
+ * Should be used for buffers allocated with kvmalloc().
  * Currently these are encrypt out-buffer (ceph_buffer) and decrypt
  * in-buffer (msg front).
  *
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 57d043b382ed..7b891be799d2 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1920,7 +1920,7 @@ struct ceph_msg *ceph_msg_new2(int type, int front_len, int max_data_items,
 
 	/* front */
 	if (front_len) {
-		m->front.iov_base = ceph_kvmalloc(front_len, flags);
+		m->front.iov_base = kvmalloc(front_len, flags);
 		if (m->front.iov_base == NULL) {
 			dout("ceph_msg_new can't allocate %d bytes\n",
 			     front_len);
diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index cc40ce4e02fb..c4099b641b38 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -308,7 +308,7 @@ static void *alloc_conn_buf(struct ceph_connection *con, int len)
 	if (WARN_ON(con->v2.conn_buf_cnt >= ARRAY_SIZE(con->v2.conn_bufs)))
 		return NULL;
 
-	buf = ceph_kvmalloc(len, GFP_NOIO);
+	buf = kvmalloc(len, GFP_NOIO);
 	if (!buf)
 		return NULL;
 
diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c
index 75b738083523..2823bb3cff55 100644
--- a/net/ceph/osdmap.c
+++ b/net/ceph/osdmap.c
@@ -980,7 +980,7 @@ static struct crush_work *alloc_workspace(const struct crush_map *c)
 
 	work_size = crush_work_size(c, CEPH_PG_MAX_SIZE);
 	dout("%s work_size %zu bytes\n", __func__, work_size);
-	work = ceph_kvmalloc(work_size, GFP_NOIO);
+	work = kvmalloc(work_size, GFP_NOIO);
 	if (!work)
 		return NULL;
 
@@ -1190,9 +1190,9 @@ static int osdmap_set_max_osd(struct ceph_osdmap *map, u32 max)
 	if (max == map->max_osd)
 		return 0;
 
-	state = ceph_kvmalloc(array_size(max, sizeof(*state)), GFP_NOFS);
-	weight = ceph_kvmalloc(array_size(max, sizeof(*weight)), GFP_NOFS);
-	addr = ceph_kvmalloc(array_size(max, sizeof(*addr)), GFP_NOFS);
+	state = kvmalloc(array_size(max, sizeof(*state)), GFP_NOFS);
+	weight = kvmalloc(array_size(max, sizeof(*weight)), GFP_NOFS);
+	addr = kvmalloc(array_size(max, sizeof(*addr)), GFP_NOFS);
 	if (!state || !weight || !addr) {
 		kvfree(state);
 		kvfree(weight);
@@ -1222,7 +1222,7 @@ static int osdmap_set_max_osd(struct ceph_osdmap *map, u32 max)
 	if (map->osd_primary_affinity) {
 		u32 *affinity;
 
-		affinity = ceph_kvmalloc(array_size(max, sizeof(*affinity)),
+		affinity = kvmalloc(array_size(max, sizeof(*affinity)),
 					 GFP_NOFS);
 		if (!affinity)
 			return -ENOMEM;
@@ -1503,7 +1503,7 @@ static int set_primary_affinity(struct ceph_osdmap *map, int osd, u32 aff)
 	if (!map->osd_primary_affinity) {
 		int i;
 
-		map->osd_primary_affinity = ceph_kvmalloc(
+		map->osd_primary_affinity = kvmalloc(
 		    array_size(map->max_osd, sizeof(*map->osd_primary_affinity)),
 		    GFP_NOFS);
 		if (!map->osd_primary_affinity)
-- 
2.30.2

^ permalink raw reply related [flat|nested] 26+ messages in thread
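Two pieces of flag logic in the patch above may be easier to follow in isolation: the scope that the removed ceph_kvmalloc() picked from the caller's mask, and the way kvmalloc_node() derives the flags for its first kmalloc attempt. The following is a rough userspace model only. Flag bit values are invented, scope_for_gfp() and kmalloc_attempt_flags() are hypothetical names, and the hunk does not show the condition enclosing the __GFP_NORETRY / __GFP_NOFAIL adjustment, so the sketch assumes it is a size check against an assumed 4096-byte page size.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; they do NOT match the kernel's. */
typedef unsigned int gfp_t;

#define __GFP_IO             0x004u
#define __GFP_FS             0x008u
#define __GFP_DIRECT_RECLAIM 0x010u
#define __GFP_KSWAPD_RECLAIM 0x020u
#define __GFP_NORETRY        0x040u
#define __GFP_RETRY_MAYFAIL  0x080u
#define __GFP_NOFAIL         0x100u

#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_NOFS      (__GFP_RECLAIM | __GFP_IO)
#define GFP_NOIO      (__GFP_RECLAIM)

#define SKETCH_PAGE_SIZE 4096u /* assumption, not taken from the patch */

enum scope { SCOPE_NONE, SCOPE_NOFS, SCOPE_NOIO };

/* Mirrors the three branches of the removed ceph_kvmalloc(). */
enum scope scope_for_gfp(gfp_t flags)
{
	gfp_t io_fs = flags & (__GFP_IO | __GFP_FS);

	if (io_fs == (__GFP_IO | __GFP_FS))
		return SCOPE_NONE; /* plain kvmalloc(size, flags) */
	if (io_fs == __GFP_IO)
		return SCOPE_NOFS; /* memalloc_nofs_save()/restore() */
	return SCOPE_NOIO;         /* memalloc_noio_save()/restore() */
}

/* Models the flags kvmalloc_node() passes to its kmalloc attempt. */
gfp_t kmalloc_attempt_flags(gfp_t flags, size_t size)
{
	gfp_t kmalloc_flags = flags;

	if (size > SKETCH_PAGE_SIZE) { /* assumed enclosing condition */
		/* Give up early so the vmalloc fallback gets a chance. */
		if (!(kmalloc_flags & __GFP_RETRY_MAYFAIL))
			kmalloc_flags |= __GFP_NORETRY;

		/* nofail semantic is implemented by the vmalloc fallback */
		kmalloc_flags &= ~__GFP_NOFAIL;
	}
	return kmalloc_flags;
}
```

With these definitions, a large GFP_KERNEL | __GFP_NOFAIL request makes its kmalloc attempt with GFP_KERNEL | __GFP_NORETRY, leaving the no-fail guarantee to the vmalloc fallback. With vmalloc honouring GFP_NOFS and GFP_NOIO directly, callers no longer need the scope selection modelled by scope_for_gfp().
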
* Re: [PATCH 4/4] mm: allow !GFP_KERNEL allocations for kvmalloc
  2021-10-25 15:02 ` [PATCH 4/4] mm: allow !GFP_KERNEL allocations for kvmalloc Michal Hocko
@ 2021-10-25 23:34 ` NeilBrown
  2021-10-26 7:15 ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2021-10-25 23:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig,
	Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton,
	Michal Hocko

On Tue, 26 Oct 2021, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> A support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented
> by previous patches so we can allow the support for kvmalloc. This
> will allow some external users to simplify or completely remove
> their helpers.
> 
> GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
> explicitly documented so let's add a note about that.
> 
> ceph_kvmalloc is the first helper to be dropped and changed to
> kvmalloc.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/ceph/libceph.h |  1 -
>  mm/util.c                    | 15 ++++-----------
>  net/ceph/buffer.c            |  4 ++--
>  net/ceph/ceph_common.c       | 27 ---------------------------
>  net/ceph/crypto.c            |  2 +-
>  net/ceph/messenger.c         |  2 +-
>  net/ceph/messenger_v2.c      |  2 +-
>  net/ceph/osdmap.c            | 12 ++++++------
>  8 files changed, 15 insertions(+), 50 deletions(-)
> 
> diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
> index 409d8c29bc4f..309acbcb5a8a 100644
> --- a/include/linux/ceph/libceph.h
> +++ b/include/linux/ceph/libceph.h
> @@ -295,7 +295,6 @@ extern bool libceph_compatible(void *data);
>  
>  extern const char *ceph_msg_type_name(int type);
>  extern int ceph_check_fsid(struct ceph_client *client, struct ceph_fsid *fsid);
> -extern void *ceph_kvmalloc(size_t size, gfp_t flags);
>  
>  struct fs_parameter;
>  struct fc_log;
> diff --git a/mm/util.c b/mm/util.c
> index bacabe446906..fdec6b4b1267 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -549,13 +549,10 @@ EXPORT_SYMBOL(vm_mmap);
>   * Uses kmalloc to get the memory but if the allocation fails then falls back
>   * to the vmalloc allocator. Use kvfree for freeing the memory.
>   *
> - * Reclaim modifiers - __GFP_NORETRY and __GFP_NOFAIL are not supported.
> + * Reclaim modifiers - __GFP_NORETRY and GFP_NOWAIT are not supported.

GFP_NOWAIT is not a modifier. It is a base value that can be modified.
I think you mean that
   __GFP_NORETRY is not supported and __GFP_DIRECT_RECLAIM is required

But I really cannot see why either of these statements are true.

Before your patch, __GFP_NORETRY would have forced use of kmalloc, so
that would mean it isn't really supported. But that doesn't happen any more.

Thanks,
NeilBrown

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/4] mm: allow !GFP_KERNEL allocations for kvmalloc
  2021-10-25 23:34 ` NeilBrown
@ 2021-10-26 7:15 ` Michal Hocko
  2021-10-26 10:48 ` NeilBrown
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2021-10-26 7:15 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig,
	Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton

On Tue 26-10-21 10:34:34, Neil Brown wrote:
> On Tue, 26 Oct 2021, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > A support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented
> > by previous patches so we can allow the support for kvmalloc. This
> > will allow some external users to simplify or completely remove
> > their helpers.
> > 
> > GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
> > explicitly documented so let's add a note about that.
> > 
> > ceph_kvmalloc is the first helper to be dropped and changed to
> > kvmalloc.
> > 
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  include/linux/ceph/libceph.h |  1 -
> >  mm/util.c                    | 15 ++++-----------
> >  net/ceph/buffer.c            |  4 ++--
> >  net/ceph/ceph_common.c       | 27 ---------------------------
> >  net/ceph/crypto.c            |  2 +-
> >  net/ceph/messenger.c         |  2 +-
> >  net/ceph/messenger_v2.c      |  2 +-
> >  net/ceph/osdmap.c            | 12 ++++++------
> >  8 files changed, 15 insertions(+), 50 deletions(-)
> > 
> > diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
> > index 409d8c29bc4f..309acbcb5a8a 100644
> > --- a/include/linux/ceph/libceph.h
> > +++ b/include/linux/ceph/libceph.h
> > @@ -295,7 +295,6 @@ extern bool libceph_compatible(void *data);
> >  
> >  extern const char *ceph_msg_type_name(int type);
> >  extern int ceph_check_fsid(struct ceph_client *client, struct ceph_fsid *fsid);
> > -extern void *ceph_kvmalloc(size_t size, gfp_t flags);
> >  
> >  struct fs_parameter;
> >  struct fc_log;
> > diff --git a/mm/util.c b/mm/util.c
> > index bacabe446906..fdec6b4b1267 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -549,13 +549,10 @@ EXPORT_SYMBOL(vm_mmap);
> >   * Uses kmalloc to get the memory but if the allocation fails then falls back
> >   * to the vmalloc allocator. Use kvfree for freeing the memory.
> >   *
> > - * Reclaim modifiers - __GFP_NORETRY and __GFP_NOFAIL are not supported.
> > + * Reclaim modifiers - __GFP_NORETRY and GFP_NOWAIT are not supported.
> 
> GFP_NOWAIT is not a modifier. It is a base value that can be modified.
> I think you mean that
>    __GFP_NORETRY is not supported and __GFP_DIRECT_RECLAIM is required

I thought naming the higher level gfp mask would be more helpful here.
Most people do not tend to think in terms of __GFP_DIRECT_RECLAIM but
rather GFP_NOWAIT or GFP_ATOMIC.

> But I really cannot see why either of these statements are true.

The reason is the same as why vmalloc does not support either of them.

> Before your patch, __GFP_NORETRY would have forced use of kmalloc, so
> that would mean it isn't really supported. But that doesn't happen any more.

__GFP_NORETRY is used internally by kvmalloc but that doesn't mean it is
supported by the caller. In fact __GFP_NORETRY is used to implement a
higher level logic of the prioritization between kmalloc and vmalloc
fallback because some users would rather see vmalloc fallback even for
smaller allocations which do not really fail otherwise (e.g. < order-4).
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/4] mm: allow !GFP_KERNEL allocations for kvmalloc
  2021-10-26 7:15 ` Michal Hocko
@ 2021-10-26 10:48 ` NeilBrown
  2021-10-26 12:23 ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2021-10-26 10:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig,
	Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton

On Tue, 26 Oct 2021, Michal Hocko wrote:
> On Tue 26-10-21 10:34:34, Neil Brown wrote:
> > On Tue, 26 Oct 2021, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > 
> > > A support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented
> > > by previous patches so we can allow the support for kvmalloc. This
> > > will allow some external users to simplify or completely remove
> > > their helpers.
> > > 
> > > GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
> > > explicitly documented so let's add a note about that.
> > > 
> > > ceph_kvmalloc is the first helper to be dropped and changed to
> > > kvmalloc.
> > > 
> > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >  include/linux/ceph/libceph.h |  1 -
> > >  mm/util.c                    | 15 ++++-----------
> > >  net/ceph/buffer.c            |  4 ++--
> > >  net/ceph/ceph_common.c       | 27 ---------------------------
> > >  net/ceph/crypto.c            |  2 +-
> > >  net/ceph/messenger.c         |  2 +-
> > >  net/ceph/messenger_v2.c      |  2 +-
> > >  net/ceph/osdmap.c            | 12 ++++++------
> > >  8 files changed, 15 insertions(+), 50 deletions(-)
> > > 
> > > diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
> > > index 409d8c29bc4f..309acbcb5a8a 100644
> > > --- a/include/linux/ceph/libceph.h
> > > +++ b/include/linux/ceph/libceph.h
> > > @@ -295,7 +295,6 @@ extern bool libceph_compatible(void *data);
> > >  
> > >  extern const char *ceph_msg_type_name(int type);
> > >  extern int ceph_check_fsid(struct ceph_client *client, struct ceph_fsid *fsid);
> > > -extern void *ceph_kvmalloc(size_t size, gfp_t flags);
> > >  
> > >  struct fs_parameter;
> > >  struct fc_log;
> > > diff --git a/mm/util.c b/mm/util.c
> > > index bacabe446906..fdec6b4b1267 100644
> > > --- a/mm/util.c
> > > +++ b/mm/util.c
> > > @@ -549,13 +549,10 @@ EXPORT_SYMBOL(vm_mmap);
> > >   * Uses kmalloc to get the memory but if the allocation fails then falls back
> > >   * to the vmalloc allocator. Use kvfree for freeing the memory.
> > >   *
> > > - * Reclaim modifiers - __GFP_NORETRY and __GFP_NOFAIL are not supported.
> > > + * Reclaim modifiers - __GFP_NORETRY and GFP_NOWAIT are not supported.
> > 
> > GFP_NOWAIT is not a modifier. It is a base value that can be modified.
> > I think you mean that
> >    __GFP_NORETRY is not supported and __GFP_DIRECT_RECLAIM is required
> 
> I thought naming the higher level gfp mask would be more helpful here.
> Most people do not tend to think in terms of __GFP_DIRECT_RECLAIM but
> rather GFP_NOWAIT or GFP_ATOMIC.

Maybe it would. But the text says "Reclaim modifiers" and then lists
one modifier and one mask. That is confusing.
If you want to mention both, keep them separate.

  GFP_NOWAIT and GFP_ATOMIC are not supported, neither is the
  __GFP_NORETRY modifier.

or something like that.

Thanks,
NeilBrown

> 
> > But I really cannot see why either of these statements are true.
> 
> The reason is the same as why vmalloc does not support either of them.
> 
> > Before your patch, __GFP_NORETRY would have forced use of kmalloc, so
> > that would mean it isn't really supported. But that doesn't happen any more.
> 
> __GFP_NORETRY is used internally by kvmalloc but that doesn't mean it is
> supported by the caller. In fact __GFP_NORETRY is used to implement a
> higher level logic of the prioritization between kmalloc and vmalloc
> fallback because some users would rather see vmalloc fallback even for
> smaller allocations which do not really fail otherwise (e.g. < order-4).
> -- 
> Michal Hocko
> SUSE Labs
> 
> 

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/4] mm: allow !GFP_KERNEL allocations for kvmalloc
  2021-10-26 10:48 ` NeilBrown
@ 2021-10-26 12:23 ` Michal Hocko
  0 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2021-10-26 12:23 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-mm, Dave Chinner, Andrew Morton, Christoph Hellwig,
	Uladzislau Rezki, linux-fsdevel, LKML, Ilya Dryomov, Jeff Layton

On Tue 26-10-21 21:48:05, Neil Brown wrote:
> On Tue, 26 Oct 2021, Michal Hocko wrote:
[...]
> > > GFP_NOWAIT is not a modifier. It is a base value that can be modified.
> > > I think you mean that
> > >    __GFP_NORETRY is not supported and __GFP_DIRECT_RECLAIM is required
> > 
> > I thought naming the higher level gfp mask would be more helpful here.
> > Most people do not tend to think in terms of __GFP_DIRECT_RECLAIM but
> > rather GFP_NOWAIT or GFP_ATOMIC.
> 
> Maybe it would. But the text says "Reclaim modifiers" and then lists
> one modifier and one mask. That is confusing.
> If you want to mention both, keep them separate.
> 
>   GFP_NOWAIT and GFP_ATOMIC are not supported, neither is the
>   __GFP_NORETRY modifier.
> 
> or something like that.

Fair enough. I went with this

commit fb93996c217cea864a3b3ffa8a8cd482bf0a1f62
Author: Michal Hocko <mhocko@suse.com>
Date:   Tue Oct 26 14:23:00 2021 +0200

    fold me "mm: allow !GFP_KERNEL allocations for kvmalloc"

diff --git a/mm/util.c b/mm/util.c
index fdec6b4b1267..1fb6dd907bb0 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -549,7 +549,7 @@ EXPORT_SYMBOL(vm_mmap);
  * Uses kmalloc to get the memory but if the allocation fails then falls back
  * to the vmalloc allocator. Use kvfree for freeing the memory.
  *
- * Reclaim modifiers - __GFP_NORETRY and GFP_NOWAIT are not supported.
+ * GFP_NOWAIT and GFP_ATOMIC are not supported, neither is the __GFP_NORETRY modifier.
  * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
  * preferable to the vmalloc fallback, due to visible performance drawbacks.
  *
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related [flat|nested] 26+ messages in thread