* [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: David Miller @ 2012-04-25 20:10 UTC
  To: linux-kernel; +Cc: yinghai, tj, torvalds

The comments above __alloc_bootmem_node() claim that the code will
first try the allocation using 'goal' and if that fails it will
try again but with the 'goal' requirement dropped.

Unfortunately, this is not what the code does, so fix it to do so.

This is important for nobootmem conversions to architectures such
as sparc where MAX_DMA_ADDRESS is infinity.

On such architectures all of the allocations done by generic spots,
such as the sparse-vmemmap implementation, will pass in:

	__pa(MAX_DMA_ADDRESS)

as the goal, and with the limit given as "-1" this will always fail
unless we add the appropriate fallback logic here.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 24f0fc1..e53bb8a 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
 	if (WARN_ON_ONCE(slab_is_available()))
 		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
 
+again:
 	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
 					 goal, -1ULL);
 	if (ptr)
 		return ptr;
 
-	return __alloc_memory_core_early(MAX_NUMNODES, size, align,
-					 goal, -1ULL);
+	ptr = __alloc_memory_core_early(MAX_NUMNODES, size, align,
+					goal, -1ULL);
+	if (!ptr && goal) {
+		goal = 0;
+		goto again;
+	}
+	return ptr;
 }
 
 void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
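
The retry logic in the patch can be tried out in userspace with a toy model. The sketch below is not kernel code: the region table, `ANY_NODE`, `alloc_core_early()` and `alloc_bootmem_node_model()` are hypothetical stand-ins invented for illustration, alignment and the limit argument are left out, and 'goal' is assumed to mean "lowest acceptable address". Only the control flow (the `again:` label, the any-node retry, and the reset of goal to 0) follows the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_NODE (-1)	/* stands in for MAX_NUMNODES in this model */

/* Hypothetical free region [start, end) that belongs to one node. */
struct region {
	int node;
	uint64_t start;
	uint64_t end;
};

/* Toy memory map, assumed for illustration: no memory above 0x3000. */
static const struct region free_regions[] = {
	{ 0, 0x1000, 0x2000 },
	{ 1, 0x2000, 0x3000 },
};

/*
 * Stand-in for __alloc_memory_core_early(): return the lowest address
 * at or above 'goal' that has room for 'size' bytes on 'node' (or on
 * any node), or 0 on failure. Nothing is actually reserved.
 */
static uint64_t alloc_core_early(int node, uint64_t size, uint64_t goal)
{
	size_t i;

	assert(size > 0);
	for (i = 0; i < sizeof(free_regions) / sizeof(free_regions[0]); i++) {
		const struct region *r = &free_regions[i];
		uint64_t base = r->start > goal ? r->start : goal;

		if (node != ANY_NODE && r->node != node)
			continue;
		if (base < r->end && r->end - base >= size)
			return base;
	}
	return 0;
}

/* Same control flow as the patched __alloc_bootmem_node(). */
static uint64_t alloc_bootmem_node_model(int node, uint64_t size, uint64_t goal)
{
	uint64_t ptr;

again:
	ptr = alloc_core_early(node, size, goal);	/* node + goal */
	if (ptr)
		return ptr;

	ptr = alloc_core_early(ANY_NODE, size, goal);	/* goal, any node */
	if (!ptr && goal) {
		goal = 0;	/* drop the goal and start over */
		goto again;
	}
	return ptr;
}
```

With a goal of UINT64_MAX (a rough analogue of the sparc case, where the goal lies above all memory), the first two attempts fail and the retry with goal = 0 succeeds on the requested node. A request that cannot be satisfied at all still terminates, because the second pass runs with goal == 0 and skips the `goto`.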
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: Tejun Heo @ 2012-04-25 20:12 UTC
  To: David Miller; +Cc: linux-kernel, yinghai, torvalds

On Wed, Apr 25, 2012 at 04:10:50PM -0400, David Miller wrote:
>
> The comments above __alloc_bootmem_node() claim that the code will
> first try the allocation using 'goal' and if that fails it will
> try again but with the 'goal' requirement dropped.
>
> Unfortunately, this is not what the code does, so fix it to do so.
>
> This is important for nobootmem conversions to architectures such
> as sparc where MAX_DMA_ADDRESS is infinity.
>
> On such architectures all of the allocations done by generic spots,
> such as the sparse-vmemmap implementation, will pass in:
>
> 	__pa(MAX_DMA_ADDRESS)
>
> as the goal, and with the limit given as "-1" this will always fail
> unless we add the appropriate fallback logic here.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: Yinghai Lu @ 2012-04-25 22:46 UTC
  To: David Miller; +Cc: linux-kernel, tj, torvalds

On Wed, Apr 25, 2012 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
>
> The comments above __alloc_bootmem_node() claim that the code will
> first try the allocation using 'goal' and if that fails it will
> try again but with the 'goal' requirement dropped.
>
> Unfortunately, this is not what the code does, so fix it to do so.
>
> This is important for nobootmem conversions to architectures such
> as sparc where MAX_DMA_ADDRESS is infinity.
>
> On such architectures all of the allocations done by generic spots,
> such as the sparse-vmemmap implementation, will pass in:
>
> 	__pa(MAX_DMA_ADDRESS)
>
> as the goal, and with the limit given as "-1" this will always fail
> unless we add the appropriate fallback logic here.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
> diff --git a/mm/nobootmem.c b/mm/nobootmem.c
> index 24f0fc1..e53bb8a 100644
> --- a/mm/nobootmem.c
> +++ b/mm/nobootmem.c
> @@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
>  	if (WARN_ON_ONCE(slab_is_available()))
>  		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
>
> +again:
>  	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
>  					 goal, -1ULL);
>  	if (ptr)
>  		return ptr;

If you want to be consistent to bootmem version.

again label should be here instead.

>
> -	return __alloc_memory_core_early(MAX_NUMNODES, size, align,
> -					 goal, -1ULL);
> +	ptr = __alloc_memory_core_early(MAX_NUMNODES, size, align,
> +					goal, -1ULL);
> +	if (!ptr && goal) {
> +		goal = 0;
> +		goto again;
> +	}
> +	return ptr;
>  }

Thanks

Yinghai
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: David Miller @ 2012-04-25 23:00 UTC
  To: yinghai; +Cc: linux-kernel, tj, torvalds

From: Yinghai Lu <yinghai@kernel.org>
Date: Wed, 25 Apr 2012 15:46:42 -0700

> On Wed, Apr 25, 2012 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
>> @@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
>>  	if (WARN_ON_ONCE(slab_is_available()))
>>  		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
>>
>> +again:
>>  	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
>>  					 goal, -1ULL);
>>  	if (ptr)
>>  		return ptr;
>
> If you want to be consistent to bootmem version.
>
> again label should be here instead.

It is merely an artifact of implementation that the bootmem version
doesn't try to respect the given node if the goal cannot be satisfied,
and in fact I would classify that as a bug that needs to be fixed.

Therefore, I believe the bootmem case is what needs to be adjusted
instead.
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: Yinghai Lu @ 2012-04-25 23:14 UTC
  To: David Miller; +Cc: linux-kernel, tj, torvalds

On Wed, Apr 25, 2012 at 4:00 PM, David Miller <davem@davemloft.net> wrote:
> From: Yinghai Lu <yinghai@kernel.org>
> Date: Wed, 25 Apr 2012 15:46:42 -0700
>
>> On Wed, Apr 25, 2012 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
>>> @@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
>>>  	if (WARN_ON_ONCE(slab_is_available()))
>>>  		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
>>>
>>> +again:
>>>  	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
>>>  					 goal, -1ULL);
>>>  	if (ptr)
>>>  		return ptr;
>>
>> If you want to be consistent to bootmem version.
>>
>> again label should be here instead.
>
> It is merely an artifact of implementation that the bootmem version
> doesn't try to respect the given node if the goal cannot be satisfied,
> and in fact I would classify that as a bug that needs to be fixed.
>
> Therefore, I believe the bootmem case is what needs to be adjusted
> instead.

Yes.

Acked-by: Yinghai Lu <yinghai@kernel.org>

Linus will pick it directly or through your sparc nobootmem conversion?

Yinghai
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: David Miller @ 2012-04-25 23:15 UTC
  To: yinghai; +Cc: linux-kernel, tj, torvalds

From: Yinghai Lu <yinghai@kernel.org>
Date: Wed, 25 Apr 2012 16:14:00 -0700

> On Wed, Apr 25, 2012 at 4:00 PM, David Miller <davem@davemloft.net> wrote:
>> From: Yinghai Lu <yinghai@kernel.org>
>> Date: Wed, 25 Apr 2012 15:46:42 -0700
>>
>>> On Wed, Apr 25, 2012 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
>>>> @@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
>>>>  	if (WARN_ON_ONCE(slab_is_available()))
>>>>  		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
>>>>
>>>> +again:
>>>>  	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
>>>>  					 goal, -1ULL);
>>>>  	if (ptr)
>>>>  		return ptr;
>>>
>>> If you want to be consistent to bootmem version.
>>>
>>> again label should be here instead.
>>
>> It is merely an artifact of implementation that the bootmem version
>> doesn't try to respect the given node if the goal cannot be satisfied,
>> and in fact I would classify that as a bug that needs to be fixed.
>>
>> Therefore, I believe the bootmem case is what needs to be adjusted
>> instead.
>
> Yes.
>
> Acked-by: Yinghai Lu <yinghai@kernel.org>
>
> Linus will pick it directly or through your sparc nobootmem conversion?

I was hoping Linus would take this directly.
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: Johannes Weiner @ 2012-05-03 15:28 UTC
  To: David Miller; +Cc: yinghai, linux-kernel, tj, torvalds

On Wed, Apr 25, 2012 at 07:00:34PM -0400, David Miller wrote:
> From: Yinghai Lu <yinghai@kernel.org>
> Date: Wed, 25 Apr 2012 15:46:42 -0700
>
> > On Wed, Apr 25, 2012 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
> >> @@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
> >>  	if (WARN_ON_ONCE(slab_is_available()))
> >>  		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
> >>
> >> +again:
> >>  	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
> >>  					 goal, -1ULL);
> >>  	if (ptr)
> >>  		return ptr;
> >
> > If you want to be consistent to bootmem version.
> >
> > again label should be here instead.
>
> It is merely an artifact of implementation that the bootmem version
> doesn't try to respect the given node if the goal cannot be satisfied,
> and in fact I would classify that as a bug that needs to be fixed.
>
> Therefore, I believe the bootmem case is what needs to be adjusted
> instead.

Now it does: node+goal, goal, node, anywhere

whereas the memblock version of __alloc_bootmem_node_nopanic() also
still does: node+goal, goal, anywhere

Your description suggests that the node should be higher prioritized
than the goal, which I understand as: node+goal, node, anywhere.

Which do we actually want?
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: David Miller @ 2012-05-03 17:04 UTC
  To: hannes; +Cc: yinghai, linux-kernel, tj, torvalds

From: Johannes Weiner <hannes@cmpxchg.org>
Date: Thu, 3 May 2012 17:28:41 +0200

> On Wed, Apr 25, 2012 at 07:00:34PM -0400, David Miller wrote:
>> From: Yinghai Lu <yinghai@kernel.org>
>> Date: Wed, 25 Apr 2012 15:46:42 -0700
>>
>> > On Wed, Apr 25, 2012 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
>> >> @@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
>> >>  	if (WARN_ON_ONCE(slab_is_available()))
>> >>  		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
>> >>
>> >> +again:
>> >>  	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
>> >>  					 goal, -1ULL);
>> >>  	if (ptr)
>> >>  		return ptr;
>> >
>> > If you want to be consistent to bootmem version.
>> >
>> > again label should be here instead.
>>
>> It is merely an artifact of implementation that the bootmem version
>> doesn't try to respect the given node if the goal cannot be satisfied,
>> and in fact I would classify that as a bug that needs to be fixed.
>>
>> Therefore, I believe the bootmem case is what needs to be adjusted
>> instead.
>
> Now it does: node+goal, goal, node, anywhere
>
> whereas the memblock version of __alloc_bootmem_node_nopanic() also
> still does: node+goal, goal, anywhere
>
> Your description suggests that the node should be higher prioritized
> than the goal, which I understand as: node+goal, node, anywhere.
>
> Which do we actually want?

I think the goal is what needs to be prioritized. An explicit goal usually
has a requirement, like "I need physical memory in the low 32-bits" and if
they specified an explicit node they really mean "and give me it on NUMA
node X if you can." Hence the sequence:

	node+goal, goal, node, any

the only other reasonable option would be:

	node+goal, node, goal, any

but I think that doesn't match what people want when an explicit goal
is specified. Do you?
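
The difference between the two candidate orders shows up only when the requested node has no memory at or above the goal. The following is a minimal sketch, not kernel code: `struct attempt`, the `goal_first` and `node_first` tables, `try_once()`, `alloc_with_policy()` and the two-node memory map are all hypothetical names and data made up for illustration, and 'goal' is again assumed to be a lower bound on the returned address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free region [start, end) that belongs to one node. */
struct region {
	int node;
	uint64_t start;
	uint64_t end;
};

/* One step of a fallback policy: which constraints are still honoured. */
struct attempt {
	int use_node;
	int use_goal;
};

/* Toy memory map, assumed for illustration. */
static const struct region toy_map[] = {
	{ 0, 0x1000, 0x2000 },
	{ 1, 0x2000, 0x3000 },
};

/* node+goal, goal, node, any */
static const struct attempt goal_first[4] = {
	{ 1, 1 }, { 0, 1 }, { 1, 0 }, { 0, 0 },
};

/* node+goal, node, goal, any */
static const struct attempt node_first[4] = {
	{ 1, 1 }, { 1, 0 }, { 0, 1 }, { 0, 0 },
};

/* Try one step; return the lowest matching address or 0 on failure. */
static uint64_t try_once(const struct attempt *a, int node,
			 uint64_t size, uint64_t goal)
{
	uint64_t floor = a->use_goal ? goal : 0;
	size_t i;

	for (i = 0; i < sizeof(toy_map) / sizeof(toy_map[0]); i++) {
		const struct region *r = &toy_map[i];
		uint64_t base = r->start > floor ? r->start : floor;

		if (a->use_node && r->node != node)
			continue;
		if (base < r->end && r->end - base >= size)
			return base;
	}
	return 0;
}

/* Walk the four steps of a policy in order until one succeeds. */
static uint64_t alloc_with_policy(const struct attempt policy[4], int node,
				  uint64_t size, uint64_t goal)
{
	uint64_t ptr;
	size_t i;

	for (i = 0; i < 4; i++) {
		ptr = try_once(&policy[i], node, size, goal);
		if (ptr)
			return ptr;
	}
	return 0;
}
```

In this model, a request for node 0 with goal 0x2800 lands at 0x2800 on node 1 under `goal_first`, and at 0x1000 on node 0 under `node_first`. When the goal can be met on the requested node, both tables give the same answer.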
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: Johannes Weiner @ 2012-05-04 9:41 UTC
  To: David Miller; +Cc: yinghai, linux-kernel, tj, torvalds

On Thu, May 03, 2012 at 01:04:16PM -0400, David Miller wrote:
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Thu, 3 May 2012 17:28:41 +0200
>
> > On Wed, Apr 25, 2012 at 07:00:34PM -0400, David Miller wrote:
> >> From: Yinghai Lu <yinghai@kernel.org>
> >> Date: Wed, 25 Apr 2012 15:46:42 -0700
> >>
> >> > On Wed, Apr 25, 2012 at 1:10 PM, David Miller <davem@davemloft.net> wrote:
> >> >> @@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
> >> >>  	if (WARN_ON_ONCE(slab_is_available()))
> >> >>  		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
> >> >>
> >> >> +again:
> >> >>  	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
> >> >>  					 goal, -1ULL);
> >> >>  	if (ptr)
> >> >>  		return ptr;
> >> >
> >> > If you want to be consistent to bootmem version.
> >> >
> >> > again label should be here instead.
> >>
> >> It is merely an artifact of implementation that the bootmem version
> >> doesn't try to respect the given node if the goal cannot be satisfied,
> >> and in fact I would classify that as a bug that needs to be fixed.
> >>
> >> Therefore, I believe the bootmem case is what needs to be adjusted
> >> instead.
> >
> > Now it does: node+goal, goal, node, anywhere
> >
> > whereas the memblock version of __alloc_bootmem_node_nopanic() also
> > still does: node+goal, goal, anywhere
> >
> > Your description suggests that the node should be higher prioritized
> > than the goal, which I understand as: node+goal, node, anywhere.
> >
> > Which do we actually want?
>
> I think the goal is what needs to be prioritized. An explicit goal usually
> has a requirement, like "I need physical memory in the low 32-bits" and if
> they specified an explicit node they really mean "and give me it on NUMA
> node X if you can." Hence the sequence:
>
> 	node+goal, goal, node, any
>
> the only other reasonable option would be:
>
> 	node+goal, node, goal, any
>
> but I think that doesn't match what people want when an explicit goal
> is specified. Do you?

Oh I think that's what limit is for. The goal is usually to allocate
high address memory for users that can deal with it and keep lowmem
for users that can't.

For example, I can imagine sparsemem usemap allocation in the memory
hotplug case would prefer having the usemap on the same node as the
corresponding pgdat descriptor than allocating on any node above the
goal and possibly create circular dependencies.

But that is quite rare/unlikely anyway, and I guess in most other
cases it's better to go for preventing lowmem exhaustion than to
preserve node locality.

So I'm fine with this priority order, but it's a judgement call.

I'll send patches to make everything use the same policy.
* Re: [PATCH] mm: nobootmem: Correct alloc_bootmem semantics.
From: David Miller @ 2012-05-04 14:46 UTC
  To: hannes; +Cc: yinghai, linux-kernel, tj, torvalds

From: Johannes Weiner <hannes@cmpxchg.org>
Date: Fri, 4 May 2012 11:41:05 +0200

> I'll send patches to make everything use the same policy.

Thanks for doing this Johannes.