linux-kernel.vger.kernel.org archive mirror
* __GFP_REPEAT usage in fq_alloc_node
@ 2017-01-06 15:20 Michal Hocko
  2017-01-06 15:39 ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2017-01-06 15:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

Hi Eric,
I am currently checking users of kmalloc with a vmalloc fallback and
converting them to the new kvmalloc helper [1]. While adding support
for __GFP_REPEAT to kvmalloc [2], I was wondering what the reason is
for using __GFP_REPEAT in fq_alloc_node in the first place.
c3bd85495aef ("pkt_sched: fq: more robust memory allocation") doesn't
mention anything. Could you clarify this, please?
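
For context, the fallback logic in the proposed helper is roughly the
following (a simplified sketch of [1]; the exact gfp policy for
__GFP_REPEAT is what [2] is about):

        static void *kvmalloc_node(size_t size, gfp_t flags, int node)
        {
                gfp_t kmalloc_flags = flags | __GFP_NOWARN;
                void *ptr;

                /*
                 * Do not retry hard or invoke the OOM killer for the
                 * physically contiguous attempt - we have a vmalloc fallback.
                 */
                if (size > PAGE_SIZE)
                        kmalloc_flags |= __GFP_NORETRY;

                ptr = kmalloc_node(size, kmalloc_flags, node);

                /* a vmalloc fallback makes no sense for sub-page requests */
                if (ptr || size <= PAGE_SIZE)
                        return ptr;

                /* the vmalloc side implicitly uses GFP_KERNEL in this sketch */
                return vmalloc_node(size, node);
        }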

Thanks!

[1] http://lkml.kernel.org/r/20170102133700.1734-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20170104181229.GB10183@dhcp22.suse.cz
-- 
Michal Hocko
SUSE Labs


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 15:20 __GFP_REPEAT usage in fq_alloc_node Michal Hocko
@ 2017-01-06 15:39 ` Eric Dumazet
  2017-01-06 16:07   ` Michal Hocko
  2017-01-06 16:31   ` __GFP_REPEAT usage in fq_alloc_node Vlastimil Babka
  0 siblings, 2 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-06 15:39 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
>
> Hi Eric,
> I am currently checking users of kmalloc with a vmalloc fallback and
> converting them to the new kvmalloc helper [1]. While adding support
> for __GFP_REPEAT to kvmalloc [2], I was wondering what the reason is
> for using __GFP_REPEAT in fq_alloc_node in the first place.
> c3bd85495aef ("pkt_sched: fq: more robust memory allocation") doesn't
> mention anything. Could you clarify this, please?
>
> Thanks!

I guess this question applies to all __GFP_REPEAT usages in net/ ?

At the time, tests on the hardware I had in my labs showed that
vmalloc() could deliver pages spread all over the memory and that was a
small penalty (once memory is fragmented enough, not at boot time).

I guess this won't be a concern anymore if I can finish my pending work
on vmalloc() trying to get adjacent pages:
https://lkml.org/lkml/2016/12/21/285
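
The rough idea (a hypothetical sketch of the concept only - the helper
name is made up and the actual patches differ): when populating a
vmalloc area, opportunistically try a higher-order allocation and split
it, so that consecutive virtual pages are also physically adjacent:

        static int fill_area_prefer_contig(struct page **pages,
                                           unsigned int nr_pages,
                                           gfp_t gfp, int node)
        {
                unsigned int i = 0;

                while (i < nr_pages) {
                        struct page *page = NULL;
                        unsigned int order = 0, j;

                        if (nr_pages - i >= 8) {
                                /* try 8 contiguous pages, but fail fast */
                                page = alloc_pages_node(node, gfp |
                                                __GFP_NORETRY | __GFP_NOWARN, 3);
                                if (page)
                                        order = 3;
                        }
                        if (!page) {
                                page = alloc_pages_node(node, gfp, 0);
                                if (!page)
                                        return -ENOMEM;
                        }
                        if (order) {
                                /* split into 2^order independent order-0 pages */
                                split_page(page, order);
                        }
                        for (j = 0; j < (1U << order); j++)
                                pages[i++] = page + j;
                }
                return 0;
        }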

Thanks.


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 15:39 ` Eric Dumazet
@ 2017-01-06 16:07   ` Michal Hocko
  2017-01-06 16:19     ` Michal Hocko
  2017-01-14 23:43     ` [PATCH] net_sched: use kvmalloc rather than opencoded variant kbuild test robot
  2017-01-06 16:31   ` __GFP_REPEAT usage in fq_alloc_node Vlastimil Babka
  1 sibling, 2 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-06 16:07 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Fri 06-01-17 07:39:14, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >
> > Hi Eric,
> > I am currently checking users of kmalloc with a vmalloc fallback and
> > converting them to the new kvmalloc helper [1]. While adding support
> > for __GFP_REPEAT to kvmalloc [2], I was wondering what the reason is
> > for using __GFP_REPEAT in fq_alloc_node in the first place.
> > c3bd85495aef ("pkt_sched: fq: more robust memory allocation") doesn't
> > mention anything. Could you clarify this, please?
> >
> > Thanks!
> 
> I guess this question applies to all __GFP_REPEAT usages in net/ ?

I am _currently_ interested only in those which have a vmalloc fallback
and cannot see more of them. Maybe my git grep-fu needs some help.

> At the time, tests on the hardware I had in my labs showed that
> vmalloc() could deliver pages spread all over the memory and that was a
> small penalty (once memory is fragmented enough, not at boot time).

I see. Then I will go with kvmalloc with __GFP_REPEAT, and we can drop
the flag later once it is no longer needed. See the patch below.

Thanks for the clarification.

> I guess this won't be a concern anymore if I can finish my pending work
> on vmalloc() trying to get adjacent pages:
> https://lkml.org/lkml/2016/12/21/285

I see.

Thanks!
---
From 1f3769de85c18aa0796f215cffdb01a2e70d2d2f Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Fri, 6 Jan 2017 17:03:31 +0100
Subject: [PATCH] net_sched: use kvmalloc rather than opencoded variant

fq_alloc_node opencodes kmalloc with vmalloc fallback. Use the kvmalloc
variant instead. Keep the __GFP_REPEAT flag based on explanation from
Eric:
"
At the time, tests on the hardware I had in my labs showed that
vmalloc() could deliver pages spread all over the memory and that was a
small penalty (once memory is fragmented enough, not at boot time)
"

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 net/sched/sch_fq.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index a4f738ac7728..594f77d89f6c 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -624,16 +624,6 @@ static void fq_rehash(struct fq_sched_data *q,
 	q->stat_gc_flows += fcnt;
 }
 
-static void *fq_alloc_node(size_t sz, int node)
-{
-	void *ptr;
-
-	ptr = kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node);
-	if (!ptr)
-		ptr = vmalloc_node(sz, node);
-	return ptr;
-}
-
 static void fq_free(void *addr)
 {
 	kvfree(addr);
@@ -650,7 +640,7 @@ static int fq_resize(struct Qdisc *sch, u32 log)
 		return 0;
 
 	/* If XPS was setup, we can allocate memory on right NUMA node */
-	array = fq_alloc_node(sizeof(struct rb_root) << log,
+	array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
 			      netdev_queue_numa_node_read(sch->dev_queue));
 	if (!array)
 		return -ENOMEM;
-- 
2.11.0


-- 
Michal Hocko
SUSE Labs


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:07   ` Michal Hocko
@ 2017-01-06 16:19     ` Michal Hocko
  2017-01-07  3:33       ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
                         ` (2 more replies)
  2017-01-14 23:43     ` [PATCH] net_sched: use kvmalloc rather than opencoded variant kbuild test robot
  1 sibling, 3 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-06 16:19 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Fri 06-01-17 17:07:43, Michal Hocko wrote:
> On Fri 06-01-17 07:39:14, Eric Dumazet wrote:
> > On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > Hi Eric,
> > > I am currently checking users of kmalloc with a vmalloc fallback and
> > > converting them to the new kvmalloc helper [1]. While adding support
> > > for __GFP_REPEAT to kvmalloc [2], I was wondering what the reason is
> > > for using __GFP_REPEAT in fq_alloc_node in the first place.
> > > c3bd85495aef ("pkt_sched: fq: more robust memory allocation") doesn't
> > > mention anything. Could you clarify this, please?
> > >
> > > Thanks!
> > 
> > I guess this question applies to all __GFP_REPEAT usages in net/ ?
> 
> I am _currently_ interested only in those which have a vmalloc fallback
> and cannot see more of them. Maybe my git grep-fu needs some help.

I was obviously blind...
---
From 8eadf8774daecdd6c4de37339216282a16920458 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Fri, 6 Jan 2017 17:03:31 +0100
Subject: [PATCH] net: use kvmalloc rather than open coded variant

fq_alloc_node, alloc_netdev_mqs and netif_alloc* open code kmalloc
with vmalloc fallback. Use the kvmalloc variant instead. Keep the
__GFP_REPEAT flag based on explanation from
Eric:
"
At the time, tests on the hardware I had in my labs showed that
vmalloc() could deliver pages spread all over the memory and that was a
small penalty (once memory is fragmented enough, not at boot time)
"

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 net/core/dev.c     | 24 +++++++++---------------
 net/sched/sch_fq.c | 12 +-----------
 2 files changed, 10 insertions(+), 26 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 56818f7eab2b..5cf2762387aa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7111,12 +7111,10 @@ static int netif_alloc_rx_queues(struct net_device *dev)
 
 	BUG_ON(count < 1);
 
-	rx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
-	if (!rx) {
-		rx = vzalloc(sz);
-		if (!rx)
-			return -ENOMEM;
-	}
+	rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
+	if (!rx)
+		return -ENOMEM;
+
 	dev->_rx = rx;
 
 	for (i = 0; i < count; i++)
@@ -7153,12 +7151,10 @@ static int netif_alloc_netdev_queues(struct net_device *dev)
 	if (count < 1 || count > 0xffff)
 		return -EINVAL;
 
-	tx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
-	if (!tx) {
-		tx = vzalloc(sz);
-		if (!tx)
-			return -ENOMEM;
-	}
+	tx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
+	if (!tx)
+		return -ENOMEM;
+
 	dev->_tx = tx;
 
 	netdev_for_each_tx_queue(dev, netdev_init_one_queue, NULL);
@@ -7691,9 +7687,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	/* ensure 32-byte alignment of whole construct */
 	alloc_size += NETDEV_ALIGN - 1;
 
-	p = kzalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
-	if (!p)
-		p = vzalloc(alloc_size);
+	p = kvzalloc(alloc_size, GFP_KERNEL | __GFP_REPEAT);
 	if (!p)
 		return NULL;
 
diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index a4f738ac7728..594f77d89f6c 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -624,16 +624,6 @@ static void fq_rehash(struct fq_sched_data *q,
 	q->stat_gc_flows += fcnt;
 }
 
-static void *fq_alloc_node(size_t sz, int node)
-{
-	void *ptr;
-
-	ptr = kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node);
-	if (!ptr)
-		ptr = vmalloc_node(sz, node);
-	return ptr;
-}
-
 static void fq_free(void *addr)
 {
 	kvfree(addr);
@@ -650,7 +640,7 @@ static int fq_resize(struct Qdisc *sch, u32 log)
 		return 0;
 
 	/* If XPS was setup, we can allocate memory on right NUMA node */
-	array = fq_alloc_node(sizeof(struct rb_root) << log,
+	array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
 			      netdev_queue_numa_node_read(sch->dev_queue));
 	if (!array)
 		return -ENOMEM;
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 15:39 ` Eric Dumazet
  2017-01-06 16:07   ` Michal Hocko
@ 2017-01-06 16:31   ` Vlastimil Babka
  2017-01-06 16:48     ` Eric Dumazet
  1 sibling, 1 reply; 19+ messages in thread
From: Vlastimil Babka @ 2017-01-06 16:31 UTC (permalink / raw)
  To: Eric Dumazet, Michal Hocko; +Cc: linux-mm, LKML

On 01/06/2017 04:39 PM, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>
>> Hi Eric,
>> I am currently checking users of kmalloc with a vmalloc fallback and
>> converting them to the new kvmalloc helper [1]. While adding support
>> for __GFP_REPEAT to kvmalloc [2], I was wondering what the reason is
>> for using __GFP_REPEAT in fq_alloc_node in the first place.
>> c3bd85495aef ("pkt_sched: fq: more robust memory allocation") doesn't
>> mention anything. Could you clarify this, please?
>>
>> Thanks!
> 
> I guess this question applies to all __GFP_REPEAT usages in net/ ?
> 
> At the time, tests on the hardware I had in my labs showed that
> vmalloc() could deliver pages spread all over the memory and that was a
> small penalty (once memory is fragmented enough, not at boot time).

I wonder what's the cause of the penalty (when accessing the vmapped
area, I suppose?). Is it a higher risk of cache collision misses within
the area, compared to consecutive physical addresses?

> I guess this won't be a concern anymore if I can finish my pending work
> on vmalloc() trying to get adjacent pages:
> https://lkml.org/lkml/2016/12/21/285
> 
> Thanks.
> 


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:31   ` __GFP_REPEAT usage in fq_alloc_node Vlastimil Babka
@ 2017-01-06 16:48     ` Eric Dumazet
  2017-01-06 16:50       ` Eric Dumazet
  2017-01-06 16:55       ` Vlastimil Babka
  0 siblings, 2 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-06 16:48 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Michal Hocko, linux-mm, LKML

On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:

>
> I wonder what's the cause of the penalty (when accessing the vmapped
> area, I suppose?). Is it a higher risk of cache collision misses within
> the area, compared to consecutive physical addresses?

I believe tests were done with 48 fq qdiscs, each having 2^16 slots.
So I had 48 blocks of 524288 bytes.

Trying a bit harder at setup time to get 128 consecutive pages got
less TLB pressure.


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:48     ` Eric Dumazet
@ 2017-01-06 16:50       ` Eric Dumazet
  2017-01-06 16:55       ` Vlastimil Babka
  1 sibling, 0 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-06 16:50 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Michal Hocko, linux-mm, LKML

On Fri, Jan 6, 2017 at 8:48 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>
>>
>> I wonder what's the cause of the penalty (when accessing the vmapped
>> area, I suppose?). Is it a higher risk of cache collision misses within
>> the area, compared to consecutive physical addresses?
>
> I believe tests were done with 48 fq qdiscs, each having 2^16 slots.
> So I had 48 blocks of 524288 bytes.
>
> Trying a bit harder at setup time to get 128 consecutive pages got
> less TLB pressure.

Forgot to mention tests include DDOS, so hitting a random hash bucket
for every packet.


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:48     ` Eric Dumazet
  2017-01-06 16:50       ` Eric Dumazet
@ 2017-01-06 16:55       ` Vlastimil Babka
  2017-01-06 17:08         ` Eric Dumazet
  1 sibling, 1 reply; 19+ messages in thread
From: Vlastimil Babka @ 2017-01-06 16:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Michal Hocko, linux-mm, LKML

On 01/06/2017 05:48 PM, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> 
>>
>> I wonder what's the cause of the penalty (when accessing the vmapped
>> area, I suppose?). Is it a higher risk of cache collision misses within
>> the area, compared to consecutive physical addresses?
> 
> I believe tests were done with 48 fq qdiscs, each having 2^16 slots.
> So I had 48 blocks of 524288 bytes.
> 
> Trying a bit harder at setup time to get 128 consecutive pages got
> less TLB pressure.

Hmm, that's rather surprising to me. The TLB caches page table lookups,
and the PFNs of the physical pages it translates to shouldn't matter -
the page tables will look the same. With 128 consecutive pages, the
reduced cache collision miss effect could manifest, though.



* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:55       ` Vlastimil Babka
@ 2017-01-06 17:08         ` Eric Dumazet
  2017-01-06 17:18           ` Vlastimil Babka
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2017-01-06 17:08 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Michal Hocko, linux-mm, LKML

On Fri, Jan 6, 2017 at 8:55 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 01/06/2017 05:48 PM, Eric Dumazet wrote:
>> On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>>>
>>> I wonder what's the cause of the penalty (when accessing the vmapped
>>> area, I suppose?). Is it a higher risk of cache collision misses within
>>> the area, compared to consecutive physical addresses?
>>
>> I believe tests were done with 48 fq qdiscs, each having 2^16 slots.
>> So I had 48 blocks of 524288 bytes.
>>
>> Trying a bit harder at setup time to get 128 consecutive pages got
>> less TLB pressure.
>
> Hmm, that's rather surprising to me. The TLB caches page table lookups,
> and the PFNs of the physical pages it translates to shouldn't matter -
> the page tables will look the same. With 128 consecutive pages, the
> reduced cache collision miss effect could manifest, though.
>

To be clear, the difference came from:

Using kmalloc() to allocate 48 x 524288 bytes

Or using vmalloc()

Are you telling me HugePages are not in play there?


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 17:08         ` Eric Dumazet
@ 2017-01-06 17:18           ` Vlastimil Babka
  0 siblings, 0 replies; 19+ messages in thread
From: Vlastimil Babka @ 2017-01-06 17:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Michal Hocko, linux-mm, LKML

On 01/06/2017 06:08 PM, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 8:55 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 01/06/2017 05:48 PM, Eric Dumazet wrote:
>>> On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>>>
>>>> I wonder what's the cause of the penalty (when accessing the vmapped
>>>> area, I suppose?). Is it a higher risk of cache collision misses within
>>>> the area, compared to consecutive physical addresses?
>>>
>>> I believe tests were done with 48 fq qdiscs, each having 2^16 slots.
>>> So I had 48 blocks of 524288 bytes.
>>>
>>> Trying a bit harder at setup time to get 128 consecutive pages got
>>> less TLB pressure.
>>
>> Hmm, that's rather surprising to me. The TLB caches page table lookups,
>> and the PFNs of the physical pages it translates to shouldn't matter -
>> the page tables will look the same. With 128 consecutive pages, the
>> reduced cache collision miss effect could manifest, though.
>>
> 
> To be clear, the difference came from:
>
> Using kmalloc() to allocate 48 x 524288 bytes
>
> Or using vmalloc()
>
> Are you telling me HugePages are not in play there?

Oh, that's certainly a difference, as kmalloc() will give you the kernel
mapping, which can use 1GB hugepages. But if you just combine these
kmalloc chunks into a vmalloc mapping (IIUC that's what your RFC was
doing?), you lose that benefit AFAIK. On the other hand, I recall reading
that AMD Zen will have PTE Coalescing [1] which, if true and I
understand it correctly, would indeed result in better TLB usage with
adjacent page table entries pointing to consecutive pages. But perhaps
the starting PTE's position will also have to be aligned to make this
work, dunno.

[1]
http://www.anandtech.com/show/10591/amd-zen-microarchiture-part-2-extracting-instructionlevel-parallelism/6


* Re: [PATCH] net: use kvmalloc rather than open coded variant
  2017-01-06 16:19     ` Michal Hocko
@ 2017-01-07  3:33       ` kbuild test robot
  2017-01-07  9:19         ` Michal Hocko
  2017-01-07  3:35       ` kbuild test robot
  2017-01-09 10:22       ` __GFP_REPEAT usage in fq_alloc_node Michal Hocko
  2 siblings, 1 reply; 19+ messages in thread
From: kbuild test robot @ 2017-01-07  3:33 UTC (permalink / raw)
  To: Michal Hocko; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

[-- Attachment #1: Type: text/plain, Size: 2083 bytes --]

Hi Michal,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.10-rc2 next-20170106]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105
config: x86_64-randconfig-i0-201701 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

Note: the linux-review/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105 HEAD 29df6a817f53555953b47c6f8d09397f9f7b598c builds fine.
      It only hurts bisectability.

All error/warnings (new ones prefixed by >>):

   net/core/dev.c: In function 'netif_alloc_rx_queues':
>> net/core/dev.c:7114:2: error: implicit declaration of function 'kvzalloc' [-Werror=implicit-function-declaration]
     rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
     ^
>> net/core/dev.c:7114:5: warning: assignment makes pointer from integer without a cast
     rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
        ^
   net/core/dev.c: In function 'netif_alloc_netdev_queues':
   net/core/dev.c:7154:5: warning: assignment makes pointer from integer without a cast
     tx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
        ^
   net/core/dev.c: In function 'alloc_netdev_mqs':
   net/core/dev.c:7690:4: warning: assignment makes pointer from integer without a cast
     p = kvzalloc(alloc_size, GFP_KERNEL | __GFP_REPEAT);
       ^
   cc1: some warnings being treated as errors

vim +/kvzalloc +7114 net/core/dev.c

  7108		unsigned int i, count = dev->num_rx_queues;
  7109		struct netdev_rx_queue *rx;
  7110		size_t sz = count * sizeof(*rx);
  7111	
  7112		BUG_ON(count < 1);
  7113	
> 7114		rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
  7115		if (!rx)
  7116			return -ENOMEM;
  7117	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24916 bytes --]


* Re: [PATCH] net: use kvmalloc rather than open coded variant
  2017-01-06 16:19     ` Michal Hocko
  2017-01-07  3:33       ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
@ 2017-01-07  3:35       ` kbuild test robot
  2017-01-09 10:22       ` __GFP_REPEAT usage in fq_alloc_node Michal Hocko
  2 siblings, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2017-01-07  3:35 UTC (permalink / raw)
  To: Michal Hocko; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

[-- Attachment #1: Type: text/plain, Size: 1693 bytes --]

Hi Michal,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.10-rc2 next-20170106]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105
config: i386-randconfig-n0-201701 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All error/warnings (new ones prefixed by >>):

   net/sched/sch_fq.c: In function 'fq_resize':
>> net/sched/sch_fq.c:643:10: error: implicit declaration of function 'kvmalloc_node' [-Werror=implicit-function-declaration]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
             ^~~~~~~~~~~~~
>> net/sched/sch_fq.c:643:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
           ^
   cc1: some warnings being treated as errors

vim +/kvmalloc_node +643 net/sched/sch_fq.c

   637		u32 idx;
   638	
   639		if (q->fq_root && log == q->fq_trees_log)
   640			return 0;
   641	
   642		/* If XPS was setup, we can allocate memory on right NUMA node */
 > 643		array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
   644				      netdev_queue_numa_node_read(sch->dev_queue));
   645		if (!array)
   646			return -ENOMEM;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25032 bytes --]


* Re: [PATCH] net: use kvmalloc rather than open coded variant
  2017-01-07  3:33       ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
@ 2017-01-07  9:19         ` Michal Hocko
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-07  9:19 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

On Sat 07-01-17 11:33:15, kbuild test robot wrote:
> Hi Michal,
> 
> [auto build test ERROR on net-next/master]
> [also build test ERROR on v4.10-rc2 next-20170106]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105
> config: x86_64-randconfig-i0-201701 (attached as .config)
> compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64 
> 
> Note: the linux-review/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105 HEAD 29df6a817f53555953b47c6f8d09397f9f7b598c builds fine.
>       It only hurts bisectability.

This patch depends on
http://lkml.kernel.org/r/20170102133700.1734-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:19     ` Michal Hocko
  2017-01-07  3:33       ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
  2017-01-07  3:35       ` kbuild test robot
@ 2017-01-09 10:22       ` Michal Hocko
  2017-01-09 16:00         ` Eric Dumazet
  2 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2017-01-09 10:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Fri 06-01-17 17:19:44, Michal Hocko wrote:
[...]
> From 8eadf8774daecdd6c4de37339216282a16920458 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Fri, 6 Jan 2017 17:03:31 +0100
> Subject: [PATCH] net: use kvmalloc rather than open coded variant
> 
> fq_alloc_node, alloc_netdev_mqs and netif_alloc* open code kmalloc
> with vmalloc fallback. Use the kvmalloc variant instead. Keep the
> __GFP_REPEAT flag based on explanation from
> Eric:
> "
> At the time, tests on the hardware I had in my labs showed that
> vmalloc() could deliver pages spread all over the memory and that was a
> small penalty (once memory is fragmented enough, not at boot time)
> "

The changelog doesn't mention it, but this, unlike other kvmalloc
conversions, is not without functional changes. The kmalloc part will
be weaker than it is with the original code for !costly (<64kB)
requests, because we are enforcing __GFP_NORETRY to break out from the
page allocator, which doesn't really fail such small requests.

Now the question is what those code paths really prefer. Do they really
want to potentially loop in the page allocator and invoke the OOM killer
when memory is short/fragmented? I mean, we can get into a situation
where no order-3 pages can be compacted, and shooting the system down
just for that reason sounds quite dangerous to me.

So the main question is: how hard should we try before falling back to
vmalloc here?
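
To spell out the difference (a sketch of the two behaviors described
above, not new code):

        /*
         * Original fq_alloc_node():
         *   kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node)
         *   For !costly orders (<= 3, i.e. <= 32kB with 4kB pages) the page
         *   allocator keeps retrying and may invoke the OOM killer before
         *   the vmalloc fallback is ever reached.
         *
         * Converted kvmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT, node):
         *   The kmalloc attempt has __GFP_NORETRY enforced, so it fails fast
         *   when memory is fragmented and we fall back to vmalloc instead.
         */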
 
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  net/core/dev.c     | 24 +++++++++---------------
>  net/sched/sch_fq.c | 12 +-----------
>  2 files changed, 10 insertions(+), 26 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 56818f7eab2b..5cf2762387aa 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7111,12 +7111,10 @@ static int netif_alloc_rx_queues(struct net_device *dev)
>  
>  	BUG_ON(count < 1);
>  
> -	rx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
> -	if (!rx) {
> -		rx = vzalloc(sz);
> -		if (!rx)
> -			return -ENOMEM;
> -	}
> +	rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
> +	if (!rx)
> +		return -ENOMEM;
> +
>  	dev->_rx = rx;
>  
>  	for (i = 0; i < count; i++)
> @@ -7153,12 +7151,10 @@ static int netif_alloc_netdev_queues(struct net_device *dev)
>  	if (count < 1 || count > 0xffff)
>  		return -EINVAL;
>  
> -	tx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
> -	if (!tx) {
> -		tx = vzalloc(sz);
> -		if (!tx)
> -			return -ENOMEM;
> -	}
> +	tx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
> +	if (!tx)
> +		return -ENOMEM;
> +
>  	dev->_tx = tx;
>  
>  	netdev_for_each_tx_queue(dev, netdev_init_one_queue, NULL);
> @@ -7691,9 +7687,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>  	/* ensure 32-byte alignment of whole construct */
>  	alloc_size += NETDEV_ALIGN - 1;
>  
> -	p = kzalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
> -	if (!p)
> -		p = vzalloc(alloc_size);
> +	p = kvzalloc(alloc_size, GFP_KERNEL | __GFP_REPEAT);
>  	if (!p)
>  		return NULL;
>  
> diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
> index a4f738ac7728..594f77d89f6c 100644
> --- a/net/sched/sch_fq.c
> +++ b/net/sched/sch_fq.c
> @@ -624,16 +624,6 @@ static void fq_rehash(struct fq_sched_data *q,
>  	q->stat_gc_flows += fcnt;
>  }
>  
> -static void *fq_alloc_node(size_t sz, int node)
> -{
> -	void *ptr;
> -
> -	ptr = kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node);
> -	if (!ptr)
> -		ptr = vmalloc_node(sz, node);
> -	return ptr;
> -}
> -
>  static void fq_free(void *addr)
>  {
>  	kvfree(addr);
> @@ -650,7 +640,7 @@ static int fq_resize(struct Qdisc *sch, u32 log)
>  		return 0;
>  
>  	/* If XPS was setup, we can allocate memory on right NUMA node */
> -	array = fq_alloc_node(sizeof(struct rb_root) << log,
> +	array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
>  			      netdev_queue_numa_node_read(sch->dev_queue));
>  	if (!array)
>  		return -ENOMEM;
> -- 
> 2.11.0
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-09 10:22       ` __GFP_REPEAT usage in fq_alloc_node Michal Hocko
@ 2017-01-09 16:00         ` Eric Dumazet
  2017-01-09 17:45           ` Michal Hocko
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2017-01-09 16:00 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Mon, Jan 9, 2017 at 2:22 AM, Michal Hocko <mhocko@kernel.org> wrote:
>
> The changelog doesn't mention it, but this, unlike other kvmalloc
> conversions, is not without functional changes. The kmalloc part will
> be weaker than it is with the original code for !costly (<64kB)
> requests, because we are enforcing __GFP_NORETRY to break out from the
> page allocator, which doesn't really fail such small requests.
>
> Now the question is what those code paths really prefer. Do they really
> want to potentially loop in the page allocator and invoke the OOM killer
> when memory is short/fragmented? I mean, we can get into a situation
> where no order-3 pages can be compacted, and shooting the system down
> just for that reason sounds quite dangerous to me.
>
> So the main question is: how hard should we try before falling back to
> vmalloc here?

This patch is fine:

1) The default hash size is 1024 slots, 8192 bytes on 64-bit arches.
2) Most of the time, qdiscs are set up at boot time.


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-09 16:00         ` Eric Dumazet
@ 2017-01-09 17:45           ` Michal Hocko
  2017-01-09 17:53             ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2017-01-09 17:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Mon 09-01-17 08:00:16, Eric Dumazet wrote:
> On Mon, Jan 9, 2017 at 2:22 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >
> > The changelog doesn't mention it, but this, unlike other kvmalloc
> > conversions, is not without functional changes. The kmalloc part will
> > be weaker than it is with the original code for !costly (<64kB)
> > requests, because we are enforcing __GFP_NORETRY to break out from the
> > page allocator, which doesn't really fail such small requests.
> >
> > Now the question is what those code paths really prefer. Do they really
> > want to potentially loop in the page allocator and invoke the OOM killer
> > when memory is short/fragmented? I mean, we can get into a situation
> > where no order-3 pages can be compacted, and shooting the system down
> > just for that reason sounds quite dangerous to me.
> >
> > So the main question is: how hard should we try before falling back to
> > vmalloc here?
> 
> This patch is fine:
>
> 1) The default hash size is 1024 slots, 8192 bytes on 64-bit arches.

What about those non-default configurations? Do they really want to
invoke the OOM killer rather than fall back to vmalloc?
-- 
Michal Hocko
SUSE Labs


* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-09 17:45           ` Michal Hocko
@ 2017-01-09 17:53             ` Eric Dumazet
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-09 17:53 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Mon, Jan 9, 2017 at 9:45 AM, Michal Hocko <mhocko@kernel.org> wrote:

> What about those non-default configurations? Do they really want to
> invoke the OOM killer rather than fall back to vmalloc?

In our case, we use 4096 slots per fq, so that is a 32KB memory
allocation. And these allocations happen right after boot, while we
have plenty of non-fragmented memory.

Presumably falling back to vmalloc() would be just fine.
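
For reference, the arithmetic (assuming 4kB pages and sizeof(struct
rb_root) == 8, i.e. one pointer on 64-bit):

        /*
         * fq hash array size = sizeof(struct rb_root) << log:
         *   log = 10 (default, 1024 slots) ->   8192 bytes (order-1)
         *   log = 12 (4096 slots)          ->  32768 bytes (order-3, !costly)
         *   log = 16 (65536 slots)         -> 524288 bytes (order-7, costly)
         */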


* Re: [PATCH] net_sched: use kvmalloc rather than opencoded variant
  2017-01-06 16:07   ` Michal Hocko
  2017-01-06 16:19     ` Michal Hocko
@ 2017-01-14 23:43     ` kbuild test robot
  2017-01-16  8:54       ` Michal Hocko
  1 sibling, 1 reply; 19+ messages in thread
From: kbuild test robot @ 2017-01-14 23:43 UTC (permalink / raw)
  To: Michal Hocko; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

Hi Michal,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.10-rc3 next-20170113]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net_sched-use-kvmalloc-rather-than-opencoded-variant/20170107-120926
config: i386-randconfig-h1-01150559 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   net/sched/sch_fq.c: In function 'fq_resize':
>> net/sched/sch_fq.c:643:10: error: implicit declaration of function 'kvmalloc_node' [-Werror=implicit-function-declaration]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
             ^~~~~~~~~~~~~
   net/sched/sch_fq.c:643:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
           ^
   cc1: some warnings being treated as errors

vim +/kvmalloc_node +643 net/sched/sch_fq.c

   637		u32 idx;
   638	
   639		if (q->fq_root && log == q->fq_trees_log)
   640			return 0;
   641	
   642		/* If XPS was setup, we can allocate memory on right NUMA node */
 > 643		array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
   644				      netdev_queue_numa_node_read(sch->dev_queue));
   645		if (!array)
   646			return -ENOMEM;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30475 bytes --]


* Re: [PATCH] net_sched: use kvmalloc rather than opencoded variant
  2017-01-14 23:43     ` [PATCH] net_sched: use kvmalloc rather than opencoded variant kbuild test robot
@ 2017-01-16  8:54       ` Michal Hocko
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-16  8:54 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

On Sun 15-01-17 07:43:01, kbuild test robot wrote:
> Hi Michal,
> 
> [auto build test ERROR on net-next/master]
> [also build test ERROR on v4.10-rc3 next-20170113]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net_sched-use-kvmalloc-rather-than-opencoded-variant/20170107-120926

This depends on the kvmalloc patch, but I have rebased it on top of the
others in
http://lkml.kernel.org/r/20170112153717.28943-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs

