* __GFP_REPEAT usage in fq_alloc_node
@ 2017-01-06 15:20 Michal Hocko
  2017-01-06 15:39 ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2017-01-06 15:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

Hi Eric,
I am currently checking kmalloc-with-vmalloc-fallback users and
converting them to a new kvmalloc helper [1]. While adding support for
__GFP_REPEAT to kvmalloc [2] I was wondering what the reason is to use
__GFP_REPEAT in fq_alloc_node in the first place. c3bd85495aef
("pkt_sched: fq: more robust memory allocation") doesn't mention
anything. Could you clarify this please?

Thanks!

[1] http://lkml.kernel.org/r/20170102133700.1734-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20170104181229.GB10183@dhcp22.suse.cz
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread
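[Editorial note: for readers unfamiliar with the idiom under discussion, here is a
small userspace sketch of the kmalloc-with-vmalloc-fallback pattern that the
kvmalloc helper folds into one call. The flag constants, the `fake_*` allocators
and the failure hook are inventions of this sketch (plain malloc() stands in for
both kernel allocators); only the control flow mirrors the real fq_alloc_node().]

```c
#include <stdlib.h>

/* Illustrative stand-ins; these values are NOT the kernel's gfp bits. */
#define GFP_KERNEL   0x1u
#define __GFP_NOWARN 0x2u
#define __GFP_REPEAT 0x4u

/* Test hook so the fallback path can be exercised in userspace. */
static int force_kmalloc_failure;

static void *fake_kmalloc_node(size_t sz, unsigned int gfp, int node)
{
	(void)gfp; (void)node;
	return force_kmalloc_failure ? NULL : malloc(sz);
}

static void *fake_vmalloc_node(size_t sz, int node)
{
	(void)node;
	return malloc(sz);
}

/*
 * Shape of the original fq_alloc_node(): try a physically contiguous
 * allocation first, and fall back to a virtually contiguous one if it
 * fails.  kvmalloc_node() implements exactly this two-step dance.
 */
static void *fq_alloc_node(size_t sz, int node)
{
	void *ptr;

	ptr = fake_kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node);
	if (!ptr)
		ptr = fake_vmalloc_node(sz, node);
	return ptr;
}
```

Either path returns a usable buffer; the caller frees it with a single helper
(kvfree() in the kernel) that dispatches on the address type.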
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 15:20 __GFP_REPEAT usage in fq_alloc_node Michal Hocko
@ 2017-01-06 15:39 ` Eric Dumazet
  2017-01-06 16:07   ` Michal Hocko
  2017-01-06 16:31   ` __GFP_REPEAT usage in fq_alloc_node Vlastimil Babka
  0 siblings, 2 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-06 15:39 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
>
> Hi Eric,
> I am currently checking kmalloc with vmalloc fallback users and convert
> them to a new kvmalloc helper [1]. While I am adding a support for
> __GFP_REPEAT to kvmalloc [2] I was wondering what is the reason to use
> __GFP_REPEAT in fq_alloc_node in the first place. c3bd85495aef
> ("pkt_sched: fq: more robust memory allocation") doesn't mention
> anything. Could you clarify this please?
>
> Thanks!

I guess this question applies to all __GFP_REPEAT usages in net/ ?

At the time, tests on the hardware I had in my labs showed that
vmalloc() could deliver pages spread all over the memory and that was a
small penalty (once memory is fragmented enough, not at boot time).

I guess this won't be a concern anymore if I can finish my pending work
on vmalloc() trying to get adjacent pages:
https://lkml.org/lkml/2016/12/21/285

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 15:39 ` Eric Dumazet
@ 2017-01-06 16:07   ` Michal Hocko
  2017-01-06 16:19     ` Michal Hocko
  2017-01-14 23:43     ` [PATCH] net_sched: use kvmalloc rather than opencoded variant kbuild test robot
  2017-01-06 16:31   ` __GFP_REPEAT usage in fq_alloc_node Vlastimil Babka
  1 sibling, 2 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-06 16:07 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Fri 06-01-17 07:39:14, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >
> > Hi Eric,
> > I am currently checking kmalloc with vmalloc fallback users and convert
> > them to a new kvmalloc helper [1]. While I am adding a support for
> > __GFP_REPEAT to kvmalloc [2] I was wondering what is the reason to use
> > __GFP_REPEAT in fq_alloc_node in the first place. c3bd85495aef
> > ("pkt_sched: fq: more robust memory allocation") doesn't mention
> > anything. Could you clarify this please?
> >
> > Thanks!
>
> I guess this question applies to all __GFP_REPEAT usages in net/ ?

I am _currently_ interested only in those which have a vmalloc fallback
and cannot see more of them. Maybe my git grep foo needs some help.

> At the time, tests on the hardware I had in my labs showed that
> vmalloc() could deliver pages spread
> all over the memory and that was a small penalty (once memory is
> fragmented enough, not at boot time)

I see. Then I will go with kvmalloc with __GFP_REPEAT and we can drop
the flag later once it is not needed anymore. See the patch below.
Thanks for the clarification.

> I guess this wont be anymore a concern if I can finish my pending work
> about vmalloc() trying to get adjacent pages
> https://lkml.org/lkml/2016/12/21/285

I see.

Thanks!
---
From 1f3769de85c18aa0796f215cffdb01a2e70d2d2f Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Fri, 6 Jan 2017 17:03:31 +0100
Subject: [PATCH] net_sched: use kvmalloc rather than opencoded variant

fq_alloc_node opencodes kmalloc with vmalloc fallback. Use the kvmalloc
variant instead. Keep the __GFP_REPEAT flag based on explanation from
Eric:
"
At the time, tests on the hardware I had in my labs showed that
vmalloc() could deliver pages spread all over the memory and that was a
small penalty (once memory is fragmented enough, not at boot time)
"

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 net/sched/sch_fq.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index a4f738ac7728..594f77d89f6c 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -624,16 +624,6 @@ static void fq_rehash(struct fq_sched_data *q,
 	q->stat_gc_flows += fcnt;
 }
 
-static void *fq_alloc_node(size_t sz, int node)
-{
-	void *ptr;
-
-	ptr = kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node);
-	if (!ptr)
-		ptr = vmalloc_node(sz, node);
-	return ptr;
-}
-
 static void fq_free(void *addr)
 {
 	kvfree(addr);
@@ -650,7 +640,7 @@ static int fq_resize(struct Qdisc *sch, u32 log)
 		return 0;
 
 	/* If XPS was setup, we can allocate memory on right NUMA node */
-	array = fq_alloc_node(sizeof(struct rb_root) << log,
+	array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
 			      netdev_queue_numa_node_read(sch->dev_queue));
 	if (!array)
 		return -ENOMEM;
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:07 ` Michal Hocko
@ 2017-01-06 16:19   ` Michal Hocko
  2017-01-07  3:33     ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
                       ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-06 16:19 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Fri 06-01-17 17:07:43, Michal Hocko wrote:
> On Fri 06-01-17 07:39:14, Eric Dumazet wrote:
> > On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > Hi Eric,
> > > I am currently checking kmalloc with vmalloc fallback users and convert
> > > them to a new kvmalloc helper [1]. While I am adding a support for
> > > __GFP_REPEAT to kvmalloc [2] I was wondering what is the reason to use
> > > __GFP_REPEAT in fq_alloc_node in the first place. c3bd85495aef
> > > ("pkt_sched: fq: more robust memory allocation") doesn't mention
> > > anything. Could you clarify this please?
> > >
> > > Thanks!
> >
> > I guess this question applies to all __GFP_REPEAT usages in net/ ?
>
> I am _currently_ interested only in those which have vmalloc fallback
> and cannot see more of them. Maybe my git grep foo needs some help.

I was obviously blind...
---
From 8eadf8774daecdd6c4de37339216282a16920458 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Fri, 6 Jan 2017 17:03:31 +0100
Subject: [PATCH] net: use kvmalloc rather than open coded variant

fq_alloc_node, alloc_netdev_mqs and netif_alloc* open code kmalloc
with vmalloc fallback. Use the kvmalloc variant instead. Keep the
__GFP_REPEAT flag based on explanation from Eric:
"
At the time, tests on the hardware I had in my labs showed that
vmalloc() could deliver pages spread all over the memory and that was a
small penalty (once memory is fragmented enough, not at boot time)
"

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 net/core/dev.c     | 24 +++++++++---------------
 net/sched/sch_fq.c | 12 +-----------
 2 files changed, 10 insertions(+), 26 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 56818f7eab2b..5cf2762387aa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7111,12 +7111,10 @@ static int netif_alloc_rx_queues(struct net_device *dev)
 
 	BUG_ON(count < 1);
 
-	rx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
-	if (!rx) {
-		rx = vzalloc(sz);
-		if (!rx)
-			return -ENOMEM;
-	}
+	rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
+	if (!rx)
+		return -ENOMEM;
+
 	dev->_rx = rx;
 
 	for (i = 0; i < count; i++)
@@ -7153,12 +7151,10 @@ static int netif_alloc_netdev_queues(struct net_device *dev)
 	if (count < 1 || count > 0xffff)
 		return -EINVAL;
 
-	tx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
-	if (!tx) {
-		tx = vzalloc(sz);
-		if (!tx)
-			return -ENOMEM;
-	}
+	tx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
+	if (!tx)
+		return -ENOMEM;
+
 	dev->_tx = tx;
 
 	netdev_for_each_tx_queue(dev, netdev_init_one_queue, NULL);
@@ -7691,9 +7687,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	/* ensure 32-byte alignment of whole construct */
 	alloc_size += NETDEV_ALIGN - 1;
 
-	p = kzalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
-	if (!p)
-		p = vzalloc(alloc_size);
+	p = kvzalloc(alloc_size, GFP_KERNEL | __GFP_REPEAT);
 	if (!p)
 		return NULL;
 
diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index a4f738ac7728..594f77d89f6c 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -624,16 +624,6 @@ static void fq_rehash(struct fq_sched_data *q,
 	q->stat_gc_flows += fcnt;
 }
 
-static void *fq_alloc_node(size_t sz, int node)
-{
-	void *ptr;
-
-	ptr = kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node);
-	if (!ptr)
-		ptr = vmalloc_node(sz, node);
-	return ptr;
-}
-
 static void fq_free(void *addr)
 {
 	kvfree(addr);
@@ -650,7 +640,7 @@ static int fq_resize(struct Qdisc *sch, u32 log)
 		return 0;
 
 	/* If XPS was setup, we can allocate memory on right NUMA node */
-	array = fq_alloc_node(sizeof(struct rb_root) << log,
+	array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
 			      netdev_queue_numa_node_read(sch->dev_queue));
 	if (!array)
 		return -ENOMEM;
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 19+ messages in thread
* Re: [PATCH] net: use kvmalloc rather than open coded variant
  2017-01-06 16:19 ` Michal Hocko
@ 2017-01-07  3:33   ` kbuild test robot
  2017-01-07  9:19     ` Michal Hocko
  2017-01-07  3:35   ` kbuild test robot
  2017-01-09 10:22   ` __GFP_REPEAT usage in fq_alloc_node Michal Hocko
  2 siblings, 1 reply; 19+ messages in thread
From: kbuild test robot @ 2017-01-07 3:33 UTC (permalink / raw)
  To: Michal Hocko; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

[-- Attachment #1: Type: text/plain, Size: 2083 bytes --]

Hi Michal,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.10-rc2 next-20170106]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105
config: x86_64-randconfig-i0-201701 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

Note: the linux-review/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105 HEAD 29df6a817f53555953b47c6f8d09397f9f7b598c builds fine.
      It only hurts bisectibility.

All error/warnings (new ones prefixed by >>):

   net/core/dev.c: In function 'netif_alloc_rx_queues':
>> net/core/dev.c:7114:2: error: implicit declaration of function 'kvzalloc' [-Werror=implicit-function-declaration]
     rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
     ^
>> net/core/dev.c:7114:5: warning: assignment makes pointer from integer without a cast
     rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
        ^
   net/core/dev.c: In function 'netif_alloc_netdev_queues':
   net/core/dev.c:7154:5: warning: assignment makes pointer from integer without a cast
     tx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
        ^
   net/core/dev.c: In function 'alloc_netdev_mqs':
   net/core/dev.c:7690:4: warning: assignment makes pointer from integer without a cast
     p = kvzalloc(alloc_size, GFP_KERNEL | __GFP_REPEAT);
       ^
   cc1: some warnings being treated as errors

vim +/kvzalloc +7114 net/core/dev.c

  7108		unsigned int i, count = dev->num_rx_queues;
  7109		struct netdev_rx_queue *rx;
  7110		size_t sz = count * sizeof(*rx);
  7111	
  7112		BUG_ON(count < 1);
  7113	
> 7114		rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
  7115		if (!rx)
  7116			return -ENOMEM;
  7117	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24916 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH] net: use kvmalloc rather than open coded variant
  2017-01-07  3:33 ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
@ 2017-01-07  9:19   ` Michal Hocko
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-07 9:19 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

On Sat 07-01-17 11:33:15, kbuild test robot wrote:
> Hi Michal,
>
> [auto build test ERROR on net-next/master]
> [also build test ERROR on v4.10-rc2 next-20170106]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105
> config: x86_64-randconfig-i0-201701 (attached as .config)
> compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64
>
> Note: the linux-review/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105 HEAD 29df6a817f53555953b47c6f8d09397f9f7b598c builds fine.
>       It only hurts bisectibility.

This patch depends on
http://lkml.kernel.org/r/20170102133700.1734-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH] net: use kvmalloc rather than open coded variant
  2017-01-06 16:19 ` Michal Hocko
  2017-01-07  3:33   ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
@ 2017-01-07  3:35   ` kbuild test robot
  2017-01-09 10:22   ` __GFP_REPEAT usage in fq_alloc_node Michal Hocko
  2 siblings, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2017-01-07 3:35 UTC (permalink / raw)
  To: Michal Hocko; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

[-- Attachment #1: Type: text/plain, Size: 1693 bytes --]

Hi Michal,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.10-rc2 next-20170106]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net-use-kvmalloc-rather-than-open-coded-variant/20170107-104105
config: i386-randconfig-n0-201701 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All error/warnings (new ones prefixed by >>):

   net/sched/sch_fq.c: In function 'fq_resize':
>> net/sched/sch_fq.c:643:10: error: implicit declaration of function 'kvmalloc_node' [-Werror=implicit-function-declaration]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
             ^~~~~~~~~~~~~
>> net/sched/sch_fq.c:643:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
           ^
   cc1: some warnings being treated as errors

vim +/kvmalloc_node +643 net/sched/sch_fq.c

   637		u32 idx;
   638	
   639		if (q->fq_root && log == q->fq_trees_log)
   640			return 0;
   641	
   642		/* If XPS was setup, we can allocate memory on right NUMA node */
 > 643		array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
   644				      netdev_queue_numa_node_read(sch->dev_queue));
   645		if (!array)
   646			return -ENOMEM;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25032 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:19 ` Michal Hocko
  2017-01-07  3:33   ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
  2017-01-07  3:35   ` kbuild test robot
@ 2017-01-09 10:22   ` Michal Hocko
  2017-01-09 16:00     ` Eric Dumazet
  2 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2017-01-09 10:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Fri 06-01-17 17:19:44, Michal Hocko wrote:
[...]
> From 8eadf8774daecdd6c4de37339216282a16920458 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Fri, 6 Jan 2017 17:03:31 +0100
> Subject: [PATCH] net: use kvmalloc rather than open coded variant
>
> fq_alloc_node, alloc_netdev_mqs and netif_alloc* open code kmalloc
> with vmalloc fallback. Use the kvmalloc variant instead. Keep the
> __GFP_REPEAT flag based on explanation from
> Eric:
> "
> At the time, tests on the hardware I had in my labs showed that
> vmalloc() could deliver pages spread all over the memory and that was a
> small penalty (once memory is fragmented enough, not at boot time)
> "

The changelog doesn't mention it, but this, unlike other kvmalloc
conversions, is not without functional changes. The kmalloc part will
be weaker than it is with the original code for !costly (<64kB)
requests, because we are enforcing __GFP_NORETRY to break out from the
page allocator, which doesn't really fail such small requests.

Now the question is what those code paths really prefer. Do they really
want to potentially loop in the page allocator and invoke the OOM killer
when the memory is short/fragmented? I mean we can get into a situation
when no order-3 pages can be compacted and shooting the system down just
for that reason sounds quite dangerous to me.

So the main question is how hard we should try before falling back to
vmalloc here?

> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  net/core/dev.c     | 24 +++++++++---------------
>  net/sched/sch_fq.c | 12 +-----------
>  2 files changed, 10 insertions(+), 26 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 56818f7eab2b..5cf2762387aa 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7111,12 +7111,10 @@ static int netif_alloc_rx_queues(struct net_device *dev)
>
>  	BUG_ON(count < 1);
>
> -	rx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
> -	if (!rx) {
> -		rx = vzalloc(sz);
> -		if (!rx)
> -			return -ENOMEM;
> -	}
> +	rx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
> +	if (!rx)
> +		return -ENOMEM;
> +
>  	dev->_rx = rx;
>
>  	for (i = 0; i < count; i++)
> @@ -7153,12 +7151,10 @@ static int netif_alloc_netdev_queues(struct net_device *dev)
>  	if (count < 1 || count > 0xffff)
>  		return -EINVAL;
>
> -	tx = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
> -	if (!tx) {
> -		tx = vzalloc(sz);
> -		if (!tx)
> -			return -ENOMEM;
> -	}
> +	tx = kvzalloc(sz, GFP_KERNEL | __GFP_REPEAT);
> +	if (!tx)
> +		return -ENOMEM;
> +
>  	dev->_tx = tx;
>
>  	netdev_for_each_tx_queue(dev, netdev_init_one_queue, NULL);
> @@ -7691,9 +7687,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>  	/* ensure 32-byte alignment of whole construct */
>  	alloc_size += NETDEV_ALIGN - 1;
>
> -	p = kzalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
> -	if (!p)
> -		p = vzalloc(alloc_size);
> +	p = kvzalloc(alloc_size, GFP_KERNEL | __GFP_REPEAT);
>  	if (!p)
>  		return NULL;
>
> diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
> index a4f738ac7728..594f77d89f6c 100644
> --- a/net/sched/sch_fq.c
> +++ b/net/sched/sch_fq.c
> @@ -624,16 +624,6 @@ static void fq_rehash(struct fq_sched_data *q,
>  	q->stat_gc_flows += fcnt;
>  }
>
> -static void *fq_alloc_node(size_t sz, int node)
> -{
> -	void *ptr;
> -
> -	ptr = kmalloc_node(sz, GFP_KERNEL | __GFP_REPEAT | __GFP_NOWARN, node);
> -	if (!ptr)
> -		ptr = vmalloc_node(sz, node);
> -	return ptr;
> -}
> -
>  static void fq_free(void *addr)
>  {
>  	kvfree(addr);
> @@ -650,7 +640,7 @@ static int fq_resize(struct Qdisc *sch, u32 log)
>  		return 0;
>
>  	/* If XPS was setup, we can allocate memory on right NUMA node */
> -	array = fq_alloc_node(sizeof(struct rb_root) << log,
> +	array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
>  			      netdev_queue_numa_node_read(sch->dev_queue));
>  	if (!array)
>  		return -ENOMEM;
> --
> 2.11.0
>
> --
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread
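[Editorial note: the __GFP_NORETRY point Michal raises above can be sketched in
userspace C. A kvmalloc-style helper makes the physically contiguous attempt
opportunistic by adding __GFP_NORETRY and __GFP_NOWARN before calling the page
allocator, so the attempt fails fast instead of looping (and possibly invoking
the OOM killer), and the vmalloc fallback takes over. The flag values and the
exact policy below are illustrative assumptions of this sketch, not the
kernel's gfp encoding.]

```c
/* Illustrative flag bits only; NOT the kernel's gfp encoding. */
#define GFP_KERNEL     0x1u
#define __GFP_NOWARN   0x2u
#define __GFP_NORETRY  0x4u

/*
 * Sketch of the gfp massaging a kvmalloc-style helper applies to the
 * kmalloc attempt: bail out of the page allocator early rather than
 * retrying hard, and suppress the allocation-failure warning, because
 * a vmalloc fallback will be tried next anyway.
 */
static unsigned int kvmalloc_gfp(unsigned int flags)
{
	return flags | __GFP_NORETRY | __GFP_NOWARN;
}
```

The trade-off debated in the thread is exactly this: with __GFP_NORETRY the
small (!costly) kmalloc attempt can now fail and fall back to vmalloc, whereas
the original open-coded callers would have kept the page allocator trying.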
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-09 10:22 ` __GFP_REPEAT usage in fq_alloc_node Michal Hocko
@ 2017-01-09 16:00   ` Eric Dumazet
  2017-01-09 17:45     ` Michal Hocko
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2017-01-09 16:00 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Mon, Jan 9, 2017 at 2:22 AM, Michal Hocko <mhocko@kernel.org> wrote:
>
> the changelog doesn't mention it but this, unlike other kvmalloc
> conversions is not without functional changes. The kmalloc part
> will be weaker than it is with the original code for !costly (<64kB)
> requests, because we are enforcing __GFP_NORETRY to break out from the
> page allocator which doesn't really fail such a small requests.
>
> Now the question is what those code paths really prefer. Do they really
> want to potentially loop in the page allocator and invoke the OOM killer
> when the memory is short/fragmeted? I mean we can get into a situation
> when no order-3 pages can be compacted and shooting the system down just
> for that reason sounds quite dangerous to me.
>
> So the main question is how hard should we try before falling back to
> vmalloc here?

This patch is fine:

1) Default hash size is 1024 slots, 8192 bytes on 64bit arches.

2) Most of the time, qdiscs are set up at boot time.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-09 16:00 ` Eric Dumazet
@ 2017-01-09 17:45   ` Michal Hocko
  2017-01-09 17:53     ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2017-01-09 17:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-mm, LKML

On Mon 09-01-17 08:00:16, Eric Dumazet wrote:
> On Mon, Jan 9, 2017 at 2:22 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >
> > the changelog doesn't mention it but this, unlike other kvmalloc
> > conversions is not without functional changes. The kmalloc part
> > will be weaker than it is with the original code for !costly (<64kB)
> > requests, because we are enforcing __GFP_NORETRY to break out from the
> > page allocator which doesn't really fail such a small requests.
> >
> > Now the question is what those code paths really prefer. Do they really
> > want to potentially loop in the page allocator and invoke the OOM killer
> > when the memory is short/fragmeted? I mean we can get into a situation
> > when no order-3 pages can be compacted and shooting the system down just
> > for that reason sounds quite dangerous to me.
> >
> > So the main question is how hard should we try before falling back to
> > vmalloc here?
>
> This patch is fine :
>
> 1) Default hash size is 1024 slots, 8192 bytes on 64bit arches.

What about those non-default configurations? Do they really want to
invoke the OOM killer rather than fall back to vmalloc?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-09 17:45 ` Michal Hocko
@ 2017-01-09 17:53   ` Eric Dumazet
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-09 17:53 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Mon, Jan 9, 2017 at 9:45 AM, Michal Hocko <mhocko@kernel.org> wrote:
> What about those non-default configurations. Do they really want to
> invoke the OOM killer rather than fallback to the vmalloc?

In our case, we use 4096 slots per fq, so that is a 16KB memory
allocation. And these allocations happen right after boot, while we
have plenty of non-fragmented memory. Presumably falling back to
vmalloc() would be just fine.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH] net_sched: use kvmalloc rather than opencoded variant
  2017-01-06 16:07 ` Michal Hocko
  2017-01-06 16:19   ` Michal Hocko
@ 2017-01-14 23:43   ` kbuild test robot
  2017-01-16  8:54     ` Michal Hocko
  1 sibling, 1 reply; 19+ messages in thread
From: kbuild test robot @ 2017-01-14 23:43 UTC (permalink / raw)
  To: Michal Hocko; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

Hi Michal,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.10-rc3 next-20170113]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net_sched-use-kvmalloc-rather-than-opencoded-variant/20170107-120926
config: i386-randconfig-h1-01150559 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   net/sched/sch_fq.c: In function 'fq_resize':
>> net/sched/sch_fq.c:643:10: error: implicit declaration of function 'kvmalloc_node' [-Werror=implicit-function-declaration]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
             ^~~~~~~~~~~~~
   net/sched/sch_fq.c:643:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
           ^
   cc1: some warnings being treated as errors

vim +/kvmalloc_node +643 net/sched/sch_fq.c

   637		u32 idx;
   638	
   639		if (q->fq_root && log == q->fq_trees_log)
   640			return 0;
   641	
   642		/* If XPS was setup, we can allocate memory on right NUMA node */
 > 643		array = kvmalloc_node(sizeof(struct rb_root) << log, GFP_KERNEL | __GFP_REPEAT,
   644				      netdev_queue_numa_node_read(sch->dev_queue));
   645		if (!array)
   646			return -ENOMEM;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30475 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [PATCH] net_sched: use kvmalloc rather than opencoded variant
  2017-01-14 23:43 ` [PATCH] net_sched: use kvmalloc rather than opencoded variant kbuild test robot
@ 2017-01-16  8:54   ` Michal Hocko
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2017-01-16 8:54 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, Eric Dumazet, linux-mm, LKML

On Sun 15-01-17 07:43:01, kbuild test robot wrote:
> Hi Michal,
>
> [auto build test ERROR on net-next/master]
> [also build test ERROR on v4.10-rc3 next-20170113]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/net_sched-use-kvmalloc-rather-than-opencoded-variant/20170107-120926

This depends on the kvmalloc patch. But I have rebased this patch on
top of the others in
http://lkml.kernel.org/r/20170112153717.28943-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 15:39 ` Eric Dumazet
  2017-01-06 16:07   ` Michal Hocko
@ 2017-01-06 16:31   ` Vlastimil Babka
  2017-01-06 16:48     ` Eric Dumazet
  1 sibling, 1 reply; 19+ messages in thread
From: Vlastimil Babka @ 2017-01-06 16:31 UTC (permalink / raw)
  To: Eric Dumazet, Michal Hocko; +Cc: linux-mm, LKML

On 01/06/2017 04:39 PM, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 7:20 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>
>> Hi Eric,
>> I am currently checking kmalloc with vmalloc fallback users and convert
>> them to a new kvmalloc helper [1]. While I am adding a support for
>> __GFP_REPEAT to kvmalloc [2] I was wondering what is the reason to use
>> __GFP_REPEAT in fq_alloc_node in the first place. c3bd85495aef
>> ("pkt_sched: fq: more robust memory allocation") doesn't mention
>> anything. Could you clarify this please?
>>
>> Thanks!
>
> I guess this question applies to all __GFP_REPEAT usages in net/ ?
>
> At the time, tests on the hardware I had in my labs showed that
> vmalloc() could deliver pages spread
> all over the memory and that was a small penalty (once memory is
> fragmented enough, not at boot time)

I wonder what the cause of the penalty is (when accessing the vmapped
area, I suppose)? Is it a higher risk of collision cache misses within
the area, compared to consecutive physical addresses?

> I guess this wont be anymore a concern if I can finish my pending work
> about vmalloc() trying to get adjacent pages
> https://lkml.org/lkml/2016/12/21/285
>
> Thanks.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:31 ` __GFP_REPEAT usage in fq_alloc_node Vlastimil Babka
@ 2017-01-06 16:48   ` Eric Dumazet
  2017-01-06 16:50     ` Eric Dumazet
  2017-01-06 16:55     ` Vlastimil Babka
  0 siblings, 2 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-06 16:48 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Michal Hocko, linux-mm, LKML

On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> I wonder what's that cause of the penalty (when accessing the vmapped
> area I suppose?) Is it higher risk of collisions cache misses within the
> area, compared to consecutive physical adresses?

I believe tests were done with 48 fq qdiscs, each having 2^16 slots.
So I had 48 blocks of 524288 bytes.

Trying a bit harder at setup time to get 128 consecutive pages got
less TLB pressure.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:48 ` Eric Dumazet
@ 2017-01-06 16:50   ` Eric Dumazet
  2017-01-06 16:55   ` Vlastimil Babka
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Dumazet @ 2017-01-06 16:50 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Michal Hocko, linux-mm, LKML

On Fri, Jan 6, 2017 at 8:48 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>
>>
>> I wonder what's that cause of the penalty (when accessing the vmapped
>> area I suppose?) Is it higher risk of collisions cache misses within the
>> area, compared to consecutive physical adresses?
>
> I believe tests were done with 48 fq qdisc, each having 2^16 slots.
> So I had 48 blocs,of 524288 bytes.
>
> Trying a bit harder at setup time to get 128 consecutive pages got
> less TLB pressure.

Forgot to mention tests include DDOS, so hitting a random hash bucket
for every packet.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node
  2017-01-06 16:48 ` Eric Dumazet
  2017-01-06 16:50   ` Eric Dumazet
@ 2017-01-06 16:55   ` Vlastimil Babka
  2017-01-06 17:08     ` Eric Dumazet
  1 sibling, 1 reply; 19+ messages in thread
From: Vlastimil Babka @ 2017-01-06 16:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Michal Hocko, linux-mm, LKML

On 01/06/2017 05:48 PM, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>
>>
>> I wonder what's that cause of the penalty (when accessing the vmapped
>> area I suppose?) Is it higher risk of collisions cache misses within the
>> area, compared to consecutive physical adresses?
>
> I believe tests were done with 48 fq qdisc, each having 2^16 slots.
> So I had 48 blocs,of 524288 bytes.
>
> Trying a bit harder at setup time to get 128 consecutive pages got
> less TLB pressure.

Hmm, that's rather surprising to me. The TLB caches the page table
lookups, and the PFNs of the physical pages they translate to shouldn't
matter; the page tables will look the same. The reduced collision cache
miss effect could manifest with 128 consecutive pages, though.

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: __GFP_REPEAT usage in fq_alloc_node

From: Eric Dumazet @ 2017-01-06 17:08 UTC (permalink / raw)
To: Vlastimil Babka; +Cc: Michal Hocko, linux-mm, LKML

On Fri, Jan 6, 2017 at 8:55 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 01/06/2017 05:48 PM, Eric Dumazet wrote:
>> On Fri, Jan 6, 2017 at 8:31 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>>> I wonder what's the cause of the penalty (when accessing the vmapped
>>> area, I suppose)? Is it a higher risk of collision cache misses within
>>> the area, compared to consecutive physical addresses?
>>
>> I believe tests were done with 48 fq qdiscs, each having 2^16 slots,
>> so I had 48 blocks of 524288 bytes.
>>
>> Trying a bit harder at setup time to get 128 consecutive pages got
>> less TLB pressure.
>
> Hmm, that's rather surprising to me. The TLB caches the page table
> lookups, and the PFNs of the physical pages they translate to shouldn't
> matter - the page tables will look the same. The reduced collision
> cache miss effect could manifest with 128 consecutive pages, though.

To be clear, the difference came from:

Using kmalloc() to allocate 48 x 524288 bytes,

or using vmalloc().

Are you telling me HugePages are not in play there?

^ permalink raw reply [flat|nested] 19+ messages in thread
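[Editor's note: a quick back-of-envelope check of the sizes quoted above. The 8-byte slot size is an assumption (one pointer per hash slot on a 64-bit machine); it is what makes Eric's 2^16 slots come out to 524288 bytes, i.e. the "128 consecutive pages" he mentions.]

```python
# Hypothetical geometry of one fq qdisc hash table, per the thread:
# 2^16 slots, each assumed to hold one 8-byte pointer on x86-64.
slots = 1 << 16
slot_size = 8                      # sizeof(void *) on 64-bit, assumed
table_bytes = slots * slot_size    # size of one table
base_pages = table_bytes // 4096   # 4 KiB base pages per table

print(table_bytes, base_pages)     # 524288 128
```

This matches both numbers in the message: 524288 bytes per block, which is exactly the 128 consecutive 4 KiB pages whose contiguity reduced TLB pressure.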
* Re: __GFP_REPEAT usage in fq_alloc_node

From: Vlastimil Babka @ 2017-01-06 17:18 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Michal Hocko, linux-mm, LKML

On 01/06/2017 06:08 PM, Eric Dumazet wrote:
> On Fri, Jan 6, 2017 at 8:55 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 01/06/2017 05:48 PM, Eric Dumazet wrote:
>>> I believe tests were done with 48 fq qdiscs, each having 2^16 slots,
>>> so I had 48 blocks of 524288 bytes.
>>>
>>> Trying a bit harder at setup time to get 128 consecutive pages got
>>> less TLB pressure.
>>
>> Hmm, that's rather surprising to me. The TLB caches the page table
>> lookups, and the PFNs of the physical pages they translate to shouldn't
>> matter - the page tables will look the same. The reduced collision
>> cache miss effect could manifest with 128 consecutive pages, though.
>
> To be clear, the difference came from:
>
> Using kmalloc() to allocate 48 x 524288 bytes,
>
> or using vmalloc().
>
> Are you telling me HugePages are not in play there?

Oh, that's certainly a difference, as kmalloc() will give you the kernel
mapping, which can use 1 GB hugepages. But if you just combine these
kmalloc'ed chunks into a vmalloc mapping (IIUC that's what your RFC was
doing?), you lose that benefit AFAIK.

On the other hand, I recall reading that AMD Zen will have PTE
Coalescing [1] which, if true and I understand it correctly, would
indeed result in better TLB usage with adjacent page table entries
pointing to consecutive pages. But perhaps the starting pte's position
will also have to be aligned to make this work, dunno.

[1] http://www.anandtech.com/show/10591/amd-zen-microarchiture-part-2-extracting-instructionlevel-parallelism/6

^ permalink raw reply [flat|nested] 19+ messages in thread
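[Editor's note: a rough sketch of the TLB-entry arithmetic behind this exchange. The page sizes are the ones named in the thread (4 KiB PTEs for a vmalloc mapping; 2 MiB and 1 GiB pages are options for the kernel direct map that kmalloc returns); the assumption that the working set is exactly the 48 hash tables, contiguously placed, is illustrative only.]

```python
# TLB entries needed to cover 48 fq tables of 524288 bytes each,
# under different mapping granularities (all sizes assumed, see above).
total = 48 * 524288                     # 24 MiB of hash tables
entries_4k = total // (4 << 10)         # vmalloc: one PTE per 4 KiB page
entries_2m = total // (2 << 20)         # direct map with 2 MiB pages
entries_1g = -(-total // (1 << 30))     # direct map with 1 GiB pages (ceil)

print(entries_4k, entries_2m, entries_1g)   # 6144 12 1
```

Under these assumptions, the vmalloc case needs thousands of 4 KiB translations for a DDoS workload that hits random buckets, while the direct map covers everything with a handful of large-page entries, which is consistent with the TLB pressure Eric reports.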
Thread overview: 19+ messages

2017-01-06 15:20 __GFP_REPEAT usage in fq_alloc_node Michal Hocko
2017-01-06 15:39 ` Eric Dumazet
2017-01-06 16:07 ` Michal Hocko
2017-01-06 16:19 ` Michal Hocko
2017-01-07  3:33 ` [PATCH] net: use kvmalloc rather than open coded variant kbuild test robot
2017-01-07  9:19 ` Michal Hocko
2017-01-07  3:35 ` kbuild test robot
2017-01-09 10:22 ` __GFP_REPEAT usage in fq_alloc_node Michal Hocko
2017-01-09 16:00 ` Eric Dumazet
2017-01-09 17:45 ` Michal Hocko
2017-01-09 17:53 ` Eric Dumazet
2017-01-14 23:43 ` [PATCH] net_sched: use kvmalloc rather than opencoded variant kbuild test robot
2017-01-16  8:54 ` Michal Hocko
2017-01-06 16:31 ` __GFP_REPEAT usage in fq_alloc_node Vlastimil Babka
2017-01-06 16:48 ` Eric Dumazet
2017-01-06 16:50 ` Eric Dumazet
2017-01-06 16:55 ` Vlastimil Babka
2017-01-06 17:08 ` Eric Dumazet
2017-01-06 17:18 ` Vlastimil Babka