* 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
From: Denys Fedoryshchenko @ 2011-05-19 6:35 UTC
To: netdev

Hi again,

I just tried to upgrade a large NAS from 2.6.38.6 to 2.6.39-rc7-git11,
enabling IPv6 on it at the same time.
I got the trace below after ppp2897 was brought up (which means 2896 other
ppp interfaces were already up, plus around 32 Ethernet VLANs).
I am not sure it is a bug, but it looks like I had free memory (the box had
8GB free), and free lowmem too. I will also try a 64-bit kernel there this
evening.

May 17 16:00:42 194.146.155.70 kernel: [14925.897799] PERCPU: allocation failed, size=2048 align=4, failed to allocate new chunk
May 17 16:00:42 194.146.155.70 kernel: [14925.898163] Pid: 24207, comm: pppd Not tainted 2.6.39-rc7-git11-build-0058 #4
May 17 16:00:42 194.146.155.70 kernel: [14925.898164] Call Trace:
May 17 16:00:42 194.146.155.70 kernel: [14925.898169] [<c0335548>] ? printk+0x18/0x20
May 17 16:00:42 194.146.155.70 kernel: [14925.898173] [<c017ecd0>] pcpu_alloc+0x616/0x67a
May 17 16:00:42 194.146.155.70 kernel: [14925.898176] [<c0194a80>] ? __kmalloc_track_caller+0x68/0xc0
May 17 16:00:42 194.146.155.70 kernel: [14925.898189] [<f8ae196c>] ? kzalloc+0xb/0xd [ipv6]
May 17 16:00:42 194.146.155.70 kernel: [14925.898193] [<c01320a5>] ? _local_bh_enable_ip.clone.6+0x18/0x71
May 17 16:00:42 194.146.155.70 kernel: [14925.898195] [<c017ed3e>] __alloc_percpu+0xa/0xc
May 17 16:00:42 194.146.155.70 kernel: [14925.898198] [<c030aa7d>] snmp_mib_init+0x2f/0x51
May 17 16:00:42 194.146.155.70 kernel: [14925.898207] [<f8ae2ad0>] ipv6_add_dev+0x133/0x2a3 [ipv6]
May 17 16:00:42 194.146.155.70 kernel: [14925.898209] [<c030e12d>] ? ip_mc_init_dev+0x75/0x86
May 17 16:00:42 194.146.155.70 kernel: [14925.898211] [<c0309321>] ? devinet_sysctl_register+0x34/0x38
May 17 16:00:42 194.146.155.70 kernel: [14925.898221] [<f8ae5754>] addrconf_notify+0x50/0x6a5 [ipv6]
May 17 16:00:42 194.146.155.70 kernel: [14925.898224] [<c0218f52>] ? add_uevent_var+0xa3/0xa3
May 17 16:00:42 194.146.155.70 kernel: [14925.898226] [<c0309901>] ? inetdev_event+0x55/0x3c0
May 17 16:00:42 194.146.155.70 kernel: [14925.898230] [<c01446f9>] notifier_call_chain+0x26/0x48
May 17 16:00:42 194.146.155.70 kernel: [14925.898232] [<c01447a7>] raw_notifier_call_chain+0x1a/0x1c
May 17 16:00:42 194.146.155.70 kernel: [14925.898236] [<c02c8115>] call_netdevice_notifiers+0x44/0x4b
May 17 16:00:42 194.146.155.70 kernel: [14925.898238] [<c01320a5>] ? _local_bh_enable_ip.clone.6+0x18/0x71
May 17 16:00:42 194.146.155.70 kernel: [14925.898240] [<c0132106>] ? local_bh_enable_ip+0x8/0xa
May 17 16:00:42 194.146.155.70 kernel: [14925.898242] [<c02ca19b>] register_netdevice+0x1fb/0x255
May 17 16:00:42 194.146.155.70 kernel: [14925.898244] [<c02ca227>] register_netdev+0x32/0x41
May 17 16:00:42 194.146.155.70 kernel: [14925.898247] [<c021d5cf>] ? sprintf+0x1c/0x1e
May 17 16:00:42 194.146.155.70 kernel: [14925.898249] [<c029647a>] ppp_ioctl+0x224/0xaea
May 17 16:00:42 194.146.155.70 kernel: [14925.898252] [<c01a35cc>] ? do_filp_open+0x26/0x67
May 17 16:00:42 194.146.155.70 kernel: [14925.898254] [<c0296256>] ? ppp_write+0x98/0x98
May 17 16:00:42 194.146.155.70 kernel: [14925.898256] [<c01a53ce>] do_vfs_ioctl+0x45e/0x498
May 17 16:00:42 194.146.155.70 kernel: [14925.898258] [<c01a118e>] ? getname_flags+0x1e/0xad
May 17 16:00:42 194.146.155.70 kernel: [14925.898260] [<c019391b>] ? kmem_cache_free+0x14/0x83
May 17 16:00:42 194.146.155.70 kernel: [14925.898262] [<c01ab5bb>] ? alloc_fd+0x4e/0xba
May 17 16:00:42 194.146.155.70 kernel: [14925.898265] [<c0199465>] ? do_sys_open+0xdb/0xe5
May 17 16:00:42 194.146.155.70 kernel: [14925.898266] [<c019ac7b>] ? fput+0x13/0x155
May 17 16:00:42 194.146.155.70 kernel: [14925.898268] [<c01a4387>] ? do_fcntl+0x227/0x3aa
May 17 16:00:42 194.146.155.70 kernel: [14925.898270] [<c01a543b>] sys_ioctl+0x33/0x4c
May 17 16:00:42 194.146.155.70 kernel: [14925.898273] [<c0336edd>] syscall_call+0x7/0xb

* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
From: Eric Dumazet @ 2011-05-19 6:39 UTC
To: Denys Fedoryshchenko; +Cc: netdev

On Thursday, 19 May 2011 at 09:35 +0300, Denys Fedoryshchenko wrote:
> Hi again,
>
> I just tried to upgrade a large NAS from 2.6.38.6 to 2.6.39-rc7-git11,
> enabling IPv6 on it at the same time.
> I got the trace below after ppp2897 was brought up (which means 2896 other
> ppp interfaces were already up, plus around 32 Ethernet VLANs).
> I am not sure it is a bug, but it looks like I had free memory (the box had
> 8GB free), and free lowmem too. I will also try a 64-bit kernel there this
> evening.
>
> May 17 16:00:42 194.146.155.70 kernel: [14925.897799] PERCPU: allocation failed, size=2048 align=4, failed to allocate new chunk
> May 17 16:00:42 194.146.155.70 kernel: [14925.898163] Pid: 24207, comm: pppd Not tainted 2.6.39-rc7-git11-build-0058 #4
> May 17 16:00:42 194.146.155.70 kernel: [14925.898164] Call Trace:
> [...]
> May 17 16:00:42 194.146.155.70 kernel: [14925.898273] [<c0336edd>] syscall_call+0x7/0xb
> --

It's a known problem: when IPv6 is enabled, we allocate percpu memory to
hold per-device SNMP counters.

Make sure the kernel's idea of the maximum number of possible CPUs matches
the real number of CPUs.

And yes, switching to a 64-bit kernel helps a lot.

* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
From: Denys Fedoryshchenko @ 2011-05-19 6:47 UTC
To: Eric Dumazet; +Cc: netdev

On Thu, 19 May 2011 08:39:18 +0200, Eric Dumazet wrote:
> On Thursday, 19 May 2011 at 09:35 +0300, Denys Fedoryshchenko wrote:
>> Hi again,
>>
>> I just tried to upgrade a large NAS from 2.6.38.6 to 2.6.39-rc7-git11,
>> enabling IPv6 on it at the same time.
>> I got the trace below after ppp2897 was brought up (which means 2896 other
>> ppp interfaces were already up, plus around 32 Ethernet VLANs).
>> I am not sure it is a bug, but it looks like I had free memory (the box had
>> 8GB free), and free lowmem too. I will also try a 64-bit kernel there this
>> evening.
>>
>> May 17 16:00:42 194.146.155.70 kernel: [14925.897799] PERCPU: allocation failed, size=2048 align=4, failed to allocate new chunk
>> May 17 16:00:42 194.146.155.70 kernel: [14925.898163] Pid: 24207, comm: pppd Not tainted 2.6.39-rc7-git11-build-0058 #4
>
> It's a known problem: when IPv6 is enabled, we allocate percpu memory to
> hold per-device SNMP counters.
>
> Make sure the kernel's idea of the maximum number of possible CPUs matches
> the real number of CPUs.
>
> And yes, switching to a 64-bit kernel helps a lot.
>

Yes, it matches, I guess.

CONFIG_NR_CPUS=8

processor  : 7
vendor_id  : GenuineIntel
cpu family : 6
model      : 26
model name : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz

Thanks. Then I will simply switch the kernel to 64-bit, but for now with a
32-bit userspace, since this semi-embedded system is mass deployed and I
have to maintain it alone (I cannot handle both 32-bit and 64-bit
userspace), and some PCs don't have the lm flag in cpuinfo :)

I have been hitting lowmem limits a lot lately, but the only application
that did not work right with a 32-bit userspace on a 64-bit kernel was
ipvsadm. Should I report it as a bug (I will check whether it is still an
issue)?

* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
From: Eric Dumazet @ 2011-05-19 6:55 UTC
To: Denys Fedoryshchenko; +Cc: netdev

On Thursday, 19 May 2011 at 08:39 +0200, Eric Dumazet wrote:

> It's a known problem: when IPv6 is enabled, we allocate percpu memory to
> hold per-device SNMP counters.
>
> Make sure the kernel's idea of the maximum number of possible CPUs matches
> the real number of CPUs.
>
> And yes, switching to a 64-bit kernel helps a lot.
>

Looking at snmp6_alloc_dev(), we allocate three mibs per device:

ipstats_mib   (30 * sizeof(u64) * number_of_possible_cpus)
icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
icmpv6msg_mib (26 * sizeof(long))

The ICMP ones certainly don't need percpu counters. Plain shared
atomic_long_t counters would be enough, since ICMP messages are rare.

* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
From: Denys Fedoryshchenko @ 2011-05-19 7:28 UTC
To: Eric Dumazet; +Cc: netdev

On Thu, 19 May 2011 08:55:13 +0200, Eric Dumazet wrote:
> On Thursday, 19 May 2011 at 08:39 +0200, Eric Dumazet wrote:
>
>> It's a known problem: when IPv6 is enabled, we allocate percpu memory to
>> hold per-device SNMP counters.
>>
>> Make sure the kernel's idea of the maximum number of possible CPUs matches
>> the real number of CPUs.
>>
>> And yes, switching to a 64-bit kernel helps a lot.
>>
>
> Looking at snmp6_alloc_dev(), we allocate three mibs per device:
>
> ipstats_mib   (30 * sizeof(u64) * number_of_possible_cpus)
> icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
> icmpv6msg_mib (26 * sizeof(long))

1920 + 256 + 208 = 2384 bytes, and 2384 * 3000 ppp interfaces = 7152000
bytes. I think that is not much in any case, if I am not wrong.

But in any case I will try 64-bit.

>
> The ICMP ones certainly don't need percpu counters. Plain shared
> atomic_long_t counters would be enough, since ICMP messages are rare.

* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
From: Eric Dumazet @ 2011-05-19 7:44 UTC
To: Denys Fedoryshchenko; +Cc: netdev

On Thursday, 19 May 2011 at 10:28 +0300, Denys Fedoryshchenko wrote:
> On Thu, 19 May 2011 08:55:13 +0200, Eric Dumazet wrote:
> > Looking at snmp6_alloc_dev(), we allocate three mibs per device:
> >
> > ipstats_mib   (30 * sizeof(u64) * number_of_possible_cpus)
> > icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
> > icmpv6msg_mib (26 * sizeof(long))
>
> 1920 + 256 + 208 = 2384 bytes, and 2384 * 3000 ppp interfaces = 7152000
> bytes. I think that is not much in any case, if I am not wrong.
>
> But in any case I will try 64-bit.

If you really want to stay on 32-bit, you might try enlarging the vmalloc
area (128 Mbytes by default) to get room for pcpu data:

grep pcpu /proc/vmallocinfo

boot param: vmalloc=256M

* [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes
From: Eric Dumazet @ 2011-05-19 11:14 UTC
To: Denys Fedoryshchenko, David Miller; +Cc: netdev

On Thursday, 19 May 2011 at 08:55 +0200, Eric Dumazet wrote:

> Looking at snmp6_alloc_dev(), we allocate three mibs per device:
>
> ipstats_mib   (30 * sizeof(u64) * number_of_possible_cpus)
> icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
> icmpv6msg_mib (26 * sizeof(long))
>

Oops, I forgot that mibs were doubled (one set for USER, one set for BH).

And:
#define __ICMP6MSG_MIB_MAX 512

So icmpv6msg_mib is really 512*sizeof(long)*number_of_possible_cpus*2

That is 32 kbytes per device on an 8-cpu machine with a 32-bit kernel.

Plus all the other mibs... yes, that's way too big for seldom-used stuff.

Here is the patch I cooked and tested on my machine:

[PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes.

ipv6 has per device ICMP SNMP counters, taking too much space because
they use percpu storage.

needed size per device is :
(512+4)*sizeof(long)*number_of_possible_cpus*2

On a 32bit kernel, 16 possible cpus, this wastes more than 64kbytes of
memory per ipv6 enabled network device, taken in vmalloc pool.

Since ICMP messages are rare, just use shared counters (atomic_long_t)

Per network space ICMP counters are still using percpu memory, we might
also convert them to shared counters in a future patch.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denys Fedoryshchenko <denys@visp.net.lb>
---
 include/net/if_inet6.h |    4 +--
 include/net/ipv6.h     |   19 +++++++++++++-----
 include/net/snmp.h     |   14 +++++++++++++
 net/ipv6/addrconf.c    |   24 +++++++++++------------
 net/ipv6/proc.c        |   40 +++++++++++++++++++++++++--------------
 5 files changed, 68 insertions(+), 33 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 0c603fe..11cf373 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -154,8 +154,8 @@ struct ifacaddr6 {
 struct ipv6_devstat {
 	struct proc_dir_entry	*proc_dir_entry;
 	DEFINE_SNMP_STAT(struct ipstats_mib, ipv6);
-	DEFINE_SNMP_STAT(struct icmpv6_mib, icmpv6);
-	DEFINE_SNMP_STAT(struct icmpv6msg_mib, icmpv6msg);
+	DEFINE_SNMP_STAT_ATOMIC(struct icmpv6_mib_device, icmpv6dev);
+	DEFINE_SNMP_STAT_ATOMIC(struct icmpv6msg_mib_device, icmpv6msgdev);
 };
 
 struct inet6_dev {
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index e1c60b4..c033ed0 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -123,6 +123,15 @@ extern struct ctl_path net_ipv6_ctl_path[];
 	SNMP_INC_STATS##modifier((net)->mib.statname##_statistics, (field));\
 })
 
+/* per device counters are atomic_long_t */
+#define _DEVINCATOMIC(net, statname, modifier, idev, field)		\
+({									\
+	struct inet6_dev *_idev = (idev);				\
+	if (likely(_idev != NULL))					\
+		SNMP_INC_STATS_ATOMIC_LONG((_idev)->stats.statname##dev, (field)); \
+	SNMP_INC_STATS##modifier((net)->mib.statname##_statistics, (field));\
+})
+
 #define _DEVADD(net, statname, modifier, idev, field, val)		\
 ({									\
 	struct inet6_dev *_idev = (idev);				\
@@ -154,16 +163,16 @@ extern struct ctl_path net_ipv6_ctl_path[];
 #define IP6_UPD_PO_STATS_BH(net, idev,field,val)   \
 		_DEVUPD(net, ipv6, 64_BH, idev, field, val)
 #define ICMP6_INC_STATS(net, idev, field)	\
-		_DEVINC(net, icmpv6, , idev, field)
+		_DEVINCATOMIC(net, icmpv6, , idev, field)
 #define ICMP6_INC_STATS_BH(net, idev, field)	\
-		_DEVINC(net, icmpv6, _BH, idev, field)
+		_DEVINCATOMIC(net, icmpv6, _BH, idev, field)
 
 #define ICMP6MSGOUT_INC_STATS(net, idev, field)		\
-	_DEVINC(net, icmpv6msg, , idev, field +256)
+	_DEVINCATOMIC(net, icmpv6msg, , idev, field +256)
 #define ICMP6MSGOUT_INC_STATS_BH(net, idev, field)	\
-	_DEVINC(net, icmpv6msg, _BH, idev, field +256)
+	_DEVINCATOMIC(net, icmpv6msg, _BH, idev, field +256)
 #define ICMP6MSGIN_INC_STATS_BH(net, idev, field)	\
-	_DEVINC(net, icmpv6msg, _BH, idev, field)
+	_DEVINCATOMIC(net, icmpv6msg, _BH, idev, field)
 
 struct ip6_ra_chain {
 	struct ip6_ra_chain	*next;
diff --git a/include/net/snmp.h b/include/net/snmp.h
index 27461d6..479083a 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -72,14 +72,24 @@ struct icmpmsg_mib {
 
 /* ICMP6 (IPv6-ICMP) */
 #define ICMP6_MIB_MAX	__ICMP6_MIB_MAX
+/* per network ns counters */
 struct icmpv6_mib {
 	unsigned long	mibs[ICMP6_MIB_MAX];
 };
+/* per device counters, (shared on all cpus) */
+struct icmpv6_mib_device {
+	atomic_long_t	mibs[ICMP6_MIB_MAX];
+};
 
 #define ICMP6MSG_MIB_MAX  __ICMP6MSG_MIB_MAX
+/* per network ns counters */
 struct icmpv6msg_mib {
 	unsigned long	mibs[ICMP6MSG_MIB_MAX];
 };
+/* per device counters, (shared on all cpus) */
+struct icmpv6msg_mib_device {
+	atomic_long_t	mibs[ICMP6MSG_MIB_MAX];
+};
 
 
 /* TCP */
@@ -114,6 +124,8 @@ struct linux_xfrm_mib {
  */
 #define DEFINE_SNMP_STAT(type, name)	\
 	__typeof__(type) __percpu *name[2]
+#define DEFINE_SNMP_STAT_ATOMIC(type, name)	\
+	__typeof__(type) *name
 #define DECLARE_SNMP_STAT(type, name)	\
 	extern __typeof__(type) __percpu *name[2]
 
@@ -124,6 +136,8 @@ struct linux_xfrm_mib {
 			__this_cpu_inc(mib[0]->mibs[field])
 #define SNMP_INC_STATS_USER(mib, field)	\
 			this_cpu_inc(mib[1]->mibs[field])
+#define SNMP_INC_STATS_ATOMIC_LONG(mib, field)	\
+			atomic_long_inc(&mib->mibs[field])
 #define SNMP_INC_STATS(mib, field)	\
 			this_cpu_inc(mib[!in_softirq()]->mibs[field])
 #define SNMP_DEC_STATS(mib, field)	\
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f2f9b2e..3cfbbf3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -289,19 +289,19 @@ static int snmp6_alloc_dev(struct inet6_dev *idev)
 			  sizeof(struct ipstats_mib),
 			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip;
-	if (snmp_mib_init((void __percpu **)idev->stats.icmpv6,
-			  sizeof(struct icmpv6_mib),
-			  __alignof__(struct icmpv6_mib)) < 0)
+	idev->stats.icmpv6dev = kzalloc(sizeof(struct icmpv6_mib_device),
+					GFP_KERNEL);
+	if (!idev->stats.icmpv6dev)
 		goto err_icmp;
-	if (snmp_mib_init((void __percpu **)idev->stats.icmpv6msg,
-			  sizeof(struct icmpv6msg_mib),
-			  __alignof__(struct icmpv6msg_mib)) < 0)
+	idev->stats.icmpv6msgdev = kzalloc(sizeof(struct icmpv6msg_mib_device),
+					   GFP_KERNEL);
+	if (!idev->stats.icmpv6msgdev)
 		goto err_icmpmsg;
 
 	return 0;
 
 err_icmpmsg:
-	snmp_mib_free((void __percpu **)idev->stats.icmpv6);
+	kfree(idev->stats.icmpv6dev);
 err_icmp:
 	snmp_mib_free((void __percpu **)idev->stats.ipv6);
 err_ip:
@@ -310,8 +310,8 @@ err_ip:
 
 static void snmp6_free_dev(struct inet6_dev *idev)
 {
-	snmp_mib_free((void __percpu **)idev->stats.icmpv6msg);
-	snmp_mib_free((void __percpu **)idev->stats.icmpv6);
+	kfree(idev->stats.icmpv6msgdev);
+	kfree(idev->stats.icmpv6dev);
 	snmp_mib_free((void __percpu **)idev->stats.ipv6);
 }
 
@@ -3838,7 +3838,7 @@ static inline size_t inet6_if_nlmsg_size(void)
 	       + nla_total_size(inet6_ifla6_size()); /* IFLA_PROTINFO */
 }
 
-static inline void __snmp6_fill_stats(u64 *stats, void __percpu **mib,
+static inline void __snmp6_fill_statsdev(u64 *stats, atomic_long_t *mib,
 				      int items, int bytes)
 {
 	int i;
@@ -3848,7 +3848,7 @@ static inline void __snmp6_fill_stats(u64 *stats, void __percpu **mib,
 	/* Use put_unaligned() because stats may not be aligned for u64. */
 	put_unaligned(items, &stats[0]);
 	for (i = 1; i < items; i++)
-		put_unaligned(snmp_fold_field(mib, i), &stats[i]);
+		put_unaligned(atomic_long_read(&mib[i]), &stats[i]);
 
 	memset(&stats[items], 0, pad);
 }
@@ -3877,7 +3877,7 @@ static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
 				     IPSTATS_MIB_MAX, bytes, offsetof(struct ipstats_mib, syncp));
 		break;
 	case IFLA_INET6_ICMP6STATS:
-		__snmp6_fill_stats(stats, (void __percpu **)idev->stats.icmpv6, ICMP6_MIB_MAX, bytes);
+		__snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, ICMP6_MIB_MAX, bytes);
 		break;
 	}
 }
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 24b3558..18ff5df 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -141,7 +141,11 @@ static const struct snmp_mib snmp6_udplite6_list[] = {
 	SNMP_MIB_SENTINEL
 };
 
-static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **mib)
+/* can be called either with percpu mib (pcpumib != NULL),
+ * or shared one (smib != NULL)
+ */
+static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **pcpumib,
+				     atomic_long_t *smib)
 {
 	char name[32];
 	int i;
@@ -158,14 +162,14 @@ static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **mib)
 		snprintf(name, sizeof(name), "Icmp6%s%s",
 			i & 0x100 ? "Out" : "In", p);
 		seq_printf(seq, "%-32s\t%lu\n", name,
-			snmp_fold_field(mib, i));
+			pcpumib ? snmp_fold_field(pcpumib, i) : atomic_long_read(smib + i));
 	}
 
 	/* print by number (nonzero only) - ICMPMsgStat format */
 	for (i = 0; i < ICMP6MSG_MIB_MAX; i++) {
 		unsigned long val;
 
-		val = snmp_fold_field(mib, i);
+		val = pcpumib ? snmp_fold_field(pcpumib, i) : atomic_long_read(smib + i);
 		if (!val)
 			continue;
 		snprintf(name, sizeof(name), "Icmp6%sType%u",
@@ -174,14 +178,22 @@ static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **mib)
 	}
 }
 
-static void snmp6_seq_show_item(struct seq_file *seq, void __percpu **mib,
+/* can be called either with percpu mib (pcpumib != NULL),
+ * or shared one (smib != NULL)
+ */
+static void snmp6_seq_show_item(struct seq_file *seq, void __percpu **pcpumib,
+				atomic_long_t *smib,
 				const struct snmp_mib *itemlist)
 {
 	int i;
+	unsigned long val;
 
-	for (i = 0; itemlist[i].name; i++)
-		seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name,
-			   snmp_fold_field(mib, itemlist[i].entry));
+	for (i = 0; itemlist[i].name; i++) {
+		val = pcpumib ?
+			snmp_fold_field(pcpumib, itemlist[i].entry) :
+			atomic_long_read(smib + itemlist[i].entry);
+		seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name, val);
+	}
 }
 
 static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu **mib,
@@ -201,13 +213,13 @@ static int snmp6_seq_show(struct seq_file *seq, void *v)
 	snmp6_seq_show_item64(seq, (void __percpu **)net->mib.ipv6_statistics,
 			    snmp6_ipstats_list, offsetof(struct ipstats_mib, syncp));
 	snmp6_seq_show_item(seq, (void __percpu **)net->mib.icmpv6_statistics,
-			    snmp6_icmp6_list);
+			    NULL, snmp6_icmp6_list);
 	snmp6_seq_show_icmpv6msg(seq,
-			    (void __percpu **)net->mib.icmpv6msg_statistics);
+			    (void __percpu **)net->mib.icmpv6msg_statistics, NULL);
 	snmp6_seq_show_item(seq, (void __percpu **)net->mib.udp_stats_in6,
-			    snmp6_udp6_list);
+			    NULL, snmp6_udp6_list);
 	snmp6_seq_show_item(seq, (void __percpu **)net->mib.udplite_stats_in6,
-			    snmp6_udplite6_list);
+			    NULL, snmp6_udplite6_list);
 	return 0;
 }
 
@@ -229,11 +241,11 @@ static int snmp6_dev_seq_show(struct seq_file *seq, void *v)
 	struct inet6_dev *idev = (struct inet6_dev *)seq->private;
 
 	seq_printf(seq, "%-32s\t%u\n", "ifIndex", idev->dev->ifindex);
-	snmp6_seq_show_item(seq, (void __percpu **)idev->stats.ipv6,
+	snmp6_seq_show_item(seq, (void __percpu **)idev->stats.ipv6, NULL,
 			    snmp6_ipstats_list);
-	snmp6_seq_show_item(seq, (void __percpu **)idev->stats.icmpv6,
+	snmp6_seq_show_item(seq, NULL, idev->stats.icmpv6dev->mibs,
 			    snmp6_icmp6_list);
-	snmp6_seq_show_icmpv6msg(seq, (void __percpu **)idev->stats.icmpv6msg);
+	snmp6_seq_show_icmpv6msg(seq, NULL, idev->stats.icmpv6msgdev->mibs);
 	return 0;
 }
 

* Re: [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes
From: Denys Fedoryshchenko @ 2011-05-19 11:26 UTC
To: Eric Dumazet; +Cc: David Miller, netdev

On Thu, 19 May 2011 13:14:23 +0200, Eric Dumazet wrote:
> On Thursday, 19 May 2011 at 08:55 +0200, Eric Dumazet wrote:
>
>> Looking at snmp6_alloc_dev(), we allocate three mibs per device:
>>
>> ipstats_mib   (30 * sizeof(u64) * number_of_possible_cpus)
>> icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
>> icmpv6msg_mib (26 * sizeof(long))
>>
>
> Oops, I forgot that mibs were doubled (one set for USER, one set for BH).
>
> And:
> #define __ICMP6MSG_MIB_MAX 512
>
> So icmpv6msg_mib is really 512*sizeof(long)*number_of_possible_cpus*2
>
> That is 32 kbytes per device on an 8-cpu machine with a 32-bit kernel.
>
> Plus all the other mibs... yes, that's way too big for seldom-used stuff.
>
> Here is the patch I cooked and tested on my machine:
>
> [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes.

I'll test it tonight, thanks a lot :-)
I guess it will also help people with a lot of interfaces
(virtualisation?), not only ppp.

* Re: [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes
From: David Miller @ 2011-05-19 20:19 UTC
To: eric.dumazet; +Cc: denys, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 19 May 2011 13:14:23 +0200

> [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes.
>
> ipv6 has per device ICMP SNMP counters, taking too much space because
> they use percpu storage.
>
> needed size per device is :
> (512+4)*sizeof(long)*number_of_possible_cpus*2
>
> On a 32bit kernel, 16 possible cpus, this wastes more than 64kbytes of
> memory per ipv6 enabled network device, taken in vmalloc pool.
>
> Since ICMP messages are rare, just use shared counters (atomic_long_t)
>
> Per network space ICMP counters are still using percpu memory, we might
> also convert them to shared counters in a future patch.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
From: David Miller @ 2011-05-19 7:51 UTC
To: denys; +Cc: netdev

From: Denys Fedoryshchenko <denys@visp.net.lb>
Date: Thu, 19 May 2011 09:35:29 +0300

> I am not sure it is a bug, but it looks like I had free memory (the box had
> 8GB free), and free lowmem too. I will also try a 64-bit kernel there this
> evening.

It's not a question of free memory: you ran out of per-cpu chunks, which
are allocated in fixed virtual region(s).