* 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU:  allocation failed
@ 2011-05-19  6:35 Denys Fedoryshchenko
  2011-05-19  6:39 ` Eric Dumazet
  2011-05-19  7:51 ` 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed David Miller
  0 siblings, 2 replies; 10+ messages in thread
From: Denys Fedoryshchenko @ 2011-05-19  6:35 UTC (permalink / raw)
  To: netdev

 Hi again,

 I just tried to upgrade a large NAS from 2.6.38.6 to 2.6.39-rc7-git11, and
 enabled IPv6 on it at the same time.
 I got the following after ppp2897 was brought up (which means the other
 2896 were already up, plus a few ethernet VLANs, around 32).
 I am not sure it is a bug, but it looks like I had free memory (the box had
 8GB free), and free lowmem too. I will also try a 64-bit kernel there this
 evening.

 May 17 16:00:42 194.146.155.70 kernel: [14925.897799] PERCPU: allocation failed, size=2048 align=4, failed to allocate new chunk
 May 17 16:00:42 194.146.155.70 kernel: [14925.898163] Pid: 24207, comm: pppd Not tainted 2.6.39-rc7-git11-build-0058 #4
 May 17 16:00:42 194.146.155.70 kernel: [14925.898164] Call Trace:
 May 17 16:00:42 194.146.155.70 kernel: [14925.898169]  [<c0335548>] ? printk+0x18/0x20
 May 17 16:00:42 194.146.155.70 kernel: [14925.898173]  [<c017ecd0>] pcpu_alloc+0x616/0x67a
 May 17 16:00:42 194.146.155.70 kernel: [14925.898176]  [<c0194a80>] ? __kmalloc_track_caller+0x68/0xc0
 May 17 16:00:42 194.146.155.70 kernel: [14925.898189]  [<f8ae196c>] ? kzalloc+0xb/0xd [ipv6]
 May 17 16:00:42 194.146.155.70 kernel: [14925.898193]  [<c01320a5>] ? _local_bh_enable_ip.clone.6+0x18/0x71
 May 17 16:00:42 194.146.155.70 kernel: [14925.898195]  [<c017ed3e>] __alloc_percpu+0xa/0xc
 May 17 16:00:42 194.146.155.70 kernel: [14925.898198]  [<c030aa7d>] snmp_mib_init+0x2f/0x51
 May 17 16:00:42 194.146.155.70 kernel: [14925.898207]  [<f8ae2ad0>] ipv6_add_dev+0x133/0x2a3 [ipv6]
 May 17 16:00:42 194.146.155.70 kernel: [14925.898209]  [<c030e12d>] ? ip_mc_init_dev+0x75/0x86
 May 17 16:00:42 194.146.155.70 kernel: [14925.898211]  [<c0309321>] ? devinet_sysctl_register+0x34/0x38
 May 17 16:00:42 194.146.155.70 kernel: [14925.898221]  [<f8ae5754>] addrconf_notify+0x50/0x6a5 [ipv6]
 May 17 16:00:42 194.146.155.70 kernel: [14925.898224]  [<c0218f52>] ? add_uevent_var+0xa3/0xa3
 May 17 16:00:42 194.146.155.70 kernel: [14925.898226]  [<c0309901>] ? inetdev_event+0x55/0x3c0
 May 17 16:00:42 194.146.155.70 kernel: [14925.898230]  [<c01446f9>] notifier_call_chain+0x26/0x48
 May 17 16:00:42 194.146.155.70 kernel: [14925.898232]  [<c01447a7>] raw_notifier_call_chain+0x1a/0x1c
 May 17 16:00:42 194.146.155.70 kernel: [14925.898236]  [<c02c8115>] call_netdevice_notifiers+0x44/0x4b
 May 17 16:00:42 194.146.155.70 kernel: [14925.898238]  [<c01320a5>] ? _local_bh_enable_ip.clone.6+0x18/0x71
 May 17 16:00:42 194.146.155.70 kernel: [14925.898240]  [<c0132106>] ? local_bh_enable_ip+0x8/0xa
 May 17 16:00:42 194.146.155.70 kernel: [14925.898242]  [<c02ca19b>] register_netdevice+0x1fb/0x255
 May 17 16:00:42 194.146.155.70 kernel: [14925.898244]  [<c02ca227>] register_netdev+0x32/0x41
 May 17 16:00:42 194.146.155.70 kernel: [14925.898247]  [<c021d5cf>] ? sprintf+0x1c/0x1e
 May 17 16:00:42 194.146.155.70 kernel: [14925.898249]  [<c029647a>] ppp_ioctl+0x224/0xaea
 May 17 16:00:42 194.146.155.70 kernel: [14925.898252]  [<c01a35cc>] ? do_filp_open+0x26/0x67
 May 17 16:00:42 194.146.155.70 kernel: [14925.898254]  [<c0296256>] ? ppp_write+0x98/0x98
 May 17 16:00:42 194.146.155.70 kernel: [14925.898256]  [<c01a53ce>] do_vfs_ioctl+0x45e/0x498
 May 17 16:00:42 194.146.155.70 kernel: [14925.898258]  [<c01a118e>] ? getname_flags+0x1e/0xad
 May 17 16:00:42 194.146.155.70 kernel: [14925.898260]  [<c019391b>] ? kmem_cache_free+0x14/0x83
 May 17 16:00:42 194.146.155.70 kernel: [14925.898262]  [<c01ab5bb>] ? alloc_fd+0x4e/0xba
 May 17 16:00:42 194.146.155.70 kernel: [14925.898265]  [<c0199465>] ? do_sys_open+0xdb/0xe5
 May 17 16:00:42 194.146.155.70 kernel: [14925.898266]  [<c019ac7b>] ? fput+0x13/0x155
 May 17 16:00:42 194.146.155.70 kernel: [14925.898268]  [<c01a4387>] ? do_fcntl+0x227/0x3aa
 May 17 16:00:42 194.146.155.70 kernel: [14925.898270]  [<c01a543b>] sys_ioctl+0x33/0x4c
 May 17 16:00:42 194.146.155.70 kernel: [14925.898273]  [<c0336edd>] syscall_call+0x7/0xb


* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU:  allocation failed
  2011-05-19  6:35 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed Denys Fedoryshchenko
@ 2011-05-19  6:39 ` Eric Dumazet
  2011-05-19  6:47   ` Denys Fedoryshchenko
  2011-05-19  6:55   ` Eric Dumazet
  2011-05-19  7:51 ` 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed David Miller
  1 sibling, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2011-05-19  6:39 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev

On Thursday 19 May 2011 at 09:35 +0300, Denys Fedoryshchenko wrote:
> Hi again,
> 
>  I just tried to upgrade a large NAS from 2.6.38.6 to 2.6.39-rc7-git11, and
>  enabled IPv6 on it at the same time.
>  I got the following after ppp2897 was brought up (which means the other
>  2896 were already up, plus a few ethernet VLANs, around 32).
>  I am not sure it is a bug, but it looks like I had free memory (the box had
>  8GB free), and free lowmem too. I will also try a 64-bit kernel there this
>  evening.
> 
>  May 17 16:00:42 194.146.155.70 kernel: [14925.897799] PERCPU: 
>  allocation failed, size=2048 align=4, failed to allocate new chunk
>  May 17 16:00:42 194.146.155.70 kernel: [14925.898163] Pid: 24207, comm: 
>  pppd Not tainted 2.6.39-rc7-git11-build-0058 #4

It's a known problem: when IPv6 is enabled, we allocate percpu memory to
hold per-device SNMP counters.

Make sure the kernel's idea of the max possible CPUs matches the real
number of CPUs.

And yes, switching to a 64-bit kernel helps a lot.





* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface,  PERCPU:  allocation failed
  2011-05-19  6:39 ` Eric Dumazet
@ 2011-05-19  6:47   ` Denys Fedoryshchenko
  2011-05-19  6:55   ` Eric Dumazet
  1 sibling, 0 replies; 10+ messages in thread
From: Denys Fedoryshchenko @ 2011-05-19  6:47 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

 On Thu, 19 May 2011 08:39:18 +0200, Eric Dumazet wrote:
> On Thursday 19 May 2011 at 09:35 +0300, Denys Fedoryshchenko wrote:
>> Hi again,
>>
>>  I just tried to upgrade a large NAS from 2.6.38.6 to 2.6.39-rc7-git11,
>>  and enabled IPv6 on it at the same time.
>>  I got the following after ppp2897 was brought up (which means the other
>>  2896 were already up, plus a few ethernet VLANs, around 32).
>>  I am not sure it is a bug, but it looks like I had free memory (the box
>>  had 8GB free), and free lowmem too. I will also try a 64-bit kernel
>>  there this evening.
>>
>>  May 17 16:00:42 194.146.155.70 kernel: [14925.897799] PERCPU: allocation failed, size=2048 align=4, failed to allocate new chunk
>>  May 17 16:00:42 194.146.155.70 kernel: [14925.898163] Pid: 24207, comm: pppd Not tainted 2.6.39-rc7-git11-build-0058 #4
>
> It's a known problem: when IPv6 is enabled, we allocate percpu memory
> to hold per-device SNMP counters.
>
> Make sure the kernel's idea of the max possible CPUs matches the real
> number of CPUs.
>
> And yes, switching to a 64-bit kernel helps a lot.
>
 Yes, it matches, I guess.
 CONFIG_NR_CPUS=8

 processor       : 7
 vendor_id       : GenuineIntel
 cpu family      : 6
 model           : 26
 model name      : Intel(R) Core(TM) i7 CPU         950  @ 3.07GHz

 Thanks. Then I will simply switch the kernel to 64-bit, but keep a 32-bit
 userspace for now, since this semi-embedded system is mass deployed and I
 have to maintain it alone (I cannot handle both 32- and 64-bit userspace),
 and some PCs don't have the lm flag in cpuinfo :)

 I have been hitting lowmem limits a lot lately, but the only application
 that did not work right with a 32-bit userspace on a 64-bit kernel was
 ipvsadm.
 Should I report it as a bug (I will check whether it is still an issue)?



* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU:  allocation failed
  2011-05-19  6:39 ` Eric Dumazet
  2011-05-19  6:47   ` Denys Fedoryshchenko
@ 2011-05-19  6:55   ` Eric Dumazet
  2011-05-19  7:28     ` Denys Fedoryshchenko
  2011-05-19 11:14     ` [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes Eric Dumazet
  1 sibling, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2011-05-19  6:55 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev

On Thursday 19 May 2011 at 08:39 +0200, Eric Dumazet wrote:

> It's a known problem: when IPv6 is enabled, we allocate percpu memory to
> hold per-device SNMP counters.
> 
> Make sure the kernel's idea of the max possible CPUs matches the real
> number of CPUs.
> 
> And yes, switching to a 64-bit kernel helps a lot.
> 
> 

Looking at snmp6_alloc_dev(), we allocate three MIBs per device:

ipstats_mib  (30 * sizeof(u64) * number_of_possible_cpus)
icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
icmpv6msg_mib  (26 * sizeof(long))

The ICMP ones certainly don't need percpu counters. Plain shared
atomic_long_t counters would be enough, since ICMP messages are rare.
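
As a rough userspace illustration of the shared-counter idea (this is not
kernel code): the sketch below uses C11 atomic_long as a stand-in for the
kernel's atomic_long_t, and the demo_* names and DEMO_MIB_MAX are made up for
the example. One array of counters is shared by all CPUs, so its size does not
depend on the number of possible CPUs.

```c
#include <stdatomic.h>

/* Illustrative size only; mirrors the 4 icmpv6_mib entries listed above. */
#define DEMO_MIB_MAX 4

/* Hypothetical userspace stand-in for a per-device MIB with shared counters. */
struct demo_mib_device {
	atomic_long mibs[DEMO_MIB_MAX];
};

/* Set every counter to zero. */
static void demo_mib_init(struct demo_mib_device *mib)
{
	for (int i = 0; i < DEMO_MIB_MAX; i++)
		atomic_init(&mib->mibs[i], 0);
}

/* Atomically bump one counter; safe to call from any thread. */
static void demo_inc_stats(struct demo_mib_device *mib, int field)
{
	atomic_fetch_add(&mib->mibs[field], 1);
}

/* Read one counter; no folding over CPUs is needed. */
static long demo_read_stats(struct demo_mib_device *mib, int field)
{
	return atomic_load(&mib->mibs[field]);
}
```

The trade-off is the usual one: an atomic increment on a shared cache line
costs more than a percpu increment, which should only be acceptable for
counters that are rarely touched.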





* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface,  PERCPU:  allocation failed
  2011-05-19  6:55   ` Eric Dumazet
@ 2011-05-19  7:28     ` Denys Fedoryshchenko
  2011-05-19  7:44       ` Eric Dumazet
  2011-05-19 11:14     ` [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes Eric Dumazet
  1 sibling, 1 reply; 10+ messages in thread
From: Denys Fedoryshchenko @ 2011-05-19  7:28 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

 On Thu, 19 May 2011 08:55:13 +0200, Eric Dumazet wrote:
> On Thursday 19 May 2011 at 08:39 +0200, Eric Dumazet wrote:
>
>> It's a known problem: when IPv6 is enabled, we allocate percpu memory
>> to hold per-device SNMP counters.
>>
>> Make sure the kernel's idea of the max possible CPUs matches the real
>> number of CPUs.
>>
>> And yes, switching to a 64-bit kernel helps a lot.
>>
>>
>
> Looking at snmp6_alloc_dev(), we allocate three MIBs per device:
>
> ipstats_mib  (30 * sizeof(u64) * number_of_possible_cpus)
> icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
> icmpv6msg_mib  (26 * sizeof(long))
 1920 +
 256 +
 208 = 2384 * 3000 ppp's = 7152000 bytes, which I think is not that much in
 any case, if I am not wrong.

 But in any case I will try 64-bit.

>
> The ICMP ones certainly don't need percpu counters. Plain shared
> atomic_long_t counters would be enough, since ICMP messages are rare.




* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU:  allocation failed
  2011-05-19  7:28     ` Denys Fedoryshchenko
@ 2011-05-19  7:44       ` Eric Dumazet
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Dumazet @ 2011-05-19  7:44 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev

On Thursday 19 May 2011 at 10:28 +0300, Denys Fedoryshchenko wrote:
> On Thu, 19 May 2011 08:55:13 +0200, Eric Dumazet wrote:
> > On Thursday 19 May 2011 at 08:39 +0200, Eric Dumazet wrote:
> >
> >> It's a known problem: when IPv6 is enabled, we allocate percpu memory
> >> to hold per-device SNMP counters.
> >>
> >> Make sure the kernel's idea of the max possible CPUs matches the real
> >> number of CPUs.
> >>
> >> And yes, switching to a 64-bit kernel helps a lot.
> >>
> >>
> >
> > Looking at snmp6_alloc_dev(), we allocate three MIBs per device:
> >
> > ipstats_mib  (30 * sizeof(u64) * number_of_possible_cpus)
> > icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
> > icmpv6msg_mib  (26 * sizeof(long))
>  1920 +
>  256 +
>  208 = 2384 * 3000 ppp's = 7152000 bytes, which I think is not that much in
>  any case, if I am not wrong.
> 
>  But in any case I will try 64-bit.

If you really want to stay on 32-bit, you might try enlarging the vmalloc
area (128 Mbytes by default) to get room for pcpu data:

grep pcpu /proc/vmallocinfo 


boot param : vmalloc=256M
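
To total up what the grep above reports, something like the following could be
used. It is only a sketch: pcpu_area_bytes() is a hypothetical helper, and it
assumes each /proc/vmallocinfo line has the layout
"<start>-<end> <size> <caller> ..." with the size given in bytes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the size field of one /proc/vmallocinfo
 * line if the line mentions "pcpu", otherwise 0.
 * Assumed line layout: "<start>-<end> <size in bytes> <caller> ...".
 */
static unsigned long pcpu_area_bytes(const char *line)
{
	unsigned long size = 0;

	if (!strstr(line, "pcpu"))
		return 0;
	/* Skip the address range token, then read the size. */
	if (sscanf(line, "%*s %lu", &size) != 1)
		return 0;
	return size;
}
```

Feeding it every line read with fgets() from /proc/vmallocinfo (which usually
needs root) and summing the results gives the vmalloc space currently taken by
percpu chunks, to compare against the vmalloc= setting.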





* Re: 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed
  2011-05-19  6:35 2.6.39-rc7-git11, x86/32, failed on ppp2897'th interface, PERCPU: allocation failed Denys Fedoryshchenko
  2011-05-19  6:39 ` Eric Dumazet
@ 2011-05-19  7:51 ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2011-05-19  7:51 UTC (permalink / raw)
  To: denys; +Cc: netdev

From: Denys Fedoryshchenko <denys@visp.net.lb>
Date: Thu, 19 May 2011 09:35:29 +0300

> I am not sure it is a bug, but it looks like I had free memory (the box
> had 8GB free), and free lowmem too. I will also try a 64-bit kernel
> there this evening.

It's not about free memory; you ran out of per-cpu chunks, which are
allocated in fixed virtual region(s).


* [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes
  2011-05-19  6:55   ` Eric Dumazet
  2011-05-19  7:28     ` Denys Fedoryshchenko
@ 2011-05-19 11:14     ` Eric Dumazet
  2011-05-19 11:26       ` Denys Fedoryshchenko
  2011-05-19 20:19       ` David Miller
  1 sibling, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2011-05-19 11:14 UTC (permalink / raw)
  To: Denys Fedoryshchenko, David Miller; +Cc: netdev

On Thursday 19 May 2011 at 08:55 +0200, Eric Dumazet wrote:

> Looking at snmp6_alloc_dev(), we allocate three MIBs per device:
> 
> ipstats_mib  (30 * sizeof(u64) * number_of_possible_cpus)
> icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
> icmpv6msg_mib  (26 * sizeof(long))
> 

Oops, I forgot that MIBs are doubled (one set for USER, one set for BH).

And:
#define __ICMP6MSG_MIB_MAX 512

So icmpv6msg_mib is really 512*sizeof(long)*number_of_possible_cpus*2

That is 32 kbytes per device on an 8-CPU machine with a 32-bit kernel.

Plus all the other MIBs... yes, that's way too big for seldom-used stuff.

Here is the patch I cooked and tested on my machine:
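
The figures above can be checked with a few lines of C. This is just
arithmetic, not kernel code: percpu_mib_bytes() and icmp6_dev_percpu_bytes()
are hypothetical helper names, and sizeof(long) is passed in as a parameter so
the 32-bit case can be computed on any host.

```c
#include <stddef.h>

/*
 * Hypothetical helper: bytes of percpu storage for one MIB with
 * `entries` counters of `counter_size` bytes, replicated for every
 * possible CPU and doubled (one set for USER, one set for BH).
 */
static size_t percpu_mib_bytes(size_t entries, size_t counter_size,
			       size_t possible_cpus)
{
	return entries * counter_size * possible_cpus * 2;
}

/*
 * Per-device ICMPv6 footprint: icmpv6_mib (4 entries) plus
 * icmpv6msg_mib (512 entries), each counter being sizeof(long).
 */
static size_t icmp6_dev_percpu_bytes(size_t long_size, size_t possible_cpus)
{
	return percpu_mib_bytes(4, long_size, possible_cpus) +
	       percpu_mib_bytes(512, long_size, possible_cpus);
}
```

With long_size = 4 and 8 possible CPUs, percpu_mib_bytes(512, 4, 8) gives
32768 bytes for icmpv6msg_mib alone, i.e. the 32 kbytes mentioned above.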

[PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes.

ipv6 has per device ICMP SNMP counters, taking too much space because
they use percpu storage.

needed size per device is : 
(512+4)*sizeof(long)*number_of_possible_cpus*2 

On a 32bit kernel, 16 possible cpus, this wastes more than 64kbytes of
memory per ipv6 enabled network device, taken in vmalloc pool.

Since ICMP messages are rare, just use shared counters (atomic_long_t)

Per network space ICMP counters are still using percpu memory, we might
also convert them to shared counters in a future patch.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denys Fedoryshchenko <denys@visp.net.lb>
---
 include/net/if_inet6.h |    4 +--
 include/net/ipv6.h     |   19 +++++++++++++-----
 include/net/snmp.h     |   14 +++++++++++++
 net/ipv6/addrconf.c    |   24 +++++++++++------------
 net/ipv6/proc.c        |   40 +++++++++++++++++++++++++--------------
 5 files changed, 68 insertions(+), 33 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 0c603fe..11cf373 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -154,8 +154,8 @@ struct ifacaddr6 {
 struct ipv6_devstat {
 	struct proc_dir_entry	*proc_dir_entry;
 	DEFINE_SNMP_STAT(struct ipstats_mib, ipv6);
-	DEFINE_SNMP_STAT(struct icmpv6_mib, icmpv6);
-	DEFINE_SNMP_STAT(struct icmpv6msg_mib, icmpv6msg);
+	DEFINE_SNMP_STAT_ATOMIC(struct icmpv6_mib_device, icmpv6dev);
+	DEFINE_SNMP_STAT_ATOMIC(struct icmpv6msg_mib_device, icmpv6msgdev);
 };
 
 struct inet6_dev {
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index e1c60b4..c033ed0 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -123,6 +123,15 @@ extern struct ctl_path net_ipv6_ctl_path[];
 	SNMP_INC_STATS##modifier((net)->mib.statname##_statistics, (field));\
 })
 
+/* per device counters are atomic_long_t */
+#define _DEVINCATOMIC(net, statname, modifier, idev, field)		\
+({									\
+	struct inet6_dev *_idev = (idev);				\
+	if (likely(_idev != NULL))					\
+		SNMP_INC_STATS_ATOMIC_LONG((_idev)->stats.statname##dev, (field)); \
+	SNMP_INC_STATS##modifier((net)->mib.statname##_statistics, (field));\
+})
+
 #define _DEVADD(net, statname, modifier, idev, field, val)		\
 ({									\
 	struct inet6_dev *_idev = (idev);				\
@@ -154,16 +163,16 @@ extern struct ctl_path net_ipv6_ctl_path[];
 #define IP6_UPD_PO_STATS_BH(net, idev,field,val)   \
 		_DEVUPD(net, ipv6, 64_BH, idev, field, val)
 #define ICMP6_INC_STATS(net, idev, field)	\
-		_DEVINC(net, icmpv6, , idev, field)
+		_DEVINCATOMIC(net, icmpv6, , idev, field)
 #define ICMP6_INC_STATS_BH(net, idev, field)	\
-		_DEVINC(net, icmpv6, _BH, idev, field)
+		_DEVINCATOMIC(net, icmpv6, _BH, idev, field)
 
 #define ICMP6MSGOUT_INC_STATS(net, idev, field)		\
-	_DEVINC(net, icmpv6msg, , idev, field +256)
+	_DEVINCATOMIC(net, icmpv6msg, , idev, field +256)
 #define ICMP6MSGOUT_INC_STATS_BH(net, idev, field)	\
-	_DEVINC(net, icmpv6msg, _BH, idev, field +256)
+	_DEVINCATOMIC(net, icmpv6msg, _BH, idev, field +256)
 #define ICMP6MSGIN_INC_STATS_BH(net, idev, field)	\
-	_DEVINC(net, icmpv6msg, _BH, idev, field)
+	_DEVINCATOMIC(net, icmpv6msg, _BH, idev, field)
 
 struct ip6_ra_chain {
 	struct ip6_ra_chain	*next;
diff --git a/include/net/snmp.h b/include/net/snmp.h
index 27461d6..479083a 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -72,14 +72,24 @@ struct icmpmsg_mib {
 
 /* ICMP6 (IPv6-ICMP) */
 #define ICMP6_MIB_MAX	__ICMP6_MIB_MAX
+/* per network ns counters */
 struct icmpv6_mib {
 	unsigned long	mibs[ICMP6_MIB_MAX];
 };
+/* per device counters, (shared on all cpus) */
+struct icmpv6_mib_device {
+	atomic_long_t	mibs[ICMP6_MIB_MAX];
+};
 
 #define ICMP6MSG_MIB_MAX  __ICMP6MSG_MIB_MAX
+/* per network ns counters */
 struct icmpv6msg_mib {
 	unsigned long	mibs[ICMP6MSG_MIB_MAX];
 };
+/* per device counters, (shared on all cpus) */
+struct icmpv6msg_mib_device {
+	atomic_long_t	mibs[ICMP6MSG_MIB_MAX];
+};
 
 
 /* TCP */
@@ -114,6 +124,8 @@ struct linux_xfrm_mib {
  */ 
 #define DEFINE_SNMP_STAT(type, name)	\
 	__typeof__(type) __percpu *name[2]
+#define DEFINE_SNMP_STAT_ATOMIC(type, name)	\
+	__typeof__(type) *name
 #define DECLARE_SNMP_STAT(type, name)	\
 	extern __typeof__(type) __percpu *name[2]
 
@@ -124,6 +136,8 @@ struct linux_xfrm_mib {
 			__this_cpu_inc(mib[0]->mibs[field])
 #define SNMP_INC_STATS_USER(mib, field)	\
 			this_cpu_inc(mib[1]->mibs[field])
+#define SNMP_INC_STATS_ATOMIC_LONG(mib, field)	\
+			atomic_long_inc(&mib->mibs[field])
 #define SNMP_INC_STATS(mib, field)	\
 			this_cpu_inc(mib[!in_softirq()]->mibs[field])
 #define SNMP_DEC_STATS(mib, field)	\
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f2f9b2e..3cfbbf3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -289,19 +289,19 @@ static int snmp6_alloc_dev(struct inet6_dev *idev)
 			  sizeof(struct ipstats_mib),
 			  __alignof__(struct ipstats_mib)) < 0)
 		goto err_ip;
-	if (snmp_mib_init((void __percpu **)idev->stats.icmpv6,
-			  sizeof(struct icmpv6_mib),
-			  __alignof__(struct icmpv6_mib)) < 0)
+	idev->stats.icmpv6dev = kzalloc(sizeof(struct icmpv6_mib_device),
+					GFP_KERNEL);
+	if (!idev->stats.icmpv6dev)
 		goto err_icmp;
-	if (snmp_mib_init((void __percpu **)idev->stats.icmpv6msg,
-			  sizeof(struct icmpv6msg_mib),
-			  __alignof__(struct icmpv6msg_mib)) < 0)
+	idev->stats.icmpv6msgdev = kzalloc(sizeof(struct icmpv6msg_mib_device),
+					   GFP_KERNEL);
+	if (!idev->stats.icmpv6msgdev)
 		goto err_icmpmsg;
 
 	return 0;
 
 err_icmpmsg:
-	snmp_mib_free((void __percpu **)idev->stats.icmpv6);
+	kfree(idev->stats.icmpv6dev);
 err_icmp:
 	snmp_mib_free((void __percpu **)idev->stats.ipv6);
 err_ip:
@@ -310,8 +310,8 @@ err_ip:
 
 static void snmp6_free_dev(struct inet6_dev *idev)
 {
-	snmp_mib_free((void __percpu **)idev->stats.icmpv6msg);
-	snmp_mib_free((void __percpu **)idev->stats.icmpv6);
+	kfree(idev->stats.icmpv6msgdev);
+	kfree(idev->stats.icmpv6dev);
 	snmp_mib_free((void __percpu **)idev->stats.ipv6);
 }
 
@@ -3838,7 +3838,7 @@ static inline size_t inet6_if_nlmsg_size(void)
 	       + nla_total_size(inet6_ifla6_size()); /* IFLA_PROTINFO */
 }
 
-static inline void __snmp6_fill_stats(u64 *stats, void __percpu **mib,
+static inline void __snmp6_fill_statsdev(u64 *stats, atomic_long_t *mib,
 				      int items, int bytes)
 {
 	int i;
@@ -3848,7 +3848,7 @@ static inline void __snmp6_fill_stats(u64 *stats, void __percpu **mib,
 	/* Use put_unaligned() because stats may not be aligned for u64. */
 	put_unaligned(items, &stats[0]);
 	for (i = 1; i < items; i++)
-		put_unaligned(snmp_fold_field(mib, i), &stats[i]);
+		put_unaligned(atomic_long_read(&mib[i]), &stats[i]);
 
 	memset(&stats[items], 0, pad);
 }
@@ -3877,7 +3877,7 @@ static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
 				     IPSTATS_MIB_MAX, bytes, offsetof(struct ipstats_mib, syncp));
 		break;
 	case IFLA_INET6_ICMP6STATS:
-		__snmp6_fill_stats(stats, (void __percpu **)idev->stats.icmpv6, ICMP6_MIB_MAX, bytes);
+		__snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, ICMP6_MIB_MAX, bytes);
 		break;
 	}
 }
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 24b3558..18ff5df 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -141,7 +141,11 @@ static const struct snmp_mib snmp6_udplite6_list[] = {
 	SNMP_MIB_SENTINEL
 };
 
-static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **mib)
+/* can be called either with percpu mib (pcpumib != NULL),
+ * or shared one (smib != NULL)
+ */
+static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **pcpumib,
+				     atomic_long_t *smib)
 {
 	char name[32];
 	int i;
@@ -158,14 +162,14 @@ static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **mib)
 		snprintf(name, sizeof(name), "Icmp6%s%s",
 			i & 0x100 ? "Out" : "In", p);
 		seq_printf(seq, "%-32s\t%lu\n", name,
-			snmp_fold_field(mib, i));
+			pcpumib ? snmp_fold_field(pcpumib, i) : atomic_long_read(smib + i));
 	}
 
 	/* print by number (nonzero only) - ICMPMsgStat format */
 	for (i = 0; i < ICMP6MSG_MIB_MAX; i++) {
 		unsigned long val;
 
-		val = snmp_fold_field(mib, i);
+		val = pcpumib ? snmp_fold_field(pcpumib, i) : atomic_long_read(smib + i);
 		if (!val)
 			continue;
 		snprintf(name, sizeof(name), "Icmp6%sType%u",
@@ -174,14 +178,22 @@ static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **mib)
 	}
 }
 
-static void snmp6_seq_show_item(struct seq_file *seq, void __percpu **mib,
+/* can be called either with percpu mib (pcpumib != NULL),
+ * or shared one (smib != NULL)
+ */
+static void snmp6_seq_show_item(struct seq_file *seq, void __percpu **pcpumib,
+				atomic_long_t *smib,
 				const struct snmp_mib *itemlist)
 {
 	int i;
+	unsigned long val;
 
-	for (i = 0; itemlist[i].name; i++)
-		seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name,
-			   snmp_fold_field(mib, itemlist[i].entry));
+	for (i = 0; itemlist[i].name; i++) {
+		val = pcpumib ?
+			snmp_fold_field(pcpumib, itemlist[i].entry) :
+			atomic_long_read(smib + itemlist[i].entry);
+		seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name, val);
+	}
 }
 
 static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu **mib,
@@ -201,13 +213,13 @@ static int snmp6_seq_show(struct seq_file *seq, void *v)
 	snmp6_seq_show_item64(seq, (void __percpu **)net->mib.ipv6_statistics,
 			    snmp6_ipstats_list, offsetof(struct ipstats_mib, syncp));
 	snmp6_seq_show_item(seq, (void __percpu **)net->mib.icmpv6_statistics,
-			    snmp6_icmp6_list);
+			    NULL, snmp6_icmp6_list);
 	snmp6_seq_show_icmpv6msg(seq,
-			    (void __percpu **)net->mib.icmpv6msg_statistics);
+			    (void __percpu **)net->mib.icmpv6msg_statistics, NULL);
 	snmp6_seq_show_item(seq, (void __percpu **)net->mib.udp_stats_in6,
-			    snmp6_udp6_list);
+			    NULL, snmp6_udp6_list);
 	snmp6_seq_show_item(seq, (void __percpu **)net->mib.udplite_stats_in6,
-			    snmp6_udplite6_list);
+			    NULL, snmp6_udplite6_list);
 	return 0;
 }
 
@@ -229,11 +241,11 @@ static int snmp6_dev_seq_show(struct seq_file *seq, void *v)
 	struct inet6_dev *idev = (struct inet6_dev *)seq->private;
 
 	seq_printf(seq, "%-32s\t%u\n", "ifIndex", idev->dev->ifindex);
-	snmp6_seq_show_item(seq, (void __percpu **)idev->stats.ipv6,
+	snmp6_seq_show_item(seq, (void __percpu **)idev->stats.ipv6, NULL,
 			    snmp6_ipstats_list);
-	snmp6_seq_show_item(seq, (void __percpu **)idev->stats.icmpv6,
+	snmp6_seq_show_item(seq, NULL, idev->stats.icmpv6dev->mibs,
 			    snmp6_icmp6_list);
-	snmp6_seq_show_icmpv6msg(seq, (void __percpu **)idev->stats.icmpv6msg);
+	snmp6_seq_show_icmpv6msg(seq, NULL, idev->stats.icmpv6msgdev->mibs);
 	return 0;
 }
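
For readers unfamiliar with the two read paths used in proc.c above, here is a
simplified userspace sketch, not the kernel implementation. It assumes a
fixed DEMO_NR_CPUS, ignores the USER/BH doubling, and uses C11 atomic_long in
place of atomic_long_t; all demo_* names are invented for the example. A
percpu counter has to be folded (summed over every CPU's copy), while a shared
counter is read directly.

```c
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4	/* illustrative number of possible CPUs */
#define DEMO_MIB_MAX 8	/* illustrative number of counters */

/* Stand-in for a percpu MIB: one copy of every counter per CPU. */
struct demo_pcpu_mib {
	unsigned long mibs[DEMO_NR_CPUS][DEMO_MIB_MAX];
};

/* Fold one counter: sum its value over all per-CPU copies. */
static unsigned long demo_fold_field(const struct demo_pcpu_mib *mib, int field)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += mib->mibs[cpu][field];
	return sum;
}

/*
 * Called either with a percpu mib (pcpumib != NULL)
 * or with a shared one (smib != NULL), like the patched helpers.
 */
static unsigned long demo_read_counter(const struct demo_pcpu_mib *pcpumib,
				       atomic_long *smib, int field)
{
	return pcpumib ? demo_fold_field(pcpumib, field)
		       : (unsigned long)atomic_load(smib + field);
}
```

The percpu layout needs DEMO_NR_CPUS times the storage of the shared one,
which is the space the patch recovers for the per-device ICMP counters.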
 




* Re: [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes
  2011-05-19 11:14     ` [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes Eric Dumazet
@ 2011-05-19 11:26       ` Denys Fedoryshchenko
  2011-05-19 20:19       ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: Denys Fedoryshchenko @ 2011-05-19 11:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

 On Thu, 19 May 2011 13:14:23 +0200, Eric Dumazet wrote:
> On Thursday 19 May 2011 at 08:55 +0200, Eric Dumazet wrote:
>
>> Looking at snmp6_alloc_dev(), we allocate three MIBs per device:
>>
>> ipstats_mib  (30 * sizeof(u64) * number_of_possible_cpus)
>> icmpv6_mib    (4 * sizeof(long) * number_of_possible_cpus)
>> icmpv6msg_mib  (26 * sizeof(long))
>>
>
> Oops, I forgot that MIBs are doubled (one set for USER, one set for BH).
>
> And:
> #define __ICMP6MSG_MIB_MAX 512
>
> So icmpv6msg_mib is really 512*sizeof(long)*number_of_possible_cpus*2
>
> That is 32 kbytes per device on an 8-CPU machine with a 32-bit kernel.
>
> Plus all the other MIBs... yes, that's way too big for seldom-used stuff.
>
> Here is the patch I cooked and tested on my machine:
>
> [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes.

 I'll test it tonight, thanks a lot :-)
 I guess it will also help people with lots of interfaces
 (virtualisation?), not only ppp.



* Re: [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes
  2011-05-19 11:14     ` [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes Eric Dumazet
  2011-05-19 11:26       ` Denys Fedoryshchenko
@ 2011-05-19 20:19       ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2011-05-19 20:19 UTC (permalink / raw)
  To: eric.dumazet; +Cc: denys, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 19 May 2011 13:14:23 +0200

> [PATCH net-next-2.6] ipv6: reduce per device ICMP mib sizes.
> 
> ipv6 has per device ICMP SNMP counters, taking too much space because
> they use percpu storage.
> 
> needed size per device is : 
> (512+4)*sizeof(long)*number_of_possible_cpus*2 
> 
> On a 32bit kernel, 16 possible cpus, this wastes more than 64kbytes of
> memory per ipv6 enabled network device, taken in vmalloc pool.
> 
> Since ICMP messages are rare, just use shared counters (atomic_long_t)
> 
> Per network space ICMP counters are still using percpu memory, we might
> also convert them to shared counters in a future patch.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.
