linux-kernel.vger.kernel.org archive mirror
* 2.6.9 - e1000 - page allocation failed
@ 2004-10-21 22:16 Lukas Hejtmanek
  2004-10-21 22:58 ` Francois Romieu
  0 siblings, 1 reply; 11+ messages in thread
From: Lukas Hejtmanek @ 2004-10-21 22:16 UTC (permalink / raw)
  To: linux-kernel

Hello,

I'm getting this message in kernel log:

swapper: page allocation failure. order:0, mode:0x20
 [__alloc_pages+441/862] __alloc_pages+0x1b9/0x35e
 [__get_free_pages+37/63] __get_free_pages+0x25/0x3f
 [kmem_getpages+33/201] kmem_getpages+0x21/0xc9
 [cache_grow+171/333] cache_grow+0xab/0x14d
 [cache_alloc_refill+372/537] cache_alloc_refill+0x174/0x219
 [__kmalloc+133/140] __kmalloc+0x85/0x8c
 [alloc_skb+71/224] alloc_skb+0x47/0xe0
 [e1000_alloc_rx_buffers+68/227] e1000_alloc_rx_buffers+0x44/0xe3
 [e1000_clean_rx_irq+398/1095] e1000_clean_rx_irq+0x18e/0x447
 [__kfree_skb+131/253] __kfree_skb+0x83/0xfd
 [e1000_clean+81/202] e1000_clean+0x51/0xca
 [net_rx_action+119/246] net_rx_action+0x77/0xf6
 [__do_softirq+183/198] __do_softirq+0xb7/0xc6
 [do_softirq+45/47] do_softirq+0x2d/0x2f
 [do_IRQ+274/304] do_IRQ+0x112/0x130
 [common_interrupt+24/32] common_interrupt+0x18/0x20
 [default_idle+0/44] default_idle+0x0/0x2c
 [default_idle+41/44] default_idle+0x29/0x2c
 [cpu_idle+63/88] cpu_idle+0x3f/0x58
 [start_kernel+361/388] start_kernel+0x169/0x184
 [unknown_bootoption+0/348] unknown_bootoption+0x0/0x15c

That machine has 300MB of free memory out of 1GB total, and 4GB of free swap.
So what is wrong?

-- 
Lukáš Hejtmánek


* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-21 22:16 2.6.9 - e1000 - page allocation failed Lukas Hejtmanek
@ 2004-10-21 22:58 ` Francois Romieu
  2004-10-22  9:51   ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Francois Romieu @ 2004-10-21 22:58 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: linux-kernel

Lukas Hejtmanek <xhejtman@mail.muni.cz> :
[page allocation failure with e1000]

If you are using TSO, try the patch below by Herbert Xu (available
from http://marc.theaimsgroup.com/?l=linux-netdev&m=109799935603132&w=3)

--- 1.67/net/ipv4/tcp_output.c	2004-10-01 13:56:45 +10:00
+++ edited/net/ipv4/tcp_output.c	2004-10-17 18:58:47 +10:00
@@ -455,8 +455,12 @@
 {
 	struct tcp_opt *tp = tcp_sk(sk);
 	struct sk_buff *buff;
-	int nsize = skb->len - len;
+	int nsize;
 	u16 flags;
+
+	nsize = skb_headlen(skb) - len;
+	if (nsize < 0)
+		nsize = 0;
 
 	if (skb_cloned(skb) &&
 	    skb_is_nonlinear(skb) &&



* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-21 22:58 ` Francois Romieu
@ 2004-10-22  9:51   ` Andrew Morton
  2004-10-22 12:08     ` Lukas Hejtmanek
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Andrew Morton @ 2004-10-22  9:51 UTC (permalink / raw)
  To: Francois Romieu; +Cc: xhejtman, linux-kernel

Francois Romieu <romieu@fr.zoreil.com> wrote:
>
> Lukas Hejtmanek <xhejtman@mail.muni.cz> :
> [page allocation failure with e1000]
> 
> If you are using TSO, try patch below by Herbert Xu (available
> from http://marc.theaimsgroup.com/?l=linux-netdev&m=109799935603132&w=3)
> 
> --- 1.67/net/ipv4/tcp_output.c	2004-10-01 13:56:45 +10:00
> +++ edited/net/ipv4/tcp_output.c	2004-10-17 18:58:47 +10:00
> @@ -455,8 +455,12 @@
>  {
>  	struct tcp_opt *tp = tcp_sk(sk);
>  	struct sk_buff *buff;
> -	int nsize = skb->len - len;
> +	int nsize;
>  	u16 flags;
> +
> +	nsize = skb_headlen(skb) - len;
> +	if (nsize < 0)
> +		nsize = 0;
>  
>  	if (skb_cloned(skb) &&
>  	    skb_is_nonlinear(skb) &&

I'd be interested in knowing if this fixes it - I don't expect it will,
because that's a zero-order allocation failure.  He's really out of memory.
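
For readers decoding the "order:0, mode:0x20" line from the original report,
here is a minimal Python sketch. It assumes the 2.6-era flag values from
include/linux/gfp.h (later kernels renumbered them) and a 4096-byte i386 page.
decode_failure and GFP_BITS are illustrative names, not kernel or library APIs.

```python
import re

# Low GFP bits as assumed from 2.6-era include/linux/gfp.h.
# These values are NOT valid for later kernels.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}
PAGE_SIZE = 4096  # i386 page size, assumed

FAILURE_RE = re.compile(
    r"^(?P<task>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def decode_failure(line):
    """Split a 'page allocation failure' log line into its parts."""
    m = FAILURE_RE.match(line.strip())
    if m is None:
        raise ValueError("not a page allocation failure line")
    order = int(m.group("order"))
    mode = int(m.group("mode"), 16)
    return {
        "task": m.group("task"),
        "order": order,
        # An order-n request asks for 2**n physically contiguous pages.
        "bytes": PAGE_SIZE << order,
        "flags": [name for bit, name in sorted(GFP_BITS.items()) if mode & bit],
        # Without __GFP_WAIT the allocator cannot sleep to reclaim memory.
        "may_sleep": bool(mode & 0x10),
    }


report = decode_failure("swapper: page allocation failure. order:0, mode:0x20")
print(report)
```

Under those assumptions the reported line is a single-page request with only
__GFP_HIGH set, i.e. an atomic allocation that cannot wait for reclaim, which
is consistent with the softirq context shown in the trace.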

The e1000 driver has a default rx ring size of 256 which seems a bit nutty:
a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
page allocator pools.
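
A rough back-of-the-envelope sketch of that burst, in Python. It assumes each
rx buffer ends up costing one 4096-byte page; the thread does not state the
real per-skb footprint, so pages_per_buffer is a guess to adjust, and the
reserve figure passed to burst_fits is hypothetical.

```python
PAGE_SIZE_KIB = 4  # i386 page size in KiB, assumed


def ring_refill_kib(ring_size=256, pages_per_buffer=1):
    """Worst-case memory needed to refill every rx descriptor at once."""
    return ring_size * pages_per_buffer * PAGE_SIZE_KIB


def burst_fits(reserve_kib, ring_size=256, pages_per_buffer=1):
    """True if a full refill burst fits inside a reserve of reserve_kib."""
    return ring_refill_kib(ring_size, pages_per_buffer) <= reserve_kib


print(ring_refill_kib())          # 256 buffers * 1 page * 4 KiB = 1024
print(burst_fits(reserve_kib=512))
```

With one page per buffer, a full refill needs 1024 KiB of pages that are
already free, because an atomic allocation cannot wait for reclaim.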

Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.


* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-22  9:51   ` Andrew Morton
@ 2004-10-22 12:08     ` Lukas Hejtmanek
  2004-10-22 12:45       ` Nick Piggin
  2004-11-03 13:22     ` Lukas Hejtmanek
  2004-11-03 14:10     ` Lukas Hejtmanek
  2 siblings, 1 reply; 11+ messages in thread
From: Lukas Hejtmanek @ 2004-10-22 12:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Francois Romieu, linux-kernel

On Fri, Oct 22, 2004 at 02:51:58AM -0700, Andrew Morton wrote:
> I'd be interested in knowing if this fixes it - I don't expect it will,
> because that's a zero-order allocation failure.  He's really out of memory.
> 
> The e1000 driver has a default rx ring size of 256 which seems a bit nutty:
> a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
> page allocator pools.
> 
> Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.

It did not help.

However, I have tweaked the network stack a little:
/sbin/sysctl -w net/core/rmem_max=8388608
/sbin/sysctl -w net/core/wmem_max=8388608
/sbin/sysctl -w net/core/rmem_default=1048576
/sbin/sysctl -w net/core/wmem_default=1048576
/sbin/sysctl -w net/ipv4/tcp_window_scaling=1
/sbin/sysctl -w net/ipv4/tcp_rmem="4096 1048576 8388608"
/sbin/sysctl -w net/ipv4/tcp_wmem="4096 1048576 8388608"
/sbin/ifconfig eth0 txqueuelen 1000

This is /proc/meminfo:
cat /proc/meminfo
MemTotal:      1035116 kB
MemFree:        447028 kB
Buffers:             0 kB
Cached:         522444 kB
SwapCached:          0 kB
Active:          98092 kB
Inactive:       460408 kB
HighTotal:      131008 kB
HighFree:          308 kB
LowTotal:       904108 kB
LowFree:        446720 kB
SwapTotal:     4008208 kB
SwapFree:      4008204 kB
Dirty:               0 kB
Writeback:           0 kB
Mapped:          45804 kB
Slab:            21112 kB
Committed_AS:   128148 kB
PageTables:       1528 kB
VmallocTotal:   114680 kB
VmallocUsed:      2964 kB
VmallocChunk:   111700 kB

Note that there is 400MB of _free_ memory.
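
To see where that free memory sits, the figures above can be split into low
and high memory with a short Python sketch. parse_meminfo is an illustrative
helper, and SAMPLE copies only the relevant lines from the output above. It
is a snapshot, so it may not reflect the state at the moment of the failure.

```python
def parse_meminfo(text):
    """Return {field: value_in_kB} for 'Name:   123 kB' style lines."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


SAMPLE = """\
MemTotal:      1035116 kB
MemFree:        447028 kB
HighTotal:      131008 kB
HighFree:          308 kB
LowTotal:       904108 kB
LowFree:        446720 kB
"""

info = parse_meminfo(SAMPLE)
low_share = info["LowFree"] / info["MemFree"]
print("LowFree %d kB, HighFree %d kB, low share %.4f"
      % (info["LowFree"], info["HighFree"], low_share))
```

In this snapshot more than 99% of the free memory is low memory
(LowFree + HighFree equals MemFree), so the free pages are not stuck in
highmem.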

This is from /proc/slabinfo (I hope it is relevant):

size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-65536             1      1  65536    1   16 : tunables    8    4    0 : slabdata      1      1      0
size-32768(DMA)        0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-32768            16     16  32768    1    8 : tunables    8    4    0 : slabdata     16     16      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-16384             2      2  16384    1    4 : tunables    8    4    0 : slabdata      2      2      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : slabdata      0      0      0
size-8192            120    120   8192    1    2 : tunables    8    4    0 : slabdata    120    120      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    8 : slabdata      0      0      0
size-4096            300    300   4096    1    1 : tunables   24   12    8 : slabdata    300    300      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    8 : slabdata      0      0      0
size-2048            106    106   2048    2    1 : tunables   24   12    8 : slabdata     53     53      0
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    8 : slabdata      0      0      0
size-1024           1052   1052   1024    4    1 : tunables   54   27    8 : slabdata    263    263      0
size-512(DMA)          0      0    512    8    1 : tunables   54   27    8 : slabdata      0      0      0
size-512             294   1240    512    8    1 : tunables   54   27    8 : slabdata    155    155      0
size-256(DMA)          0      0    256   15    1 : tunables  120   60    8 : slabdata      0      0      0
size-256             211   1170    256   15    1 : tunables  120   60    8 : slabdata     78     78      0
size-128(DMA)          0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
size-128            1688   2728    128   31    1 : tunables  120   60    8 : slabdata     88     88      0
size-64(DMA)           0      0     64   61    1 : tunables  120   60    8 : slabdata      0      0      0
size-64              610   2318     64   61    1 : tunables  120   60    8 : slabdata     38     38      0
size-32(DMA)           0      0     32  119    1 : tunables  120   60    8 : slabdata      0      0      0
size-32             1381   4641     32  119    1 : tunables  120   60    8 : slabdata     39     39     15
kmem_cache           165    165    256   15    1 : tunables  120   60    8 : slabdata     11     11      0
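
As a reading aid for the slabinfo rows, here is a minimal Python sketch that
turns one row into the memory its slabs occupy. It assumes the column layout
shown here (name, active_objs, num_objs, objsize, objperslab, pagesperslab,
then the tunables and slabdata groups) and 4 KiB pages; parse_slab_row and
slab_kib are illustrative names.

```python
PAGE_SIZE_KIB = 4  # i386 page size in KiB, assumed


def parse_slab_row(row):
    """Parse one unwrapped slabinfo row into a dict of the fields used here."""
    head, _tunables, slabdata = [part.split() for part in row.split(":")]
    name, active_objs, num_objs, objsize, _objperslab, pagesperslab = head
    return {
        "name": name,
        "active_objs": int(active_objs),
        "num_objs": int(num_objs),
        "objsize": int(objsize),
        "pagesperslab": int(pagesperslab),
        # slabdata is ['slabdata', active_slabs, num_slabs, sharedavail]
        "num_slabs": int(slabdata[2]),
    }


def slab_kib(row):
    """Memory held by all slabs of this cache, in KiB."""
    r = parse_slab_row(row)
    return r["num_slabs"] * r["pagesperslab"] * PAGE_SIZE_KIB


ROWS = [
    "size-8192  120  120  8192  1  2 : tunables  8  4  0 : slabdata  120  120  0",
    "size-4096  300  300  4096  1  1 : tunables  24  12  8 : slabdata  300  300  0",
]
for row in ROWS:
    print(parse_slab_row(row)["name"], slab_kib(row), "KiB")
```

For the two rows used as samples this gives 960 KiB for size-8192 and
1200 KiB for size-4096, small figures next to the free memory reported by
/proc/meminfo.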

-- 
Lukáš Hejtmánek


* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-22 12:08     ` Lukas Hejtmanek
@ 2004-10-22 12:45       ` Nick Piggin
  2004-12-03 18:37         ` Lukas Hejtmanek
  0 siblings, 1 reply; 11+ messages in thread
From: Nick Piggin @ 2004-10-22 12:45 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Andrew Morton, Francois Romieu, linux-kernel

Lukas Hejtmanek wrote:
> On Fri, Oct 22, 2004 at 02:51:58AM -0700, Andrew Morton wrote:
> 
>>I'd be interested in knowing if this fixes it - I don't expect it will,
>>because that's a zero-order allocation failure.  He's really out of memory.
>>
>>The e1000 driver has a default rx ring size of 256 which seems a bit nutty:
>>a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
>>page allocator pools.
>>
>>Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.
> 
> 
> It did not help.
> 

What did you increase it to? What was the allocation failure message?


* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-22  9:51   ` Andrew Morton
  2004-10-22 12:08     ` Lukas Hejtmanek
@ 2004-11-03 13:22     ` Lukas Hejtmanek
  2004-11-03 14:10     ` Lukas Hejtmanek
  2 siblings, 0 replies; 11+ messages in thread
From: Lukas Hejtmanek @ 2004-11-03 13:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Francois Romieu, linux-kernel

On Fri, Oct 22, 2004 at 02:51:58AM -0700, Andrew Morton wrote:
> I'd be interested in knowing if this fixes it - I don't expect it will,
> because that's a zero-order allocation failure.  He's really out of memory.
> 
> The e1000 driver has a default rx ring size of 256 which seems a bit nutty:
> a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
> page allocator pools.
> 
> Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.

Well, is there some way to solve it? I have lots of free memory (300MB)
and it still complains about not having enough free memory.

I'm using this:
/sbin/sysctl -w net/core/rmem_max=8388608
/sbin/sysctl -w net/core/wmem_max=8388608
/sbin/sysctl -w net/core/rmem_default=1048576
/sbin/sysctl -w net/core/wmem_default=1048576
/sbin/sysctl -w net/ipv4/tcp_window_scaling=1
/sbin/sysctl -w net/ipv4/tcp_rmem="4096 1048576 8388608"
/sbin/sysctl -w net/ipv4/tcp_wmem="4096 1048576 8388608"
/sbin/ifconfig eth0 txqueuelen 1000

-- 
Lukáš Hejtmánek


* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-22  9:51   ` Andrew Morton
  2004-10-22 12:08     ` Lukas Hejtmanek
  2004-11-03 13:22     ` Lukas Hejtmanek
@ 2004-11-03 14:10     ` Lukas Hejtmanek
  2 siblings, 0 replies; 11+ messages in thread
From: Lukas Hejtmanek @ 2004-11-03 14:10 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Francois Romieu, linux-kernel

On Fri, Oct 22, 2004 at 02:51:58AM -0700, Andrew Morton wrote:
> I'd be interested in knowing if this fixes it - I don't expect it will,
> because that's a zero-order allocation failure.  He's really out of memory.
> 
> The e1000 driver has a default rx ring size of 256 which seems a bit nutty:
> a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
> page allocator pools.
> 
> Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.

However, I have two very similar machines; one oopses and the other does not.
I wondered why. I've found out that if I build all of netfilter as modules and
disable:
CONFIG_NET_IPGRE=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y

then the problem seems to disappear on both...

(I do not use iptables on either machine.)

-- 
Lukáš Hejtmánek


* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-22 12:45       ` Nick Piggin
@ 2004-12-03 18:37         ` Lukas Hejtmanek
  0 siblings, 0 replies; 11+ messages in thread
From: Lukas Hejtmanek @ 2004-12-03 18:37 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, Francois Romieu, linux-kernel

On Fri, Oct 22, 2004 at 10:45:10PM +1000, Nick Piggin wrote:
> >>a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
> >>page allocator pools.
> >>
> >>Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.
> >
> >
> >It did not help.
> >
> 
> What did you increase it to? What was the allocation failure message?

Sorry for the late answer, I missed this mail. I increased it to 10MB and the
failure message was the same...

-- 
Lukáš Hejtmánek


* Re: 2.6.9 - e1000 - page allocation failed
  2004-10-22 10:55 Piszcz, Justin Michael
@ 2004-10-22 19:35 ` Francois Romieu
  0 siblings, 0 replies; 11+ messages in thread
From: Francois Romieu @ 2004-10-22 19:35 UTC (permalink / raw)
  To: Piszcz, Justin Michael; +Cc: Andrew Morton, xhejtman, linux-kernel

Piszcz, Justin Michael <justin.piszcz@mitretek.org> :
[...]
> I found this in regards to TSO:
> http://www.kerneltrap.org/node.php?id=397
> 
> Which option enables TSO?

It is included in the IP stack and depends on the capabilities of your hardware.
'ethtool -K ethX tso on' should activate it. See 'man 8 ethtool' for details.
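
To check the current state rather than set it, the output of
'ethtool -k ethX' can be scanned for the TSO line. A minimal Python sketch
follows; tso_enabled is an illustrative helper, and SAMPLE is made-up text in
the style of ethtool output, not captured from any machine in this thread.
The exact label differs between ethtool versions, so both the spaced and the
hyphenated spelling are accepted.

```python
def tso_enabled(ethtool_k_output):
    """Return True/False for the TSO line, or None if it is not present."""
    for line in ethtool_k_output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # Accept "tcp segmentation offload" and "tcp-segmentation-offload".
        if key.strip().lower().replace("-", " ") != "tcp segmentation offload":
            continue
        words = value.split()
        if not words:
            return None
        return words[0] == "on"
    return None


SAMPLE = """\
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
"""

print(tso_enabled(SAMPLE))
```

A None result means the line was not found, which is worth treating
separately from "off" since the hardware or driver may not offer TSO at all.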

This question would probably be more appropriate on netdev@oss.sgi.com.

--
Ueimor


* RE: 2.6.9 - e1000 - page allocation failed
@ 2004-10-22 11:16 Piszcz, Justin Michael
  0 siblings, 0 replies; 11+ messages in thread
From: Piszcz, Justin Michael @ 2004-10-22 11:16 UTC (permalink / raw)
  To: Andrew Morton, Francois Romieu; +Cc: xhejtman, linux-kernel

# patch -p1 < ../e1000.patch
patching file net/ipv4/tcp_output.c
# lilo
Added 2.6.9-2 *
# 

I am copying files on the NIC @ 24-28MB/s (normal) over NFS (16GB), no
problems yet.

I will let you know if I get any more page allocation failures.

Also, on the topic of page allocation failures: if I increase the MTU to
9000, I always get page allocation failures on the Optiplex GX1 box, but
not on a P4 box (I wanted to see what kind of speeds could be achieved
using a 9000-byte MTU vs 1500).
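
One possible reason the larger MTU behaves differently is the allocation
order. The Python sketch below assumes a receive buffer is rounded up to the
next power-of-two kmalloc size and that pages are 4096 bytes; it ignores any
per-skb overhead, and the buffer size the driver really requests for a
9000-byte MTU is not given in this thread. kmalloc_bucket and page_order are
illustrative names.

```python
PAGE_SIZE = 4096  # i386 page size, assumed


def kmalloc_bucket(nbytes, smallest=32):
    """Smallest power-of-two size class that holds nbytes."""
    size = smallest
    while size < nbytes:
        size *= 2
    return size


def page_order(nbytes):
    """Order of the contiguous page block backing one object of nbytes."""
    pages = -(-kmalloc_bucket(nbytes) // PAGE_SIZE)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


for mtu in (1500, 9000):
    print(mtu, kmalloc_bucket(mtu), page_order(mtu))
```

Under these assumptions a 1500-byte frame fits in a 2048-byte object backed
by a single page (order 0), while a 9000-byte frame needs a 16384-byte
object, i.e. four physically contiguous pages (order 2), which is harder to
satisfy atomically once memory is fragmented.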

So far, no problems; I will reply if the errors recur.


-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Andrew Morton
Sent: Friday, October 22, 2004 5:52 AM
To: Francois Romieu
Cc: xhejtman@mail.muni.cz; linux-kernel@vger.kernel.org
Subject: Re: 2.6.9 - e1000 - page allocation failed

Francois Romieu <romieu@fr.zoreil.com> wrote:
>
> Lukas Hejtmanek <xhejtman@mail.muni.cz> :
> [page allocation failure with e1000]
> 
> If you are using TSO, try patch below by Herbert Xu (available
> from http://marc.theaimsgroup.com/?l=linux-netdev&m=109799935603132&w=3)
> 
> --- 1.67/net/ipv4/tcp_output.c	2004-10-01 13:56:45 +10:00
> +++ edited/net/ipv4/tcp_output.c	2004-10-17 18:58:47 +10:00
> @@ -455,8 +455,12 @@
>  {
>  	struct tcp_opt *tp = tcp_sk(sk);
>  	struct sk_buff *buff;
> -	int nsize = skb->len - len;
> +	int nsize;
>  	u16 flags;
> +
> +	nsize = skb_headlen(skb) - len;
> +	if (nsize < 0)
> +		nsize = 0;
>  
>  	if (skb_cloned(skb) &&
>  	    skb_is_nonlinear(skb) &&

I'd be interested in knowing if this fixes it - I don't expect it will,
because that's a zero-order allocation failure.  He's really out of memory.

The e1000 driver has a default rx ring size of 256 which seems a bit nutty:
a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
page allocator pools.

Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


* RE: 2.6.9 - e1000 - page allocation failed
@ 2004-10-22 10:55 Piszcz, Justin Michael
  2004-10-22 19:35 ` Francois Romieu
  0 siblings, 1 reply; 11+ messages in thread
From: Piszcz, Justin Michael @ 2004-10-22 10:55 UTC (permalink / raw)
  To: Andrew Morton, Francois Romieu; +Cc: xhejtman, linux-kernel

Question regarding TSO:

I found this in regards to TSO:
http://www.kerneltrap.org/node.php?id=397

Which option enables TSO?

$ grep -i tso .config
$ grep -i tcp .config
CONFIG_NFSD_TCP=y

I get the page allocation failures roughly 1-10 minutes after rebooting into 2.6.9.
Machine = Dell Optiplex GX1
RAM = 768MB (ECC)
SWAP = 2048MB

So try the patch and increasing /proc/sys/vm/min_free_kbytes? I will
give this a shot and report back.

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Andrew Morton
Sent: Friday, October 22, 2004 5:52 AM
To: Francois Romieu
Cc: xhejtman@mail.muni.cz; linux-kernel@vger.kernel.org
Subject: Re: 2.6.9 - e1000 - page allocation failed

Francois Romieu <romieu@fr.zoreil.com> wrote:
>
> Lukas Hejtmanek <xhejtman@mail.muni.cz> :
> [page allocation failure with e1000]
> 
> If you are using TSO, try patch below by Herbert Xu (available
> from http://marc.theaimsgroup.com/?l=linux-netdev&m=109799935603132&w=3)
> 
> --- 1.67/net/ipv4/tcp_output.c	2004-10-01 13:56:45 +10:00
> +++ edited/net/ipv4/tcp_output.c	2004-10-17 18:58:47 +10:00
> @@ -455,8 +455,12 @@
>  {
>  	struct tcp_opt *tp = tcp_sk(sk);
>  	struct sk_buff *buff;
> -	int nsize = skb->len - len;
> +	int nsize;
>  	u16 flags;
> +
> +	nsize = skb_headlen(skb) - len;
> +	if (nsize < 0)
> +		nsize = 0;
>  
>  	if (skb_cloned(skb) &&
>  	    skb_is_nonlinear(skb) &&

I'd be interested in knowing if this fixes it - I don't expect it will,
because that's a zero-order allocation failure.  He's really out of memory.

The e1000 driver has a default rx ring size of 256 which seems a bit nutty:
a back-to-back GFP_ATOMIC allocation of 256 skbs could easily exhaust the
page allocator pools.

Probably this machine needs to increase /proc/sys/vm/min_free_kbytes.
