linux-kernel.vger.kernel.org archive mirror
* __alloc_pages: 0-order allocation failed.
@ 2001-09-04 13:11 Martin MOKREJŠ
  2001-09-04 16:12 ` Daniel Phillips
  0 siblings, 1 reply; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-04 13:11 UTC (permalink / raw)
  To: linux-kernel

Hi,
  I'm getting the above error on 2.4.9 kernel with kernel HIGHMEM option
enabled to 2GB, 2x Intel PentiumIII. The machine has 1GB RAM
physically. Although I've found many reports on the linux-kernel list from the
past months, I found no real solution. Maybe only:
http://www.alsa-project.org/archive/alsa-devel/msg08629.html

  I hope it's not related to memory chunks allocated twice, so I think
it's another problem in 2.4.9, right?

Linux version 2.4.9 (user@host) (gcc version 2.95.2 20000220 (Debian GNU/Linux)) #4 SMP Thu Aug 30 15:10:26 CEST 2001
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
 BIOS-e820: 000000000009f400 - 000000000009f800 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 0000000040000000 (usable)
 BIOS-e820: 00000000fec00000 - 00000000fec02000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
128MB HIGHMEM available.
Scan SMP from c0000000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f0000 for 65536 bytes.
found SMP MP-table at 000ff780
hm, page 000ff000 reserved twice.
hm, page 00100000 reserved twice.
hm, page 000f0000 reserved twice.
hm, page 000f1000 reserved twice.
On node 0 totalpages: 262144
zone(0): 4096 pages.
zone(1): 225280 pages.
zone(2): 32768 pages.

shell$ free
             total       used       free     shared    buffers     cached
Mem:       1028480     992840      35640          0      20832     821524
-/+ buffers/cache:     150484     877996
Swap:      2097136     100868    1996268


  The machine is running apache 1.3.20 and mysql-3.23.41 only, and is
not loaded yet. :( Any ideas? Thanks.
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed.
  2001-09-04 13:11 __alloc_pages: 0-order allocation failed Martin MOKREJŠ
@ 2001-09-04 16:12 ` Daniel Phillips
  2001-09-07 12:53   ` Martin MOKREJŠ
  2001-09-07 13:06   ` Martin MOKREJŠ
  0 siblings, 2 replies; 59+ messages in thread
From: Daniel Phillips @ 2001-09-04 16:12 UTC (permalink / raw)
  To: Martin MOKREJŠ, linux-kernel

On September 4, 2001 03:11 pm, Martin MOKREJŠ wrote:
> Hi,
>   I'm getting the above error on 2.4.9 kernel with kernel HIGHMEM option
> enabled to 2GB, 2x Intel PentiumIII. The machine has 1GB RAM
> physically. Although I've found many reports on the linux-kernel list from the
> past months, I found no real solution. Maybe only:
> http://www.alsa-project.org/archive/alsa-devel/msg08629.html

Try 2.4.10-pre4.

>   I hope it's not related to memory chunks allocated twice,

It's not

> so I think it's another problem in 2.4.9, right?

Yep.  Most probably bounce buffers, patch by Marcelo already in Linus's
tree.

--
Daniel


* Re: __alloc_pages: 0-order allocation failed.
  2001-09-04 16:12 ` Daniel Phillips
@ 2001-09-07 12:53   ` Martin MOKREJŠ
  2001-09-07 13:06   ` Martin MOKREJŠ
  1 sibling, 0 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-07 12:53 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Tue, 4 Sep 2001, Daniel Phillips wrote:

Hi,

> On September 4, 2001 03:11 pm, Martin MOKREJŠ wrote:
> > Hi,
> >   I'm getting the above error on 2.4.9 kernel with kernel HIGHMEM option
> > enabled to 2GB, 2x Intel PentiumIII. The machine has 1GB RAM
> > physically. Although I've found many reports on the linux-kernel list from the
> > past months, I found no real solution. Maybe only:
> > http://www.alsa-project.org/archive/alsa-devel/msg08629.html
> 
> Try 2.4.10-pre4.

Hmm, so after a day of running we got it again:
__alloc_pages: 0-order allocation failed (gfp=0x70/1).

> > so I think it's another problem in 2.4.9, right?
> 
> Yep.  Most probably bounce buffers, patch by Marcelo already in Linus's
> tree.

So it did not fix it? But the output now has an extra "(gfp=0x70/1)" string
appended.

Any ideas?
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



* Re: __alloc_pages: 0-order allocation failed.
  2001-09-04 16:12 ` Daniel Phillips
  2001-09-07 12:53   ` Martin MOKREJŠ
@ 2001-09-07 13:06   ` Martin MOKREJŠ
  2001-09-07 20:43     ` Daniel Phillips
  2001-09-07 21:00     ` Daniel Phillips
  1 sibling, 2 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-07 13:06 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Tue, 4 Sep 2001, Daniel Phillips wrote:

> On September 4, 2001 03:11 pm, Martin MOKREJŠ wrote:
> > Hi,
> >   I'm getting the above error on 2.4.9 kernel with kernel HIGHMEM option
> > enabled to 2GB, 2x Intel PentiumIII. The machine has 1GB RAM
> > physically. Although I've found many reports on the linux-kernel list from the
> > past months, I found no real solution. Maybe only:
> > http://www.alsa-project.org/archive/alsa-devel/msg08629.html
> 
> Try 2.4.10-pre4.


Wow, I've just now realized that I get two types of error message:
__alloc_pages: 0-order allocatiocation failed (gfp=0x70/1).
__alloc_pages: 0-order allocation failed (gfp=0x70/1).

We are using LVM and ReiserFS, HIGHMEM kernel.

Maybe it helps to track it down. Any ideas?
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



* Re: __alloc_pages: 0-order allocation failed.
  2001-09-07 13:06   ` Martin MOKREJŠ
@ 2001-09-07 20:43     ` Daniel Phillips
  2001-09-07 21:00     ` Daniel Phillips
  1 sibling, 0 replies; 59+ messages in thread
From: Daniel Phillips @ 2001-09-07 20:43 UTC (permalink / raw)
  To: Martin MOKREJŠ; +Cc: linux-kernel

On September 7, 2001 03:06 pm, Martin MOKREJŠ wrote:
> On Tue, 4 Sep 2001, Daniel Phillips wrote:
> 
> > On September 4, 2001 03:11 pm, Martin MOKREJŠ wrote:
> > > Hi,
> > >   I'm getting the above error on 2.4.9 kernel with kernel HIGHMEM option
> > > enabled to 2GB, 2x Intel PentiumIII. The machine has 1GB RAM
> > > physically. Although I've found many reports on the linux-kernel list from the
> > > past months, I found no real solution. Maybe only:
> > > http://www.alsa-project.org/archive/alsa-devel/msg08629.html
> > 
> > Try 2.4.10-pre4.
> 
> 
> Wow, I've just now realized that I get two types of error message:
> __alloc_pages: 0-order allocatiocation failed (gfp=0x70/1).
> __alloc_pages: 0-order allocation failed (gfp=0x70/1).
> 
> We are using LVM and ReiserFS, HIGHMEM kernel.
> 
> Maybe it helps to track it down. Any ideas?

printk has a limited amount of space for buffering messages, a ring buffer
(sort of), and will start dropping text when the buffer fills up, so as not
to slow the kernel down and/or interfere with interrupts.  That is why
two lines of output got combined above; they are all the same message.
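
To make the dropping behaviour concrete, here is a minimal user-space sketch
in C. It is only a toy model under stated assumptions: `toy_log`,
`toy_log_write` and the 64-byte capacity are made-up names and numbers, and
the real printk buffer is organised differently. It shows one way a bounded
buffer that discards overflow can leave a truncated message sitting next to
the following one.

```c
#include <stddef.h>
#include <string.h>

#define TOY_LOG_CAP 64  /* hypothetical capacity, far smaller than a real log buffer */

/* Toy bounded log: bytes that do not fit are silently dropped. */
struct toy_log {
    char   data[TOY_LOG_CAP + 1];  /* +1 for the terminating NUL */
    size_t len;                    /* bytes currently stored */
};

/* Append as much of msg as fits; return the number of bytes dropped. */
size_t toy_log_write(struct toy_log *log, const char *msg)
{
    size_t want = strlen(msg);
    size_t room = TOY_LOG_CAP - log->len;
    size_t take = want < room ? want : room;

    memcpy(log->data + log->len, msg, take);
    log->len += take;
    log->data[log->len] = '\0';
    return want - take;
}
```

Writing the same failure message twice into this buffer stores the first copy
whole and only the beginning of the second, so whatever is logged next would
appear glued to a half-written line.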

The gfp=0x70/1 identifies the failure as GFP_NOIO, PF_MEMALLOC, which, by
process of elimination, comes from alloc_bounce_page.  Marcelo's patch for
bounce buffer allocation is *not* in 2.4.10-pre4, so we haven't proved
anything yet.
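
For anyone collecting these messages from logs, the fields can be pulled
apart with a few lines of C. This is a rough sketch, not kernel code:
`struct alloc_failure` and `parse_alloc_failure` are made-up names, and
treating the digit after the slash as the PF_MEMALLOC state simply follows
the reading given above. The meaning of the individual gfp bits depends on
the kernel version, so the sketch leaves the mask undecoded.

```c
#include <stdio.h>

/* Fields carried by one "__alloc_pages: ... failed" line (hypothetical struct). */
struct alloc_failure {
    unsigned long order;     /* allocation order, 0 means a single page */
    unsigned int  gfp_mask;  /* raw gfp mask as printed, e.g. 0x70 */
    int           memalloc;  /* digit after the slash, read here as PF_MEMALLOC */
};

/* Parse one log line; return 0 on success, -1 if the line does not match. */
int parse_alloc_failure(const char *line, struct alloc_failure *out)
{
    int n = sscanf(line,
                   "__alloc_pages: %lu-order allocation failed (gfp=0x%x/%d)",
                   &out->order, &out->gfp_mask, &out->memalloc);
    return n == 3 ? 0 : -1;
}
```

A line fused by the log buffer, such as the "allocatiocation" one, does not
match the pattern and is rejected, which makes such lines easy to filter out
before counting failures.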

You can get the patch from Marcelo's post on lkml on Aug 22 under the
subject "Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on
7899P)".  Note the correction posted in his next message in the thread.
It applies to 2.4.9.  Please try it and see if these failures go away.

This patch *should* be in the main tree soon.  Some testing by you would
help a lot.

--
Daniel


* Re: __alloc_pages: 0-order allocation failed.
  2001-09-07 13:06   ` Martin MOKREJŠ
  2001-09-07 20:43     ` Daniel Phillips
@ 2001-09-07 21:00     ` Daniel Phillips
  2001-09-12 13:06       ` Martin MOKREJŠ
  1 sibling, 1 reply; 59+ messages in thread
From: Daniel Phillips @ 2001-09-07 21:00 UTC (permalink / raw)
  To: Martin MOKREJŠ; +Cc: linux-kernel

On September 7, 2001 10:43 pm, Daniel Phillips wrote:
> On September 7, 2001 03:06 pm, Martin MOKREJŠ wrote:
> > On Tue, 4 Sep 2001, Daniel Phillips wrote:
> > 
> > > On September 4, 2001 03:11 pm, Martin MOKREJŠ wrote:
> > > > Hi,
> > > >   I'm getting the above error on 2.4.9 kernel with kernel HIGHMEM option
> > > > enabled to 2GB, 2x Intel PentiumIII. The machine has 1GB RAM
> > > > physically. Although I've found many reports on the linux-kernel list from the
> > > > past months, I found no real solution. Maybe only:
> > > > http://www.alsa-project.org/archive/alsa-devel/msg08629.html
> > > 
> > > Try 2.4.10-pre4.
> > 
> > 
> > Wow, I've just now realized that I get two types of error message:
> > __alloc_pages: 0-order allocatiocation failed (gfp=0x70/1).
> > __alloc_pages: 0-order allocation failed (gfp=0x70/1).
> > 
> > We are using LVM and ReiserFS, HIGHMEM kernel.
> > 
> > Maybe it helps to track it down. Any ideas?
> 
> printk has a limited amount of space for buffering messages, a ring buffer 
> (sort of) and will start dropping text when the buffer fills up, so as not
> to slow the kernel down and/or interfere with interrupts.  So that is why
> two lines of output got combined above; they are all the same message.
> 
> The gfp=0x70/1 identifies the failure as GFP_NOIO, PF_MEMALLOC, which, by
> process of elimination, comes from alloc_bounce_page.  Marcelo's patch for
> bounce buffer allocation is *not* in 2.4.10-pre4, so we haven't proved
> anything yet.
> 
> You can get the patch from Marcelo's post on lkml on Aug 22 under the
> subject "Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on
> 7899P)".  Note the correction posted in his next message in the thread.
> It applies to 2.4.9.  Please try it and see if these failures go away.
> 
> This patch *should* be in the main tree soon.  Some testing by you would
> help a lot.

Correction, it's in Linus's tree all right, with some changed names.  So...
conclusion: Marcelo's approach is not airtight.  Or there was an error in
translation.  Arjan has a patch going in soon to the -ac tree, so stay
tuned.


* Re: __alloc_pages: 0-order allocation failed.
  2001-09-07 21:00     ` Daniel Phillips
@ 2001-09-12 13:06       ` Martin MOKREJŠ
  2001-09-19 14:21         ` __alloc_pages: 0-order allocation failed still in -pre12 Martin MOKREJŠ
  0 siblings, 1 reply; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-12 13:06 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Fri, 7 Sep 2001, Daniel Phillips wrote:

> > You can get the patch from Marcelo's post on lkml on Aug 22 under the
> > subject "Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on
> > 7899P)".  Note the correction posted in his next message in the thread.
> > It applies to 2.4.9.  Please try it and see if these failures go away.

Yes, it fixed my problem. I also had to apply a "patch" from someone on
this list, who replied to Daniel, because the original version of Daniel's
patch contained two typos.

> > This patch *should* be in the main tree soon.  Some testing by you would
> > help a lot.

I had a look at -pre7 on Monday, but it did not look like it contains
this patch.

> Correction, it's in Linus's tree all right, with some changed names.  So...
> conclusion: Marcelo's approach is not airtight.  Or there was an error in
> translation.  Arjan has a patch going in soon to the -ac tree, so stay
> tuned.

I don't know what Arjan's patch is, but Marcelo's patch, manually applied to
plain 2.4.9 sources, has been working here for 2 days already.

If you know how to push Marcelo's patch into the -preX version, please do
so. ;-)
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



* __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-12 13:06       ` Martin MOKREJŠ
@ 2001-09-19 14:21         ` Martin MOKREJŠ
  2001-09-19 15:03           ` Martin MOKREJŠ
                             ` (3 more replies)
  0 siblings, 4 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-19 14:21 UTC (permalink / raw)
  To: linux-kernel

Hi,
  I tried 2.4.10-pre12 and ran some big mysql tests (actually
mysql/tests/fork_big.pl ). The load went up and down between 17 and
6 .... and now it's only 1.7 and I see in dmesg:

__alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012e3e2
__alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012e3e2

Filename                        Type            Size    Used    Priority
/dev/sda2                       partition       2097136 41392   -1

The swap usage grew from 11MB to 40MB.

free gives:
             total       used       free     shared    buffers     cached
Mem:       1029776    1007360      22416          0       4548     463936
-/+ buffers/cache:     538876     490900
Swap:      2097136      41392    2055744

The system started to page-out when there were almost no buffers available
and many cached pages. The system started after bootup with cached=18k or
something like that.

/proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  1054490624 880287744 174202880        0  4653056 460627968
Swap: 2147467264 42909696 2104557568
MemTotal:      1029776 kB
MemFree:        170120 kB
MemShared:           0 kB
Buffers:          4544 kB
Cached:         448416 kB
SwapCached:       1416 kB
Active:         377868 kB
Inactive:        76508 kB
HighTotal:      131072 kB
HighFree:         2044 kB
LowTotal:       898704 kB
LowFree:        168076 kB
SwapTotal:     2097136 kB
SwapFree:      2055232 kB


I have to say I had been running with Marcelo's patch for a week without any
"0-order allocation failed" message. Now I see I am back to the old state. ;(

> > You can get the patch from Marcelo's post on lkml on Aug 22 under the
> > subject "Re: With Daniel Phillips Patch (was: aic7xxx with 2.4.9 on
> > 7899P)".  Note the correction posted in his next message in the thread.
> > It applies to 2.4.9.  Please try it and see if these failures go away.

Please Cc: me in reply, if possible.
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany




* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-19 14:21         ` __alloc_pages: 0-order allocation failed still in -pre12 Martin MOKREJŠ
@ 2001-09-19 15:03           ` Martin MOKREJŠ
  2001-09-19 15:16           ` Rik van Riel
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-19 15:03 UTC (permalink / raw)
  To: linux-kernel

Hi,
  one more addition:
I used mysqldump to dump some big database, and here's something weird:

jerboas:/mnt# ls -la
total 372024
drwxr-xr-x    2 root     root           80 Sep 19 16:34 .
drwxr-xr-x   20 root     root         4096 Sep 19 11:32 ..
-rw-r--r--    1 root     root     380946064 Sep 19 16:35 Celegans.sql
jerboas:/mnt# ls -la
ls: Celegans.sql: Value too large for defined data type
total 4
drwxr-xr-x    2 root     root           80 Sep 19 16:34 .
drwxr-xr-x   20 root     root         4096 Sep 19 11:32 ..
[1]-  Done                    /usr/local/mysql/bin/mysqldump -hlocalhost -P3306 -upedant Celegans >Celegans.sql
jerboas:/mnt# ls -la
ls: Celegans.sql: Value too large for defined data type
total 4
drwxr-xr-x    2 root     root           80 Sep 19 16:34 .
drwxr-xr-x   20 root     root         4096 Sep 19 11:32 ..
jerboas:/mnt# 

Running `mc' in this directory says:

File 'Celegans.sql' exists but can not be stat-ed: Value too large for defined data type 

It's on freshly made reiserfs filesystem, if it helps.
Sep 19 16:32:28 jerboas kernel: reiserfs: checking transaction log (device 03:41) ...
Sep 19 16:32:30 jerboas kernel: Using r5 hash to sort names
Sep 19 16:32:30 jerboas kernel: ReiserFS version 3.6.25

The source mysql directory, on reiserfs on a different disk, holds 1693948 kB
(multiple files). In /mnt the dump should be all in one file, also
on reiserfs. The machine has 1GB RAM, SMP kernel, HIGHMEM enabled.
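
"Value too large for defined data type" is the glibc text for EOVERFLOW. One
possible explanation, which is an assumption and not confirmed anywhere in
this report, is that the dump grew past what a 32-bit signed off_t can hold
while `ls' and `mc' were built without large-file support (for example
without -D_FILE_OFFSET_BITS=64). The limit itself is easy to state in C;
`fits_32bit_off_t` is a made-up helper name.

```c
#include <stdint.h>

/*
 * Return 1 if a file of this size can be described by a 32-bit signed
 * off_t (at most 2^31 - 1 bytes), 0 otherwise. A stat() call compiled
 * without large-file support would be expected to fail with EOVERFLOW
 * for sizes beyond that limit.
 */
int fits_32bit_off_t(long long size)
{
    return size >= 0 && size <= INT32_MAX;
}
```

The 380946064-byte size shown in the first listing is still well under the
limit, so under this reading the error would only show up once the dump had
grown beyond 2147483647 bytes.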
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-19 14:21         ` __alloc_pages: 0-order allocation failed still in -pre12 Martin MOKREJŠ
  2001-09-19 15:03           ` Martin MOKREJŠ
@ 2001-09-19 15:16           ` Rik van Riel
  2001-09-19 15:51             ` Martin MOKREJŠ
  2001-09-19 22:34           ` Shane Wegner
  2001-09-19 22:39           ` __alloc_pages: 0-order allocation failed still in -pre12 Andrea Arcangeli
  3 siblings, 1 reply; 59+ messages in thread
From: Rik van Riel @ 2001-09-19 15:16 UTC (permalink / raw)
  To: Martin MOKREJŠ; +Cc: linux-kernel

On Wed, 19 Sep 2001, Martin MOKREJŠ wrote:

>   I tried 2.4.10-pre12

> I have to say I've been using for a week without any "0-order allocation
> failed" patch from Marcelo. Now I see am back to the old stage. ;(

Impossible, the VM code which is in 2.4.10-pre11 and newer
wasn't published until Sunday night, so you can't have been
using it for a week already. ;)

cheers,

Rik
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)



* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-19 15:16           ` Rik van Riel
@ 2001-09-19 15:51             ` Martin MOKREJŠ
  0 siblings, 0 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-19 15:51 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel

On Wed, 19 Sep 2001, Rik van Riel wrote:

> On Wed, 19 Sep 2001, Martin MOKREJŠ wrote:
> 
> >   I tried 2.4.10-pre12
> 
> > I have to say I've been using for a week without any "0-order allocation
> > failed" patch from Marcelo. Now I see am back to the old stage. ;(
> 
> Impossible, the VM code which is in 2.4.10-pre11 and newer
> wasn't published until sunday night, so you can't have been
> using it for a week already. ;)

Sorry, again: I'm currently using plain 2.4.9 patched with -pre12.
I get the allocation errors. I got the image from kernel.dk/testing/ this
morning, as someone posted this address on the list.

My previous kernel was plain 2.4.9 patched with Marcelo's patch, and over a
week I did not receive a single error message like that.

-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-19 14:21         ` __alloc_pages: 0-order allocation failed still in -pre12 Martin MOKREJŠ
  2001-09-19 15:03           ` Martin MOKREJŠ
  2001-09-19 15:16           ` Rik van Riel
@ 2001-09-19 22:34           ` Shane Wegner
  2001-09-19 22:45             ` Andrea Arcangeli
  2001-09-19 22:39           ` __alloc_pages: 0-order allocation failed still in -pre12 Andrea Arcangeli
  3 siblings, 1 reply; 59+ messages in thread
From: Shane Wegner @ 2001-09-19 22:34 UTC (permalink / raw)
  To: Martin MOKREJŠ; +Cc: linux-kernel, Andrea Arcangeli

Hi,

I'm getting the same thing here.  At least it looks similar
though I'm not sure what's causing it.  Dual PIII 850, 1gb
ram, 300mb swap.

__alloc_pages: 0-order allocation failed (gfp=0x20/0) from
c012e052
__alloc_pages: 0-order allocation failed (gfp=0x20/0) from
c012e052
__alloc_pages: 0-order allocation failed (gfp=0x20/0) from
c012e052


On Wed, Sep 19, 2001 at 04:21:43PM +0200, Martin MOKREJŠ wrote:
> Hi,
>   I tried 2.4.10-pre12 and run some mysql big tests (actually
> mysql/tests/fork_big.pl ). And, the load is coming up and down from 17 to
> 6 .... and now, it's 1.7 only and I see in dmesg:
> 
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012e3e2
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012e3e2

-- 
Shane Wegner: shane@cm.nu
              http://www.cm.nu/~shane/
PGP:          1024D/FFE3035D
              A0ED DAC4 77EC D674 5487
              5B5C 4F89 9A4E FFE3 035D


* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-19 14:21         ` __alloc_pages: 0-order allocation failed still in -pre12 Martin MOKREJŠ
                             ` (2 preceding siblings ...)
  2001-09-19 22:34           ` Shane Wegner
@ 2001-09-19 22:39           ` Andrea Arcangeli
  3 siblings, 0 replies; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-19 22:39 UTC (permalink / raw)
  To: Martin MOKREJŠ; +Cc: linux-kernel, Linus Torvalds, Marcelo Tosatti

On Wed, Sep 19, 2001 at 04:21:43PM +0200, Martin MOKREJŠ wrote:
> Hi,
>   I tried 2.4.10-pre12 and run some mysql big tests (actually
> mysql/tests/fork_big.pl ). And, the load is coming up and down from 17 to
> 6 .... and now, it's 1.7 only and I see in dmesg:
> 
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012e3e2
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012e3e2

Ok, I'm pretty certain I got it. I didn't notice it here because it can be
reproduced only with HIGHMEM and I haven't had time to test highmem yet
(btw, highmem emulation would have been enough to reproduce it).

It was really an allocator bug. Totally untested fix appended,
but recommended anyway for integration.

Marcelo, can you also test it in your workload (feel free to use eepro100
too now)?

--- 2.4.10pre11aa1/mm/page_alloc.c.~1~	Tue Sep 18 15:39:50 2001
+++ 2.4.10pre11aa1/mm/page_alloc.c	Thu Sep 20 00:36:11 2001
@@ -369,6 +369,7 @@
 		return NULL;
 	}
 
+ rebalance:
 	page = balance_classzone(classzone, gfp_mask, order, &freed);
 	if (page)
 		return page;
@@ -380,10 +381,13 @@
 			if (!z)
 				break;
 
-			page = rmqueue(z, order);
-			if (page)
-				return page;
+			if (zone_free_pages(z, order) > z->pages_min) {
+				page = rmqueue(z, order);
+				if (page)
+					return page;
+			}
 		}
+		goto rebalance;
 	} else {
 		/* 
 		 * Check that no other task is been killed meanwhile,
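
The guard added in the second hunk can be pictured with a small user-space
model in C. It is a sketch under assumptions, not the kernel code:
`struct toy_zone` and `pick_zone` are made-up names, the free-page count is
a plain field instead of the zone_free_pages() call, and the rebalance retry
is left out.

```c
#include <stddef.h>

/* Toy stand-in for a zone; only the two numbers the guard compares. */
struct toy_zone {
    unsigned long free_pages;  /* pages currently free in the zone */
    unsigned long pages_min;   /* low watermark for the zone */
};

/*
 * Walk the zone list in order and return the index of the first zone
 * whose free pages exceed its pages_min, or -1 if every zone is at or
 * below its watermark. Before the patch the loop tried each zone
 * without this comparison.
 */
int pick_zone(const struct toy_zone *zones, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (zones[i].free_pages > zones[i].pages_min)
            return (int)i;
    }
    return -1;
}
```

In the patch itself the "nothing found" case does not fail straight away:
the added goto sends the allocator back to balance_classzone() to try again.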

Andrea


* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-19 22:34           ` Shane Wegner
@ 2001-09-19 22:45             ` Andrea Arcangeli
  2001-09-20  2:31               ` Shane Wegner
  0 siblings, 1 reply; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-19 22:45 UTC (permalink / raw)
  To: Shane Wegner; +Cc: Martin MOKREJŠ, linux-kernel

On Wed, Sep 19, 2001 at 03:34:41PM -0700, Shane Wegner wrote:
> Hi,
> 
> I'm getting the same thing here.  At least it looks similar
> though I'm not sure what's causing it.  Dual PIII 850, 1gb
							 ^^^ perfect
> ram, 300mb swap.
> 
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> c012e052
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> c012e052
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> c012e052

yes, please try this fix and let me know if it helps:

--- 2.4.10pre11aa1/mm/page_alloc.c.~1~	Tue Sep 18 15:39:50 2001
+++ 2.4.10pre11aa1/mm/page_alloc.c	Thu Sep 20 00:36:11 2001
@@ -369,6 +369,7 @@
 		return NULL;
 	}
 
+ rebalance:
 	page = balance_classzone(classzone, gfp_mask, order, &freed);
 	if (page)
 		return page;
@@ -380,10 +381,13 @@
 			if (!z)
 				break;
 
-			page = rmqueue(z, order);
-			if (page)
-				return page;
+			if (zone_free_pages(z, order) > z->pages_min) {
+				page = rmqueue(z, order);
+				if (page)
+					return page;
+			}
 		}
+		goto rebalance;
 	} else {
 		/* 
 		 * Check that no other task is been killed meanwhile,


Andrea


* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-19 22:45             ` Andrea Arcangeli
@ 2001-09-20  2:31               ` Shane Wegner
  2001-09-20  2:36                 ` Andrea Arcangeli
                                   ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Shane Wegner @ 2001-09-20  2:31 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Martin MOKREJŠ, linux-kernel

On Thu, Sep 20, 2001 at 12:45:43AM +0200, Andrea Arcangeli wrote:
> On Wed, Sep 19, 2001 at 03:34:41PM -0700, Shane Wegner wrote:
> > 
> > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > c012e052
> > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > c012e052
> > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > c012e052
> 
> yes, please try this fix and let me know if it helps:

After some stress testing, the fix does appear to fix the
error.

Shane


-- 
Shane Wegner: shane@cm.nu
              http://www.cm.nu/~shane/
PGP:          1024D/FFE3035D
              A0ED DAC4 77EC D674 5487
              5B5C 4F89 9A4E FFE3 035D


* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-20  2:31               ` Shane Wegner
@ 2001-09-20  2:36                 ` Andrea Arcangeli
  2001-09-20  2:36                 ` Shane Wegner
  2001-09-20  9:57                 ` Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian Martin MOKREJŠ
  2 siblings, 0 replies; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-20  2:36 UTC (permalink / raw)
  To: Shane Wegner
  Cc: Martin MOKREJŠ, linux-kernel, Linus Torvalds, Marcelo Tosatti

On Wed, Sep 19, 2001 at 07:31:28PM -0700, Shane Wegner wrote:
> On Thu, Sep 20, 2001 at 12:45:43AM +0200, Andrea Arcangeli wrote:
> > On Wed, Sep 19, 2001 at 03:34:41PM -0700, Shane Wegner wrote:
> > > 
> > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > c012e052
> > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > c012e052
> > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > c012e052
> > 
> > yes, please try this fix and let me know if it helps:
> 
> After some stress testing, the fix does appear to fix the
> error.

good, what about the performance, is it all right?

Andrea


* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-20  2:31               ` Shane Wegner
  2001-09-20  2:36                 ` Andrea Arcangeli
@ 2001-09-20  2:36                 ` Shane Wegner
  2001-09-20  2:52                   ` Andrea Arcangeli
  2001-09-20  9:57                 ` Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian Martin MOKREJŠ
  2 siblings, 1 reply; 59+ messages in thread
From: Shane Wegner @ 2001-09-20  2:36 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: linux-kernel

On Wed, Sep 19, 2001 at 07:31:28PM -0700, Shane Wegner wrote:
> On Thu, Sep 20, 2001 at 12:45:43AM +0200, Andrea Arcangeli wrote:
> > On Wed, Sep 19, 2001 at 03:34:41PM -0700, Shane Wegner wrote:
> > > 
> > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > c012e052
> > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > c012e052
> > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > c012e052
> > 
> > yes, please try this fix and let me know if it helps:
> 
> After some stress testing, the fix does appear to fix the
> error.

Hi,

Well just after I sent the email, it came up again.


Sep 19 19:31:52 continuum kernel: __alloc_pages: 0-order
allocation failed (gfp=0x20/0) from c012e052
Sep 19 19:33:51 continuum kernel: __alloc_pages: 0-order
allocation failed (gfp=0x20/0) from c012e052

Shane

-- 
Shane Wegner: shane@cm.nu
              http://www.cm.nu/~shane/
PGP:          1024D/FFE3035D
              A0ED DAC4 77EC D674 5487
              5B5C 4F89 9A4E FFE3 035D


* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-20  2:36                 ` Shane Wegner
@ 2001-09-20  2:52                   ` Andrea Arcangeli
  2001-09-20 15:02                     ` Randy.Dunlap
  0 siblings, 1 reply; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-20  2:52 UTC (permalink / raw)
  To: Shane Wegner; +Cc: linux-kernel, Linus Torvalds, Marcelo Tosatti

On Wed, Sep 19, 2001 at 07:36:49PM -0700, Shane Wegner wrote:
> On Wed, Sep 19, 2001 at 07:31:28PM -0700, Shane Wegner wrote:
> > On Thu, Sep 20, 2001 at 12:45:43AM +0200, Andrea Arcangeli wrote:
> > > On Wed, Sep 19, 2001 at 03:34:41PM -0700, Shane Wegner wrote:
> > > > 
> > > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > > c012e052
> > > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > > c012e052
> > > > __alloc_pages: 0-order allocation failed (gfp=0x20/0) from
> > > > c012e052
> > > 
> > > yes, please try this fix and let me know if it helps:
> > 
> > After some stress testing, the fix does appear to fix the
> > error.
> 
> Hi,
> 
> Well just after I sent the email, it came up again.
> 
> 
> Sep 19 19:31:52 continuum kernel: __alloc_pages: 0-order
> allocation failed (gfp=0x20/0) from c012e052
> Sep 19 19:33:51 continuum kernel: __alloc_pages: 0-order
> allocation failed (gfp=0x20/0) from c012e052

Did it happen as frequently/easily as before, or did you need to stress
it much harder? I'm also curious what happens if we simply lower the
watermark (possibly it was too high). In any case, the other patch is still a
good idea to apply.

So can you now try the new one below?

--- 2.4.10pre11aa1/mm/page_alloc.c.~1~	Thu Sep 20 00:36:11 2001
+++ 2.4.10pre11aa1/mm/page_alloc.c	Thu Sep 20 04:45:44 2001
@@ -346,7 +346,7 @@
 		if (!z)
 			break;
 
-		if (zone_free_pages(z, order) > (gfp_mask & __GFP_HIGH ? z->pages_min / 2 : z->pages_min)) {
+		if (zone_free_pages(z, order) > (gfp_mask & __GFP_HIGH ? z->pages_min / 4 : z->pages_min)) {
 			page = rmqueue(z, order);
 			if (page)
 				return page;


The fact is, kswapd is the only entity meant to shrink the caches for
the atomic pages: it knows exactly which zones need to be
balanced, and we have a GAP of min - min/2 pages that must be refilled
in time. It just seems kswapd sometimes doesn't cope with the frequency of the
allocations. This may be ok, but maybe we must find a way to
free memory more aggressively for the atomic allocations, or it could
simply mean that the watermark GAP was too small, as Marcelo
suggested previously.
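
The watermark arithmetic behind the one-line change can be written out as a
small C sketch. This is a user-space model only: `alloc_threshold`,
`atomic_gap` and the `divisor` parameter are made-up names, and the real
check compares against zone_free_pages(), which is not modelled here.

```c
/*
 * Threshold from the patched condition:
 *   free > (gfp_mask & __GFP_HIGH ? pages_min / divisor : pages_min)
 * The divisor is 2 before the patch and 4 after it.
 */
unsigned long alloc_threshold(unsigned long pages_min, int gfp_high,
                              unsigned long divisor)
{
    return gfp_high ? pages_min / divisor : pages_min;
}

/* Pages between the two thresholds, usable only by __GFP_HIGH callers. */
unsigned long atomic_gap(unsigned long pages_min, unsigned long divisor)
{
    return pages_min - pages_min / divisor;
}
```

With an illustrative pages_min of 256 (a made-up figure), the reserve that
only atomic allocations may use grows from 128 pages with min/2 to 192 pages
with min/4, which gives kswapd more slack to refill the zone.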

Can you also resolve "c012e052" so we know who's allocating those pages
just in case?

Andrea


* Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian
  2001-09-20  2:31               ` Shane Wegner
  2001-09-20  2:36                 ` Andrea Arcangeli
  2001-09-20  2:36                 ` Shane Wegner
@ 2001-09-20  9:57                 ` Martin MOKREJŠ
  2001-09-20 10:10                   ` Magnus Naeslund(f)
  2001-09-20 10:24                   ` [PATCH] Make kernel build numbers work again (was: Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian) Russell King
  2 siblings, 2 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-20  9:57 UTC (permalink / raw)
  To: Shane Wegner; +Cc: Andrea Arcangeli, linux-kernel

Hi,
  first of all, thanks to Andrea. I had a bit of a hard time finding the
sources of his kernel patches.

Note to the FAQ maintainer: the source for the -ac and -aa kernels isn't
mentioned there.

  I found that I need 2.4.9 patched to -pre12 and patched afterwards with 
http://www.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.10pre12aa1.bz2


  Using my old kernel configuration, I get this after "make dep; make bzImage":

gcc -D__KERNEL__ -I/usr/src/linux-2.4.10-pre12/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=i686   -c -o init/main.o init/main.c
In file included from /usr/src/linux-2.4.10-pre12/linux/include/linux/mm.h:4,
                 from /usr/src/linux-2.4.10-pre12/linux/include/linux/slab.h:14,
                 from /usr/src/linux-2.4.10-pre12/linux/include/linux/proc_fs.h:5,
                 from init/main.c:15:
/usr/src/linux-2.4.10-pre12/linux/include/linux/sched.h:423: warning: `PF_USEDFPU' redefined
/usr/src/linux-2.4.10-pre12/linux/include/linux/sched.h:421: warning: this is the location of the previous definition
. scripts/mkversion > .version
gcc -D__KERNEL__ -I/usr/src/linux-2.4.10-pre12/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=i686  -DUTS_MACHINE='"i386"' -c -o init/version.o init/version.c
make CFLAGS="-D__KERNEL__ -I/usr/src/linux-2.4.10-pre12/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=i686 " -C  kernel
make[1]: Entering directory `/usr/src/linux-2.4.10-pre12/linux/kernel'
make all_targets
make[2]: Entering directory `/usr/src/linux-2.4.10-pre12/linux/kernel'
gcc -D__KERNEL__ -I/usr/src/linux-2.4.10-pre12/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=i686    -fno-omit-frame-pointer -c -o sched.o sched.c
In file included from /usr/src/linux-2.4.10-pre12/linux/include/linux/mm.h:4,
                 from sched.c:24:
/usr/src/linux-2.4.10-pre12/linux/include/linux/sched.h:423: warning: `PF_USEDFPU' redefined
/usr/src/linux-2.4.10-pre12/linux/include/linux/sched.h:421: warning: this is the location of the previous definition
sched.c: In function `reschedule_idle':
sched.c:234: warning: `oldest_idle' might be used uninitialized in this function
sched.c: In function `sys_sched_yield':
sched.c:1130: warning: control reaches end of non-void function
sched.c: At top level:
sched.c:1135: parse error before `if'
sched.c:1142: parse error before `->'
make[2]: *** [sched.o] Error 1
make[2]: Leaving directory `/usr/src/linux-2.4.10-pre12/linux/kernel'

 Note to Andrea: while your posts to the linux-kernel list include the
Changelog of your kernel release, they do not mention the source site for
newbies like me, and maybe a "How to apply" note would help too. ;)

Thanks for replies! ;-)
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian
  2001-09-20  9:57                 ` Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian Martin MOKREJŠ
@ 2001-09-20 10:10                   ` Magnus Naeslund(f)
  2001-09-20 10:26                     ` Martin MOKREJŠ
                                       ` (2 more replies)
  2001-09-20 10:24                   ` [PATCH] Make kernel build numbers work again (was: Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian) Russell King
  1 sibling, 3 replies; 59+ messages in thread
From: Magnus Naeslund(f) @ 2001-09-20 10:10 UTC (permalink / raw)
  To: Martin MOKREJŠ; +Cc: linux-kernel

From: "Martin MOKREJŠ" <mmokrejs@natur.cuni.cz>

There are two defines for that FPU thing around line 421 in sched.c, take
one away (i deleted the 1<<6 one).

I'm running that kernel now.

Magnus



^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] Make kernel build numbers work again (was: Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian)
  2001-09-20  9:57                 ` Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian Martin MOKREJŠ
  2001-09-20 10:10                   ` Magnus Naeslund(f)
@ 2001-09-20 10:24                   ` Russell King
  2001-09-20 12:54                     ` Alan Cox
  1 sibling, 1 reply; 59+ messages in thread
From: Russell King @ 2001-09-20 10:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox, Linus Torvalds

On Thu, Sep 20, 2001 at 11:57:02AM +0200, Martin MOKREJŠ wrote:
> . scripts/mkversion > .version

People,

As I'm sure you're all aware, being experts in userland programming, the
above obviously cannot work and is totally bogus.

The mkversion script contains:

if [ ! -f .version ]
then
    echo 1
else
    expr 0`cat .version` + 1
fi

but wait!  As far as the script is concerned, .version will always exist
because it's created before the script is run (the open occurs, the file is
truncated, and passed to the script as STDOUT).

This has a nice effect - the build number of the kernel is now fixed at '1'.
So, why don't we get rid of the above crap and just do "echo 1 > .version"
and be done with it? ;)

Alternatively, the following patch fixes things such that we can read the
original .version file within the script, if it existed prior to invocation,
and produce the correct build number.
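
The difference between the two rules can be reproduced outside the kernel
tree. The following is only a sketch in Python that imitates what the shell
does with the redirection; the function names are made up for illustration
and nothing here is part of the kernel build.

```python
import os
import tempfile


def next_build_number(version_text):
    """What scripts/mkversion computes: expr 0`cat .version` + 1."""
    return int("0" + version_text.strip()) + 1


def bump_in_place(path):
    """Mimic '. scripts/mkversion > .version'.

    The output file is opened (and truncated) before the script gets a
    chance to read the old contents.
    """
    with open(path, "w") as out:         # truncation happens here
        with open(path) as current:      # the script's 'cat .version'
            number = next_build_number(current.read())
        out.write(f"{number}\n")
    return number


def bump_via_temp(path):
    """Mimic the patched rule: write .version.tmp, then mv it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as out:          # only the temp file is truncated
        if os.path.exists(path):
            with open(path) as current:
                number = next_build_number(current.read())
        else:
            number = 1
        out.write(f"{number}\n")
    os.replace(tmp, path)                # 'mv -f .version.tmp .version'
    return number


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        broken = os.path.join(workdir, ".version-broken")
        fixed = os.path.join(workdir, ".version-fixed")
        print("in place:", [bump_in_place(broken) for _ in range(3)])
        print("via temp:", [bump_via_temp(fixed) for _ in range(3)])
```

Three builds in a row give 1, 1, 1 with the in-place redirect and 1, 2, 3
when going through the temporary file.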

Note that as illustrated by the previous poster, -linus now has the problem,
and -ac also has the same.  The following patch was generated against
2.4.9-ac10, but should apply to both trees without problem.

--- ref/Makefile	Wed Sep 19 14:00:24 2001
+++ linux/Makefile	Thu Sep 20 11:19:43 2001
@@ -234,7 +234,7 @@
 	drivers/sound/pndsperm.c \
 	drivers/sound/pndspini.c \
 	drivers/atm/fore200e_*_fw.c drivers/atm/.fore200e_*.fw \
-	.version .config* config.in config.old \
+	.version* .config* config.in config.old \
 	scripts/tkparse scripts/kconfig.tk scripts/kconfig.tmp \
 	scripts/lxdialog/*.o scripts/lxdialog/lxdialog \
 	.menuconfig.log \
@@ -306,7 +306,8 @@
 $(TOPDIR)/include/linux/compile.h: include/linux/compile.h
 
 newversion:
-	. scripts/mkversion > .version
+	. scripts/mkversion > .version.tmp
+	@mv -f .version.tmp .version
 
 include/linux/compile.h: $(CONFIGURATION) include/linux/version.h newversion
 	@echo -n \#define UTS_VERSION \"\#`cat .version` > .ver

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian
  2001-09-20 10:10                   ` Magnus Naeslund(f)
@ 2001-09-20 10:26                     ` Martin MOKREJŠ
  2001-09-20 10:26                     ` Magnus Naeslund(f)
  2001-09-20 10:59                     ` Perf improvements in 2.4.10pre12aa1 Martin MOKREJŠ
  2 siblings, 0 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-20 10:26 UTC (permalink / raw)
  To: Magnus Naeslund(f); +Cc: linux-kernel

On Thu, 20 Sep 2001, Magnus Naeslund(f) wrote:

> From: "Martin MOKREJŠ" <mmokrejs@natur.cuni.cz>
> 
> There are two defines for that FPU thing around line 421 in sched.c, take
> one away (i deleted the 1<<6 one).

I've just compiled and am going to reboot, one more note:
gcc -D__KERNEL__ -I/usr/src/linux-2.4.10-pre12/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer
-fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=i686    -c -o pci-pc.o pci-pc.c
{standard input}: Assembler messages:
{standard input}:1116: Warning: indirect lcall without `*'
{standard input}:1201: Warning: indirect lcall without `*'
{standard input}:1288: Warning: indirect lcall without `*'
{standard input}:1370: Warning: indirect lcall without `*'
{standard input}:1381: Warning: indirect lcall without `*'
{standard input}:1392: Warning: indirect lcall without `*'
{standard input}:1479: Warning: indirect lcall without `*'
{standard input}:1491: Warning: indirect lcall without `*'
{standard input}:1503: Warning: indirect lcall without `*'
{standard input}:1990: Warning: indirect lcall without `*'
{standard input}:2083: Warning: indirect lcall without `*'
gcc -D__KERNEL__ -I/usr/src/linux-2.4.10-pre12/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer
-fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=i686    -c -o pci-irq.o pci-irq.c

-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian
  2001-09-20 10:10                   ` Magnus Naeslund(f)
  2001-09-20 10:26                     ` Martin MOKREJŠ
@ 2001-09-20 10:26                     ` Magnus Naeslund(f)
  2001-09-20 10:59                     ` Perf improvements in 2.4.10pre12aa1 Martin MOKREJŠ
  2 siblings, 0 replies; 59+ messages in thread
From: Magnus Naeslund(f) @ 2001-09-20 10:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Martin MOKREJŠ

From: "Magnus Naeslund(f)" <mag@fbab.net>
> From: "Martin MOKREJŠ" <mmokrejs@natur.cuni.cz>
>
> There are two defines for that FPU thing around line 421 in sched.c, take
> one away (i deleted the 1<<6 one).
>

... And that should have been sched.h, as Martin kindly pointed out ;)
I meant something like this:

--- sched.h~    Thu Sep 20 10:20:44 2001
+++ sched.h     Thu Sep 20 11:29:06 2001
@@ -418,7 +418,6 @@
 #define PF_DUMPCORE    (1UL<<3)        /* dumped core */
 #define PF_SIGNALED    (1UL<<4)        /* killed by a signal */
 #define PF_MEMALLOC    (1UL<<5)        /* Allocating memory */
-#define PF_USEDFPU     (1UL<<6)        /* task used FPU this quantum (SMP) */
 #define PF_ATOMICALLOC (1UL<<7)        /* do not block during memalloc */
 #define PF_USEDFPU     (1UL<<8)        /* task used FPU this quantum (SMP) */
 #define PF_FREE_PAGES  (1UL<<9)        /* per process page freeing */

Magnus


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Perf improvements in 2.4.10pre12aa1
  2001-09-20 10:10                   ` Magnus Naeslund(f)
  2001-09-20 10:26                     ` Martin MOKREJŠ
  2001-09-20 10:26                     ` Magnus Naeslund(f)
@ 2001-09-20 10:59                     ` Martin MOKREJŠ
  2001-09-20 15:28                       ` Martin MOKREJŠ
  2 siblings, 1 reply; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-20 10:59 UTC (permalink / raw)
  To: Magnus Naeslund(f); +Cc: linux-kernel

Hi,
  I've just started some tests to try to reproduce the memory allocation
errors. I see the aa1 kernel is twice as fast as -pre12!? Is this expected?
I have 2x Intel PIII 933MHz, 1GB RAM, HIGHMEM kernel, ReiserFS, aic7xxx,
eepro100.


linux-2.4.10-pre12
dbench 16: Throughput 67.8566 MB/sec (NB=84.8208 MB/sec  678.566 MBit/sec)  16 procs

Yesterday, after heavy tests and after the memory allocation errors had
already appeared:
        total:    used:    free:  shared: buffers:  cached:
Mem:  1054490624 880287744 174202880        0  4653056 460627968
Swap: 2147467264 42909696 2104557568
MemTotal:      1029776 kB
MemFree:        170120 kB
MemShared:           0 kB
Buffers:          4544 kB
Cached:         448416 kB
SwapCached:       1416 kB
Active:         377868 kB
Inactive:        76508 kB
HighTotal:      131072 kB
HighFree:         2044 kB
LowTotal:       898704 kB
LowFree:        168076 kB
SwapTotal:     2097136 kB
SwapFree:      2055232 kB



linux-2.4.10-pre12aa1
dbench 16: Throughput 141.659 MB/sec (NB=177.074 MB/sec  1416.59 MBit/sec)  16 procs

Now, after a fresh bootup and just after I started the first tests:
        total:    used:    free:  shared: buffers:  cached:
Mem:  1054412800 110338048 944074752        0  8560640 59211776
Swap: 2147467264        0 2147467264
MemTotal:      1029700 kB
MemFree:        921948 kB
MemShared:           0 kB
Buffers:          8360 kB
Cached:          57824 kB
SwapCached:          0 kB
Active:              0 kB
Inactive:        66184 kB
HighTotal:      131072 kB
HighFree:        58612 kB
LowTotal:       898628 kB
LowFree:        863336 kB
SwapTotal:     2097136 kB
SwapFree:      2097136 kB


The documentation for dbench is a bit sparse (README, INSTALL). It's a bit
off-topic, but could someone explain to me which directory dbench writes
into? I ran the tests above as the same user and in the same tmp/
directory, to be sure. Maybe that was not necessary at all. ;)

Thanks
-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] Make kernel build numbers work again (was: Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian)
  2001-09-20 10:24                   ` [PATCH] Make kernel build numbers work again (was: Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian) Russell King
@ 2001-09-20 12:54                     ` Alan Cox
  0 siblings, 0 replies; 59+ messages in thread
From: Alan Cox @ 2001-09-20 12:54 UTC (permalink / raw)
  To: Russell King; +Cc: linux-kernel, Alan Cox, Linus Torvalds

> Note that as illustrated by the previous poster, -linus now has the problem,
> and -ac also has the same.  The following patch was generated against
> 2.4.9-ac10, but should apply to both trees without problem.

Looks right to me

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-20  2:52                   ` Andrea Arcangeli
@ 2001-09-20 15:02                     ` Randy.Dunlap
  2001-09-21  1:54                       ` Keith Owens
  0 siblings, 1 reply; 59+ messages in thread
From: Randy.Dunlap @ 2001-09-20 15:02 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Shane Wegner, linux-kernel

Andrea Arcangeli wrote:
> 
> Can you also resolve "c012e052" so we know who's allocating those pages
> just in case?

It's trivial to do that, of course, but if someone needs an
automated way to do it (several times, easy lookup), you can
try  http://www.osdlab.org/sw_resources/scripts/ksysmap .

Usage is:  ksysmap [system_map_file] offset

and it spits out address/symbol before offset, exact match if
present, and address/symbol after offset.

Example:

[rddunlap@dragon linux]$ ksysmap ./System.map-249acpi c012e052
ksysmap: searching './System.map-249acpi' for 'c012e052'

c012df20 T sys_truncate
c012e052 ..... <<<<<
c012e0a0 T sys_ftruncate
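
The lookup itself is just "find the last symbol at or below the address".
A minimal Python sketch of that idea follows; it is not the ksysmap script,
the function names are made up, and it assumes the plain three-column
"address type name" System.map layout shown above.

```python
import bisect


def parse_system_map(lines):
    """Parse 'address type name' lines into a list sorted by address."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue                      # skip anything unexpected
        address, _kind, name = parts
        symbols.append((int(address, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) of the last symbol at or below address, or None."""
    addresses = [addr for addr, _name in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None                       # address is below the first symbol
    base, name = symbols[index]
    return name, address - base


if __name__ == "__main__":
    # The two neighbouring entries from the example output above.
    sample = ["c012df20 T sys_truncate", "c012e0a0 T sys_ftruncate"]
    name, offset = resolve(parse_system_map(sample), 0xC012E052)
    print(f"{name}+{offset:#x}")
```

For the two entries above this prints sys_truncate+0x132. As with the
ksysmap run, the answer is only meaningful when the System.map belongs to
the kernel that logged the address.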

~Randy

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Perf improvements in 2.4.10pre12aa1
  2001-09-20 10:59                     ` Perf improvements in 2.4.10pre12aa1 Martin MOKREJŠ
@ 2001-09-20 15:28                       ` Martin MOKREJŠ
  2001-09-20 15:40                         ` Martin MOKREJŠ
  0 siblings, 1 reply; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-20 15:28 UTC (permalink / raw)
  To: linux-kernel

On Thu, 20 Sep 2001, Martin MOKREJŠ wrote:

Hi,
  stupid to reply to myself, but ...

> linux-2.4.10-pre12
> dbench 16: Throughput 67.8566 MB/sec (NB=84.8208 MB/sec  678.566 MBit/sec)  16 procs

> linux-2.4.10-pre12aa1
> dbench 16: Throughput 141.659 MB/sec (NB=177.074 MB/sec  1416.59 MBit/sec)  16 procs

Hmm, now after a few hours of running mysql tests I have (while still running):
linux-2.4.10-pre12aa1
dbench 16: Throughput 41.1484 MB/sec (NB=51.4356 MB/sec  411.484 MBit/sec)  16 procs

Load so far up to 7 (yesterday even 16, but that depends of course on
which test is currently being run).


And, well oh NO!, it's here again:
__alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012f852

How can I find out what "(gfp=0x20/0) from c012f852" means?
Current situation:
        total:    used:    free:  shared: buffers:  cached:
Mem:  1054412800 845828096 208584704        0  3731456 476766208
Swap: 2147467264 61083648 2086383616
MemTotal:      1029700 kB
MemFree:        203696 kB
MemShared:           0 kB
Buffers:          3644 kB
Cached:         464100 kB
SwapCached:       1492 kB
Active:         318896 kB
Inactive:       150340 kB
HighTotal:      131072 kB
HighFree:         2044 kB
LowTotal:       898628 kB
LowFree:        201652 kB
SwapTotal:     2097136 kB
SwapFree:      2037484 kB

  5:29pm  up  4:58,  3 users,  load average: 5.61, 6.04, 6.32
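
On the gfp question above: the first number is a bitmask of allocation
flags. The sketch below decodes such a mask in Python. The flag values are
an assumption, written down as I recall them from include/linux/mm.h in
2.4.10-era trees, so check them against the header of the kernel you
actually run. The number after the slash is not decoded here.

```python
# Assumed flag values (2.4.10-era include/linux/mm.h, from memory).
# Verify against your own tree before relying on them.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}


def decode_gfp(mask):
    """Return the names of the flags set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    known_bits = sum(GFP_FLAGS)          # the keys are distinct single bits
    unknown = mask & ~known_bits
    if unknown:
        names.append(f"unknown:{unknown:#x}")
    return names


if __name__ == "__main__":
    for mask in (0x20, 0x1d2):
        print(f"gfp={mask:#x}: {' | '.join(decode_gfp(mask))}")
```

Under these assumed values, 0x20 is __GFP_HIGH alone, i.e. a caller that
cannot wait, while the 0x1d2 seen elsewhere in this thread includes
__GFP_WAIT and __GFP_HIGHMEM, as a user-page allocation would.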

-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany




^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Perf improvements in 2.4.10pre12aa1
  2001-09-20 15:28                       ` Martin MOKREJŠ
@ 2001-09-20 15:40                         ` Martin MOKREJŠ
  0 siblings, 0 replies; 59+ messages in thread
From: Martin MOKREJŠ @ 2001-09-20 15:40 UTC (permalink / raw)
  To: linux-kernel

On Thu, 20 Sep 2001, Martin MOKREJŠ wrote:

There was an answer already posted, amazing!

> And, well oh NO!, it's here again:
> __alloc_pages: 0-order allocation failed (gfp=0x20/0) from c012f852

ksysmap: searching '/boot/System.map-2.4.10-pre12aa1' for 'c012f852'

c012f83c T _alloc_pages
c012f852 ..... <<<<<
c012f854 t balance_classzone



-- 
Martin Mokrejs - PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed still in -pre12
  2001-09-20 15:02                     ` Randy.Dunlap
@ 2001-09-21  1:54                       ` Keith Owens
  0 siblings, 0 replies; 59+ messages in thread
From: Keith Owens @ 2001-09-21  1:54 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: linux-kernel

On Thu, 20 Sep 2001 08:02:45 -0700, 
"Randy.Dunlap" <rddunlap@osdlab.org> wrote:
>Usage is:  ksysmap [system_map_file] offset
>and it spits out address/symbol before offset, exact match if

I like it!

Idea pinched for ksymoops 2.4.3; ksymoops -A "address list", any words
in the -A list are treated as addresses and looked up in the composite
system map, including modules.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2006-05-31 12:09 Oliver König
@ 2006-06-01 12:19 ` Jes Sorensen
  0 siblings, 0 replies; 59+ messages in thread
From: Jes Sorensen @ 2006-06-01 12:19 UTC (permalink / raw)
  To: Oliver König; +Cc: linux-kernel

>>>>> "Oliver" == Oliver König <k.oliver@t-online.de> writes:

Oliver> I run Debian 3.1 (Sarge) with Debian-Kernel 2.4.27-3-686-smp
Oliver> on Dell Poweredge 2850 with the following setup/config:

Oliver> Model: Dell Poweredge 2850 CPU: 2x3.0 GHz RAM: 2 GB SWAP: 1 GB
Oliver> Raid 1 with Dell PowerEdge Expandable RAID controller 4 (SCSI)
Oliver> Kernel: 2.4.27-3-686-smp (CONFIG_HIGHMEM4G=y) Web server:
Oliver> apache2 SQL server: mysql4.1 MTA: exim4

[snip]

Oliver> The server is then so slow to react that the only way to get
Oliver> rid of the problem is to reset the server.

Oliver> What can we do to fix the problem?

A 0-order allocation failure means it cannot get even a single page of
free memory. You can also see in the log that the OOM killer is kicking
in. In other words, totally out of memory.
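
To make "0-order" concrete: an order-n request asks for 2^n contiguous
pages. A small Python sketch, assuming the 4096-byte page size of i386 (the
helper names are made up for illustration):

```python
PAGE_SIZE = 4096  # bytes per page on i386; an assumption for other machines


def pages_for_order(order):
    """Number of contiguous pages in an order-n allocation (2**n)."""
    return 1 << order


def bytes_for_order(order, page_size=PAGE_SIZE):
    """Size in bytes of an order-n allocation."""
    return pages_for_order(order) * page_size


if __name__ == "__main__":
    for order in range(4):
        print(f"order {order}: {pages_for_order(order)} page(s), "
              f"{bytes_for_order(order)} bytes")
```

So the failing request in the log is the smallest one the page allocator
can serve: one page.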

You have two options, add more swap or add more memory. At the same
time it might be a good idea to try and monitor it to find out which
tasks are chewing away that much memory.

Cheers,
Jes

^ permalink raw reply	[flat|nested] 59+ messages in thread

* __alloc_pages: 0-order allocation failed
@ 2006-05-31 12:09 Oliver König
  2006-06-01 12:19 ` Jes Sorensen
  0 siblings, 1 reply; 59+ messages in thread
From: Oliver König @ 2006-05-31 12:09 UTC (permalink / raw)
  To: linux-kernel

I run Debian 3.1 (Sarge) with Debian-Kernel 2.4.27-3-686-smp on Dell 
Poweredge 2850 with the following setup/config:

Model: Dell Poweredge 2850
CPU: 2x3.0 GHz
RAM: 2 GB
SWAP: 1 GB
Raid 1 with Dell PowerEdge Expandable RAID controller 4 (SCSI)
Kernel: 2.4.27-3-686-smp (CONFIG_HIGHMEM4G=y)
Web server: apache2
SQL server: mysql4.1
MTA: exim4

Occasionally all of a sudden the load average increases from around 1 
to 50-150. Primarily apache2 and also mysql are then consuming most of the CPU 
and memory. I checked the hardware with the Dell 32-bit diagnostic but could 
not find any errors. /var/log/message produces the following or similar 
output:

May 24 09:06:44 server kernel: VM: killing process cron
May 24 09:06:44 server kernel: __alloc_pages: 0-order allocation failed 
(gfp=0x1d2/0)
May 24 09:06:44 server last message repeated 6 times
May 24 09:06:44 server kernel: VM: killing process apache2
May 24 09:06:56 server logger: Hole TAFSYNOP-Wetterdaten...
May 24 09:11:09 server kernel: __alloc_pages: 0-order allocation failed 
(gfp=0x1d2/0)
[..]

The server is then so slow to react that the only way to get rid of the 
problem is to reset the server.

What can we do to fix the problem?
Thanks.
Oliver

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-26 18:15             ` tpepper
@ 2001-09-26 18:29               ` Andrea Arcangeli
  0 siblings, 0 replies; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-26 18:29 UTC (permalink / raw)
  To: tpepper
  Cc: Marcelo Tosatti, Paul Larson, Linus Torvalds,
	Christian Bornträger, Jacek Popławski,
	lkml

On Wed, Sep 26, 2001 at 11:15:09AM -0700, tpepper@vato.org wrote:
> On Wed 26 Sep at 01:05:16 +0200 andrea@suse.de done said:
> > On Tue, Sep 25, 2001 at 06:25:10PM -0300, Marcelo Tosatti wrote:
> > > 
> > > Does vm-tweaks-1 fix the current problem we're seeing?
> > 
> > it seems not, judging by the last email; however, I'm not seeing any
> > problem. The DEBUG_GFP will tell us where the problem comes from;
> > possibly it's a highmem thing since I never reproduced anything bad here.
> > But the point is that the above isn't going to be the right fix anyway.
> 
> vm-tweaks-1 fixes things for me.  I've got 512MB ram (kernel not
> configured for highmem) and 1 gig of swap.  The workload is heavy file
> i/o and has now been running almost 24 hours (about 2 billion I/Os or
> a few TB of data I think so far).  Previously all the memory was being
> consumed by cache, nothing swapped (as expected if the memory is cached
> buffer i/o right?) and I'd get the:

yes, unless the buffered I/O was identified as your actual working set,
but even in such a case the 2.4.10 VM shouldn't swap out too early.

> 	__alloc_pages: 0-order allocation failed
> Now I continue to see the memory consumption / no swap, and no more
> error...iow the expected behaviour.

good. As far as I can tell it is the check in swap_out that is making the
difference and fixing the oom problem; it was very intentional indeed.

> On an unrelated note if I want to backport the async I/O changes in 2.4.10,
> are there patches from you I should apply other than:
> 	2.4.10pre10aa1/40_blkdev-pagecache-17
> 	2.4.7pre8aa1/41_blkdev-pagecache-5_drop_get_bh_async-1

both patches are now included in mainline 2.4.10.

thanks,
Andrea

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 23:05           ` Andrea Arcangeli
@ 2001-09-26 18:15             ` tpepper
  2001-09-26 18:29               ` Andrea Arcangeli
  0 siblings, 1 reply; 59+ messages in thread
From: tpepper @ 2001-09-26 18:15 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Marcelo Tosatti, Paul Larson, Linus Torvalds,
	Christian Bornträger, Jacek Popławski,
	lkml

On Wed 26 Sep at 01:05:16 +0200 andrea@suse.de done said:
> On Tue, Sep 25, 2001 at 06:25:10PM -0300, Marcelo Tosatti wrote:
> > 
> > Does vm-tweaks-1 fix the current problem we're seeing?
> 
> it seems not, judging by the last email; however, I'm not seeing any
> problem. The DEBUG_GFP will tell us where the problem comes from;
> possibly it's a highmem thing since I never reproduced anything bad here.
> But the point is that the above isn't going to be the right fix anyway.

vm-tweaks-1 fixes things for me.  I've got 512MB ram (kernel not
configured for highmem) and 1 gig of swap.  The workload is heavy file
i/o and has now been running almost 24 hours (about 2 billion I/Os or
a few TB of data I think so far).  Previously all the memory was being
consumed by cache, nothing swapped (as expected if the memory is cached
buffer i/o right?) and I'd get the:
	__alloc_pages: 0-order allocation failed
Now I continue to see the memory consumption / no swap, and no more
error...iow the expected behaviour.

On an unrelated note if I want to backport the async I/O changes in 2.4.10,
are there patches from you I should apply other than:
	2.4.10pre10aa1/40_blkdev-pagecache-17
	2.4.7pre8aa1/41_blkdev-pagecache-5_drop_get_bh_async-1


Tim

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 11:35       ` Paul Larson
  2001-09-24 15:12         ` Marcelo Tosatti
@ 2001-09-26 13:48         ` Andrea Arcangeli
  1 sibling, 0 replies; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-26 13:48 UTC (permalink / raw)
  To: Paul Larson; +Cc: Marcelo Tosatti, lkml

On Mon, Sep 24, 2001 at 11:35:41AM +0000, Paul Larson wrote:
> 
> The patch helped for me, but there are still problems.  I was able to
> run all the way through LTP without it shutting anything down.  When I
> used one of the memory tests to chew up all the ram though, I noticed
> that VM was killing things it shouldn't have.  First thing to get killed
> was cron, then top, then it finally killed mtest01 (the memory test
> mentioned before).

can you reproduce anything wrong with vm-tweaks-2 applied to plain
2.4.10? I posted it to l-k a few minutes ago.

Andrea

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 23:46               ` Andrea Arcangeli
@ 2001-09-26  7:29                 ` Christian Bornträger
  0 siblings, 0 replies; 59+ messages in thread
From: Christian Bornträger @ 2001-09-26  7:29 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Marcelo Tosatti, Paul Larson, Linus Torvalds,
	Jacek Popławski, lkml

> ok, this sounds like a normal oom condition but of course I assume it
> isn't. Can you show a `vmstat 1` during the oom kill + some
> /proc/meminfo? thanks for the great feedback!

As the VM sometimes killed some long-running root processes of mine, like
klogd, I don't think it is the normal oom behaviour........ :-)

vmstat: 

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy id
 0  0  0      0 457824   8476  28856   0   0   225   140  156   163   4  51 45
 0  0  0      0 457440   8476  28860   0   0     0   280  143   103   1  52 47
 0  0  0      0 457440   8476  28860   0   0     0     0  101    70   1  60 39
 0  0  0      0 457440   8476  28860   0   0     0     0  109    86   1  56 43
 0  0  0      0 457436   8476  28864   0   0     0     0  102    73   1  59 40
 0  0  1      0 457436   8476  28864   0   0     0     0  101    70   1  58 41
 0  0  0      0 457436   8476  28864   0   0     0   236  130    88   0  59 41
 0  0  0      0 457436   8476  28864   0   0     0     0  101    69   1  60 39
 0  0  0      0 457436   8476  28864   0   0     0     0  101    71   1  60 39
 0  0  0      0 457436   8476  28864   0   0     0     0  101    71   0  60 40
 1  0  0      0 168236   8476  28928   0   0    64     0  117    79  11  72 17
 1  2  0      0 496532    236   1660   0   0  2948   252  415   303   5  75 20
 0  0  0      0 495568    408   2364   0   0   708     0  184   206   1  26 73
 0  0  1      0 495532    412   2396   0   0    36     0  105    78   1  59 40
 0  0  0      0 495528    416   2396   0   0     0     0  106    81   1  60 39
 0  0  0      0 495376    560   2396   0   0     0   272  139   103   1  56 43
 0  0  0      0 495376    560   2396   0   0     0     0  104    78   1  60 39

meminfo: (I made a looping bash script that runs cat /proc/meminfo
followed by sleep 1, but during the kill there is a 3 second gap)
Mit Sep 26 09:14:58 CEST 2001

        total:    used:    free:  shared: buffers:  cached:
Mem:  526491648 58064896 468426752        0  8679424 29556736
Swap:        0        0        0
MemTotal:       514152 kB
MemFree:        457448 kB
MemShared:           0 kB
Buffers:          8476 kB
Cached:          28864 kB
SwapCached:          0 kB
Active:          19084 kB
Inactive:        18256 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       514152 kB
LowFree:        457448 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Mit Sep 26 09:14:59 CEST 2001
        total:    used:    free:  shared: buffers:  cached:
Mem:  526491648 162996224 363495424        0  8679424 29622272
Swap:        0        0        0
MemTotal:       514152 kB
MemFree:        354976 kB
MemShared:           0 kB
Buffers:          8476 kB
Cached:          28928 kB
SwapCached:          0 kB
Active:          19112 kB
Inactive:        18292 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       514152 kB
LowFree:        354976 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Mit Sep 26 09:15:02 CEST 2001
        total:    used:    free:  shared: buffers:  cached:
Mem:  526491648 18993152 507498496        0   409600  2408448
Swap:        0        0        0
MemTotal:       514152 kB
MemFree:        495604 kB
MemShared:           0 kB
Buffers:           400 kB
Cached:           2352 kB
SwapCached:          0 kB
Active:           1388 kB
Inactive:         1364 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       514152 kB
LowFree:        495604 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Mit Sep 26 09:15:03 CEST 2001
        total:    used:    free:  shared: buffers:  cached:
Mem:  526491648 19054592 507437056        0   421888  2453504
Swap:        0        0        0
MemTotal:       514152 kB
MemFree:        495544 kB
MemShared:           0 kB
Buffers:           412 kB
Cached:           2396 kB
SwapCached:          0 kB
Active:           1496 kB
Inactive:         1312 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       514152 kB
LowFree:        495544 kB
SwapTotal:           0 kB
SwapFree:            0 kB
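
For comparing snapshots like the ones above, the "Name: value kB" lines
are easy to pull apart. The following Python sketch is only one way to do
it; the function name is made up, and it deliberately skips the byte-based
"Mem:" and "Swap:" summary rows and the date stamps.

```python
def parse_meminfo(text):
    """Return {'MemFree': 457448, ...} in kB from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only 'Name: <number> kB' lines; anything else is ignored.
        if (len(fields) == 3 and fields[0].endswith(":")
                and fields[1].isdigit() and fields[2] == "kB"):
            values[fields[0][:-1]] = int(fields[1])
    return values


if __name__ == "__main__":
    # Two MemFree readings copied from the snapshots above.
    before = parse_meminfo("MemFree:        457448 kB\n")
    after = parse_meminfo("MemFree:        354976 kB\n")
    drop = before["MemFree"] - after["MemFree"]
    print(f"MemFree dropped by {drop} kB between the first two snapshots")
```

Between the 09:14:58 and 09:14:59 snapshots this gives a drop of 102472 kB
in MemFree, i.e. about 100 MB allocated within one second.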

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 23:24             ` Christian Bornträger
@ 2001-09-25 23:46               ` Andrea Arcangeli
  2001-09-26  7:29                 ` Christian Bornträger
  0 siblings, 1 reply; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-25 23:46 UTC (permalink / raw)
  To: Christian Bornträger
  Cc: Marcelo Tosatti, Paul Larson, Linus Torvalds,
	Jacek Popławski, lkml

On Wed, Sep 26, 2001 at 01:24:10AM +0200, Christian Bornträger wrote:
> > Could you enable CONFIG_DEBUG_GFP (kernel debugging menu) in 2.4.10aa1
> > and send me full traces of the failures so I can better see where the
> > problem comes from? Thanks!
> 
> OK, with the vm-tweaks and the gfp-patch I got the following output:
> 
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> dcb81e30 c01fd640 00000000 000001d2 00000000 00000000 dcb81e5c 00000001
>        c0223848 c02239b8 000001d2 00000000 00000001 c1978d40 de9f30c0 de9855a0
>        c01219d7 00000000 00000000 c012b35f c0223848 de9f30c0 c1978d40 00000001
> Call Trace: [<c01219d7>] [<c012b35f>] [<c0121a60>] [<c0121b40>] [<c011152e>]
>    [<c010adc2>] [<c0121e27>] [<c01113b0>] [<c0106dec>]
> VM: killing process a.out
> 
> 
> feeding it to ksymoops....:
> 
> 
> dcb81e30 c01fd640 00000000 000001d2 00000000 00000000 dcb81e5c 00000001
>        c0223848 c02239b8 000001d2 00000000 00000001 c1978d40 de9f30c0 de9855a0
>        c01219d7 00000000 00000000 c012b35f c0223848 de9f30c0 c1978d40 00000001
> Call Trace: [<c01219d7>] [<c012b35f>] [<c0121a60>] [<c0121b40>] [<c011152e>]
>    [<c010adc2>] [<c0121e27>] [<c01113b0>] [<c0106dec>]
> Warning (Oops_read): Code line not seen, dumping what data is available
> 
> Trace; c01219d7 <do_anonymous_page+37/90>
> Trace; c012b35f <__alloc_pages+4f/240>
> Trace; c0121a60 <do_no_page+30/b0>
> Trace; c0121b40 <handle_mm_fault+60/d0>
> Trace; c011152e <do_page_fault+17e/4b0>
> Trace; c010adc2 <timer_interrupt+62/110>
> Trace; c0121e27 <sys_brk+b7/f0>
> Trace; c01113b0 <do_page_fault+0/4b0>
> Trace; c0106dec <error_code+34/3c>
> 
> 
> If you need a run on a complete aa1-patch, let me know.

OK, this looks like a normal OOM condition, but of course I assume it
isn't. Can you show a `vmstat 1` during the OOM kill plus some
/proc/meminfo? Thanks for the great feedback!

Andrea

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 23:01           ` Andrea Arcangeli
  2001-09-25 23:10             ` Christian Bornträger
@ 2001-09-25 23:24             ` Christian Bornträger
  2001-09-25 23:46               ` Andrea Arcangeli
  1 sibling, 1 reply; 59+ messages in thread
From: Christian Bornträger @ 2001-09-25 23:24 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Marcelo Tosatti, Paul Larson, Linus Torvalds,
	Jacek Popławski, lkml

> Could you enable CONFIG_DEBUG_GFP (kernel debugging menu) in 2.4.10aa1
> and send me full traces of the failures so I can better see where the
> problem comes from? Thanks!

OK, with the vm-tweaks and the gfp-patch I got the following output:

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
dcb81e30 c01fd640 00000000 000001d2 00000000 00000000 dcb81e5c 00000001
       c0223848 c02239b8 000001d2 00000000 00000001 c1978d40 de9f30c0 de9855a0
       c01219d7 00000000 00000000 c012b35f c0223848 de9f30c0 c1978d40 00000001
Call Trace: [<c01219d7>] [<c012b35f>] [<c0121a60>] [<c0121b40>] [<c011152e>]
   [<c010adc2>] [<c0121e27>] [<c01113b0>] [<c0106dec>]
VM: killing process a.out


feeding it to ksymoops....:


dcb81e30 c01fd640 00000000 000001d2 00000000 00000000 dcb81e5c 00000001
       c0223848 c02239b8 000001d2 00000000 00000001 c1978d40 de9f30c0 de9855a0
       c01219d7 00000000 00000000 c012b35f c0223848 de9f30c0 c1978d40 00000001
Call Trace: [<c01219d7>] [<c012b35f>] [<c0121a60>] [<c0121b40>] [<c011152e>]
   [<c010adc2>] [<c0121e27>] [<c01113b0>] [<c0106dec>]
Warning (Oops_read): Code line not seen, dumping what data is available

Trace; c01219d7 <do_anonymous_page+37/90>
Trace; c012b35f <__alloc_pages+4f/240>
Trace; c0121a60 <do_no_page+30/b0>
Trace; c0121b40 <handle_mm_fault+60/d0>
Trace; c011152e <do_page_fault+17e/4b0>
Trace; c010adc2 <timer_interrupt+62/110>
Trace; c0121e27 <sys_brk+b7/f0>
Trace; c01113b0 <do_page_fault+0/4b0>
Trace; c0106dec <error_code+34/3c>


If you need a run on a complete aa1-patch, let me know.

greetings

Christian

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 23:10             ` Christian Bornträger
@ 2001-09-25 23:19               ` Andrea Arcangeli
  0 siblings, 0 replies; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-25 23:19 UTC (permalink / raw)
  To: Christian Bornträger; +Cc: lkml

On Wed, Sep 26, 2001 at 01:10:07AM +0200, Christian Bornträger wrote:
> > Could you enable CONFIG_DEBUG_GFP (kernel debugging menu) in 2.4.10aa1
> > and send me full traces of the failures so I can better see where the
> > problem comes from? Thanks!
> >
> > Andrea
> 
> Is it enough to take a vanilla 2.4.10 and apply 00_debug-gfp-1 and 
> 00_vm-tweaks-1, or should I patch to a complete aa1? If so, where can I find 

yes, that's enough, you don't need the complete aa1 just for that.

> the complete aa1 patch in one piece?

The whole aa1 patch can be found in the v2.4 directory as
2.4.10aa1.bz2.

Andrea

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 23:01           ` Andrea Arcangeli
@ 2001-09-25 23:10             ` Christian Bornträger
  2001-09-25 23:19               ` Andrea Arcangeli
  2001-09-25 23:24             ` Christian Bornträger
  1 sibling, 1 reply; 59+ messages in thread
From: Christian Bornträger @ 2001-09-25 23:10 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: lkml

> Could you enable CONFIG_DEBUG_GFP (kernel debugging menu) in 2.4.10aa1
> and send me full traces of the failures so I can better see where the
> problem comes from? Thanks!
>
> Andrea

Is it enough to take a vanilla 2.4.10 and apply 00_debug-gfp-1 and 
00_vm-tweaks-1, or should I patch to a complete aa1? If so, where can I find 
the complete aa1 patch in one piece?

greetings

Christian

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 21:25         ` Marcelo Tosatti
@ 2001-09-25 23:05           ` Andrea Arcangeli
  2001-09-26 18:15             ` tpepper
  0 siblings, 1 reply; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-25 23:05 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Paul Larson, Linus Torvalds, Christian Bornträger,
	Jacek Popławski, lkml

On Tue, Sep 25, 2001 at 06:25:10PM -0300, Marcelo Tosatti wrote:
> 
> 
> On Wed, 26 Sep 2001, Andrea Arcangeli wrote:
> 
> > On Mon, Sep 24, 2001 at 09:38:24AM -0300, Marcelo Tosatti wrote:
> > > --- linux.orig/mm/vmscan.c	Mon Sep 24 10:36:40 2001
> > > +++ linux/mm/vmscan.c	Mon Sep 24 10:54:01 2001
> > > @@ -567,6 +567,9 @@
> > >  		if (nr_pages <= 0)
> > >  			return 1;
> > >  
> > > +		if (nr_pages < SWAP_CLUSTER_MAX)
> > > +			ret |= 1;
> > > +
> > 
> > too permissive (vm-tweaks-1 does something similar, but is not that
> > permissive)
> 
> Andrea,
> 
> Does vm-tweaks-1 fix the current problem we're seeing? 

It seems not, judging by the last email. However, I'm not seeing any
problem here; DEBUG_GFP will tell us where the problem comes from.
Possibly it's a highmem thing, since I never reproduced anything bad here.
But the point is that the above isn't going to be the right fix anyway.

> Also, we have to make sure _all_ progress accounting is being done
> correctly (i/dcache, etc). I'll make sure that happens as soon as the OOM
> problem is gone.
> 
> > >  		ret |= swap_out(priority, classzone, gfp_mask, SWAP_CLUSTER_MAX << 2);
> > >  	} while (--priority);
> > >  
> > > --- linux.orig/mm/page_alloc.c	Mon Sep 24 10:36:40 2001
> > > +++ linux/mm/page_alloc.c	Mon Sep 24 10:44:12 2001
> > > @@ -400,7 +400,7 @@
> > >  			if (!z)
> > >  				break;
> > >  
> > > -			if (zone_free_pages(z, order) > z->pages_high) {
> > > +			if (zone_free_pages(z, order) > z->pages_min) {
> > 
> > that breaks oom detection.
> 
> Why? 

The only reason that code exists is to try to kill only one task. If the
killed task was the right one, the code could be dropped and the machine
still must not fail allocations unless it is truly OOM, so a change there
obviously cannot fix anything related to early OOM by mistake.

Andrea

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 22:16         ` Christian Bornträger
@ 2001-09-25 23:01           ` Andrea Arcangeli
  2001-09-25 23:10             ` Christian Bornträger
  2001-09-25 23:24             ` Christian Bornträger
  0 siblings, 2 replies; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-25 23:01 UTC (permalink / raw)
  To: Christian Bornträger
  Cc: Marcelo Tosatti, Paul Larson, Linus Torvalds,
	Jacek Popławski, lkml

On Wed, Sep 26, 2001 at 12:16:53AM +0200, Christian Bornträger wrote:
> > too permissive (vm-tweaks-1 does something similar, but is not that
> > permissive)
> 
> But it doesn't help either.  I installed vm-tweaks-1 on a vanilla 2.4.10 and 
> still got an __alloc_pages: 0-order allocation failure.
> I have no swap and 512 MB of RAM.

Could you enable CONFIG_DEBUG_GFP (kernel debugging menu) in 2.4.10aa1
and send me full traces of the failures so I can better see where the
problem comes from? Thanks!

Andrea

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 22:09       ` Andrea Arcangeli
  2001-09-25 21:25         ` Marcelo Tosatti
@ 2001-09-25 22:16         ` Christian Bornträger
  2001-09-25 23:01           ` Andrea Arcangeli
  1 sibling, 1 reply; 59+ messages in thread
From: Christian Bornträger @ 2001-09-25 22:16 UTC (permalink / raw)
  To: Andrea Arcangeli, Marcelo Tosatti
  Cc: Paul Larson, Linus Torvalds, Christian Bornträger,
	Jacek Popławski, lkml

> too permissive (vm-tweaks-1 does something similar, but is not that
> permissive)

But it doesn't help either.  I installed vm-tweaks-1 on a vanilla 2.4.10 and 
still got an __alloc_pages: 0-order allocation failure.
I have no swap and 512 MB of RAM.




^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 12:38     ` Marcelo Tosatti
  2001-09-24 11:35       ` Paul Larson
  2001-09-24 15:33       ` Christian Bornträger
@ 2001-09-25 22:09       ` Andrea Arcangeli
  2001-09-25 21:25         ` Marcelo Tosatti
  2001-09-25 22:16         ` Christian Bornträger
  2 siblings, 2 replies; 59+ messages in thread
From: Andrea Arcangeli @ 2001-09-25 22:09 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Paul Larson, Linus Torvalds, Christian Bornträger,
	Jacek Popławski, lkml

On Mon, Sep 24, 2001 at 09:38:24AM -0300, Marcelo Tosatti wrote:
> --- linux.orig/mm/vmscan.c	Mon Sep 24 10:36:40 2001
> +++ linux/mm/vmscan.c	Mon Sep 24 10:54:01 2001
> @@ -567,6 +567,9 @@
>  		if (nr_pages <= 0)
>  			return 1;
>  
> +		if (nr_pages < SWAP_CLUSTER_MAX)
> +			ret |= 1;
> +

too permissive (vm-tweaks-1 does something similar, but is not that
permissive)

>  		ret |= swap_out(priority, classzone, gfp_mask, SWAP_CLUSTER_MAX << 2);
>  	} while (--priority);
>  
> --- linux.orig/mm/page_alloc.c	Mon Sep 24 10:36:40 2001
> +++ linux/mm/page_alloc.c	Mon Sep 24 10:44:12 2001
> @@ -400,7 +400,7 @@
>  			if (!z)
>  				break;
>  
> -			if (zone_free_pages(z, order) > z->pages_high) {
> +			if (zone_free_pages(z, order) > z->pages_min) {

that breaks oom detection.

Andrea

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-25 22:09       ` Andrea Arcangeli
@ 2001-09-25 21:25         ` Marcelo Tosatti
  2001-09-25 23:05           ` Andrea Arcangeli
  2001-09-25 22:16         ` Christian Bornträger
  1 sibling, 1 reply; 59+ messages in thread
From: Marcelo Tosatti @ 2001-09-25 21:25 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Paul Larson, Linus Torvalds, Christian Bornträger,
	Jacek Popławski, lkml



On Wed, 26 Sep 2001, Andrea Arcangeli wrote:

> On Mon, Sep 24, 2001 at 09:38:24AM -0300, Marcelo Tosatti wrote:
> > --- linux.orig/mm/vmscan.c	Mon Sep 24 10:36:40 2001
> > +++ linux/mm/vmscan.c	Mon Sep 24 10:54:01 2001
> > @@ -567,6 +567,9 @@
> >  		if (nr_pages <= 0)
> >  			return 1;
> >  
> > +		if (nr_pages < SWAP_CLUSTER_MAX)
> > +			ret |= 1;
> > +
> 
> too permissive (vm-tweaks-1 does something similar, but is not that
> permissive)

Andrea,

Does vm-tweaks-1 fix the current problem we're seeing? 

Also, we have to make sure _all_ progress accounting is being done
correctly (i/dcache, etc). I'll make sure that happens as soon as the OOM
problem is gone.

> >  		ret |= swap_out(priority, classzone, gfp_mask, SWAP_CLUSTER_MAX << 2);
> >  	} while (--priority);
> >  
> > --- linux.orig/mm/page_alloc.c	Mon Sep 24 10:36:40 2001
> > +++ linux/mm/page_alloc.c	Mon Sep 24 10:44:12 2001
> > @@ -400,7 +400,7 @@
> >  			if (!z)
> >  				break;
> >  
> > -			if (zone_free_pages(z, order) > z->pages_high) {
> > +			if (zone_free_pages(z, order) > z->pages_min) {
> 
> that breaks oom detection.

Why? 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 21:03   ` Jacek Popławski
@ 2001-09-24 21:11     ` Jacek Popławski
  0 siblings, 0 replies; 59+ messages in thread
From: Jacek Popławski @ 2001-09-24 21:11 UTC (permalink / raw)
  To: linux-kernel

On Mon, Sep 24, 2001 at 11:03:46PM +0200, Jacek Popławski wrote:
> VM: killing process donkey_s
> free
>              total       used       free     shared    buffers     cached
> Mem:        320616     318732       1884          0        116     305480
> -/+ buffers/cache:      13136     307480
> Swap:       104380       9312      95068

and it's getting worse (donkey_s is already killed!):

             total       used       free     shared    buffers     cached
Mem:        320616     318828       1788          0        112     305596
-/+ buffers/cache:      13120     307496
Swap:       104380      10472      93908
[root@localhost /root]# __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
VM: killing process sendmail
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
VM: killing process sendmail
(...)

PS. uptime=5h, no problems before I started "donkey_s"

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 11:12 ` Marcelo Tosatti
                     ` (2 preceding siblings ...)
  2001-09-24 15:54   ` Olaf Hering
@ 2001-09-24 21:03   ` Jacek Popławski
  2001-09-24 21:11     ` Jacek Popławski
  3 siblings, 1 reply; 59+ messages in thread
From: Jacek Popławski @ 2001-09-24 21:03 UTC (permalink / raw)
  To: linux-kernel

On Mon, Sep 24, 2001 at 08:12:20AM -0300, Marcelo Tosatti wrote:
> Jacek, 
> 
> Did you have swap available when the VM started to kill processes?

The application eats all the memory, then starts using swap. When about 10MB
of swap is in use, the kernel starts to cry:

[root@localhost /root]# free
             total       used       free     shared    buffers     cached
Mem:        320616     317348       3268          0        120     304096
-/+ buffers/cache:      13132     307484
Swap:       104380      10208      94172
[root@localhost /root]# free
             total       used       free     shared    buffers     cached
Mem:        320616     318932       1684          0        136     305372
-/+ buffers/cache:      13424     307192
Swap:       104380      10072      94308
[root@localhost /root]# __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
free
             total       used       free     shared    buffers     cached
Mem:        320616     318884       1732          0        128     305636
-/+ buffers/cache:      13120     307496
Swap:       104380      10204      94176
[root@localhost /root]# __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
VM: killing process donkey_s
free
             total       used       free     shared    buffers     cached
Mem:        320616     318732       1884          0        116     305480
-/+ buffers/cache:      13136     307480
Swap:       104380       9312      95068


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 15:12         ` Marcelo Tosatti
@ 2001-09-24 19:48           ` tpepper
  0 siblings, 0 replies; 59+ messages in thread
From: tpepper @ 2001-09-24 19:48 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Paul Larson, lkml

Just to confirm I'm seeing this also.  I've a machine with 512mb ram and a
gig of swap.  Running a filesystem i/o stress test app causes the machine
to pretty much run out of memory.  The swap is hardly touched.  Then the
VM starts killing things...klogd, the file i/o app, the shell it was in...

I didn't see any significant change here with the patch.

Here's meminfo prior to and towards the end of things for what it's worth:

[root@foobox /root]# cat /proc/meminfo 
        total:    used:    free:  shared: buffers:  cached:
Mem:  526299136 66060288 460238848        0   864256 21032960
Swap: 1074765824  4374528 1070391296
MemTotal:       513964 kB
MemFree:        449452 kB
MemShared:           0 kB
Buffers:           844 kB
Cached:          16268 kB
SwapCached:       4272 kB
Active:           5804 kB
Inactive:        15580 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       513964 kB
LowFree:        449452 kB
SwapTotal:     1049576 kB
SwapFree:      1045304 kB

[root@foobox /root]# cat /proc/meminfo 
        total:    used:    free:  shared: buffers:  cached:
Mem:  526299136 522604544  3694592        0  1212416 435134464
Swap: 1074765824  3366912 1071398912
MemTotal:       513964 kB
MemFree:          3608 kB
MemShared:           0 kB
Buffers:          1184 kB
Cached:         424784 kB
SwapCached:        152 kB
Active:         356640 kB
Inactive:        69480 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       513964 kB
LowFree:          3608 kB
SwapTotal:     1049576 kB
SwapFree:      1046288 kB

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 15:54   ` Olaf Hering
@ 2001-09-24 16:06     ` Olaf Hering
  0 siblings, 0 replies; 59+ messages in thread
From: Olaf Hering @ 2001-09-24 16:06 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: lkml

On Mon, Sep 24, Olaf Hering wrote:

> mandarine:~ # vmstat
>    procs                      memory    swap          io     system         cpu
>  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
>  3  0  1      0   2744  53944 1794968   0   0   440   343   75   300  14  28  58
> mandarine:~ # free
> Killed
> 
> 
> That did not happen with pre10aa1, at least the OOM kills.
> It happened with a bk pull and a build in the background. It seems that it
> doesn't release some memory...

it seems that the cache grows and grows, one bk process was still active. No idea
who to blame, but it should not kill the box :)

27429 pts/0    D      3:59 bk idcache -q


Regards, Olaf

-- 
 $ man clone

BUGS
       Main feature not yet implemented...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 11:12 ` Marcelo Tosatti
  2001-09-24  8:13   ` Paul Larson
  2001-09-24  9:01   ` Paul Larson
@ 2001-09-24 15:54   ` Olaf Hering
  2001-09-24 16:06     ` Olaf Hering
  2001-09-24 21:03   ` Jacek Popławski
  3 siblings, 1 reply; 59+ messages in thread
From: Olaf Hering @ 2001-09-24 15:54 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: lkml

On Mon, Sep 24, Marcelo Tosatti wrote:

> 
> 
> On Mon, 24 Sep 2001, Jacek Popławski wrote:
> 
> > I just installed 2.4.10, and...
> > 
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > VM: killing process donkey_s
> > __alloc_pages: 0-order allocation failed (gfp=0x1f0/0) from c0126c2e
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > VM: killing process screen
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > VM: killing process bash
> > (...)
> > 
> > I change kernels often, but have never seen that kind of message. The last
> > kernel I had before 2.4.10 was 2.4.10-pre4.
> > 
> > PS. donkey_s is an application that eats a lot of memory, but I have 384MB
> > RAM and 100MB swap.
> 
> Jacek, 
> 
> Did you have swap available when the VM started to kill processes?

I see that too with 2.4.10aa1 on a 4way 2gig ppc power3 box without
swap:

mandarine:~ # w
bash: fork: Cannot allocate memory
mandarine:~ # w
bash: /usr/bin/w: Cannot allocate memory
mandarine:~ # w
  5:50pm  up 13 min,  3 users,  load average: 6.27, 3.30, 1.67
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
root     ttyS0    -                 5:39pm  3.00s  3.19s  1.51s  w 
olaf     pts/0    nectarine.suse.d  5:39pm  9:45  43.90s  0.06s  sh do_all 
olh      pts/1    nectarine.suse.d  5:48pm  7.00s  0.79s  0.79s  -bash 
mandarine:~ # free
bash: fork: Cannot allocate memory
mandarine:~ # dmesg | tail
__alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
__alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
__alloc_pages: 0-order allocation failed (gfp=0x70/0)
__alloc_pages: 0-order allocation failed (gfp=0x70/0)
__alloc_pages: 0-order allocation failed (gfp=0x70/0)
__alloc_pages: 0-order allocation failed (gfp=0x70/0)
__alloc_pages: 0-order allocation failed (gfp=0x70/0)
__alloc_pages: 0-order allocation failed (gfp=0x70/0)
__alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
VM: killing process cc1
mandarine:~ # free
             total       used       free     shared    buffers     cached
Mem:       2057304    2052932       4372          0      53480    1792468
-/+ buffers/cache:     206984    1850320
Swap:            0          0          0
mandarine:~ # free
bash: fork: Cannot allocate memory
mandarine:~ # vmstat
bash: fork: Cannot allocate memory
mandarine:~ # vmstat
bash: fork: Cannot allocate memory
mandarine:~ # vmstat
bash: fork: Cannot allocate memory
mandarine:~ # vmstat
bash: fork: Cannot allocate memory
mandarine:~ # vmstat
bash: fork: Cannot allocate memory
mandarine:~ # vmstat
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 3  0  1      0   2744  53944 1794968   0   0   440   343   75   300  14  28  58
mandarine:~ # free
Killed


That did not happen with pre10aa1, at least the OOM kills.
It happened with a bk pull and a build in the background. It seems that it
doesn't release some memory...

Regards, Olaf

-- 
 $ man clone

BUGS
       Main feature not yet implemented...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 12:38     ` Marcelo Tosatti
  2001-09-24 11:35       ` Paul Larson
@ 2001-09-24 15:33       ` Christian Bornträger
  2001-09-25 22:09       ` Andrea Arcangeli
  2 siblings, 0 replies; 59+ messages in thread
From: Christian Bornträger @ 2001-09-24 15:33 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-kernel, Linus Torvalds

> For the people having the allocation failure problems, please try the
> following patch.

I tried it. No success.
dmesg output after running the "bad" program I posted a few mails ago:

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0123cc1
VM: killing process a.out

I forgot to mention:

c0123c60 t page_cache_read
c0123d10 t read_cluster_nonblocking

If you need a backtrace, we could insert a panic into the code to get a 
full backtrace for debugging.

greetings

Christian Bornträger

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 11:35       ` Paul Larson
@ 2001-09-24 15:12         ` Marcelo Tosatti
  2001-09-24 19:48           ` tpepper
  2001-09-26 13:48         ` Andrea Arcangeli
  1 sibling, 1 reply; 59+ messages in thread
From: Marcelo Tosatti @ 2001-09-24 15:12 UTC (permalink / raw)
  To: Paul Larson; +Cc: lkml



On 24 Sep 2001, Paul Larson wrote:

> 
> The patch helped for me, but there are still problems.  I was able to
> run all the way through LTP without it shutting anything down.  When I
> used one of the memory tests to chew up all the ram though, I noticed
> that VM was killing things it shouldn't have.  First thing to get killed
> was cron, then top, then it finally killed mtest01 (the memory test
> mentioned before).

OK, it's good to know that the patch at least helped.

Watch out for another patch in a few hours.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 12:58 ` Christian Bornträger
@ 2001-09-24 13:05   ` Arjan van de Ven
  0 siblings, 0 replies; 59+ messages in thread
From: Arjan van de Ven @ 2001-09-24 13:05 UTC (permalink / raw)
  To: Christian Bornträger; +Cc: linux-kernel

Christian Bornträger wrote:
> 
> > I just installed 2.4.10, and...
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> 
> I saw the same message when running this C++ program.
> 
> int main (int argc, char * argv[]) {
> char * test;
> while (1)
> test=new char[1024];
> }
> 
> My dmesg:
> 
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c01219e7
> VM: killing process a.out

While this program is obviously "bad", it does show that something
is not right. It should print "OOM: killing process a.out", as the
kernel has to deliberately kill this "out of hand" program.
The "VM: killing" message means it could just as easily have killed
another program because of this DoS program...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24  2:02 __alloc_pages: 0-order allocation failed Jacek Popławski
  2001-09-24 11:12 ` Marcelo Tosatti
@ 2001-09-24 12:58 ` Christian Bornträger
  2001-09-24 13:05   ` Arjan van de Ven
  1 sibling, 1 reply; 59+ messages in thread
From: Christian Bornträger @ 2001-09-24 12:58 UTC (permalink / raw)
  To: linux-kernel

> I just installed 2.4.10, and...
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e

I saw the same message when running this C++ program.

int main (int argc, char * argv[]) {
char * test;
while (1)
test=new char[1024];
}

My dmesg:

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c01219e7
VM: killing process a.out

I have 512 MB RAM and no swap.
Actually the system slowed down a lot but worked fine again after the kill.
And a flood ping from another PC showed no lost packets. 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24  8:13   ` Paul Larson
@ 2001-09-24 12:38     ` Marcelo Tosatti
  2001-09-24 11:35       ` Paul Larson
                         ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Marcelo Tosatti @ 2001-09-24 12:38 UTC (permalink / raw)
  To: Paul Larson, Linus Torvalds
  Cc: Christian Bornträger, Jacek Popławski, lkml



On 24 Sep 2001, Paul Larson wrote:

> On 24 Sep 2001 08:12:20 -0300, Marcelo Tosatti wrote:
> > 
> > 
> > On Mon, 24 Sep 2001, Jacek Popławski wrote:
> > 
> > > I just installed 2.4.10, and...
> > > 
> > > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > > VM: killing process donkey_s
> ...
> 
> I'm getting a lot of this with 2.4.10 also.  At the time, I had KDM
> running, but I was coming into the box over telnet and running the
> latest released version of LTP.  The test it appeared to be on at the
> time was a filesystem test called growfiles.  Nothing else was running
> other than these things and standard system services.  The machine has
> 256 MB of ram, and 512 MB swap.  The order that things got killed in
> were sadc, sar, kdm, X, in.telnetd, xinetd (ouch).
> 
> 
> __alloc_pages: 0-order allocation failed (gfp=0xf0/0) from c012b9b2
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
> VM: killing process xinetd
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2

For the people having the allocation failure problems, please try the
following patch. 

Currently the page freeing success accounting is completely broken (it does
not report that a task has made progress when it did), and the page allocation
code uses that information to decide whether it should keep calling
the freeing code.

Please test this. 

--- linux.orig/mm/vmscan.c	Mon Sep 24 10:36:40 2001
+++ linux/mm/vmscan.c	Mon Sep 24 10:54:01 2001
@@ -567,6 +567,9 @@
 		if (nr_pages <= 0)
 			return 1;
 
+		if (nr_pages < SWAP_CLUSTER_MAX)
+			ret |= 1;
+
 		ret |= swap_out(priority, classzone, gfp_mask, SWAP_CLUSTER_MAX << 2);
 	} while (--priority);
 
--- linux.orig/mm/page_alloc.c	Mon Sep 24 10:36:40 2001
+++ linux/mm/page_alloc.c	Mon Sep 24 10:44:12 2001
@@ -400,7 +400,7 @@
 			if (!z)
 				break;
 
-			if (zone_free_pages(z, order) > z->pages_high) {
+			if (zone_free_pages(z, order) > z->pages_min) {
 				page = rmqueue(z, order);
 				if (page)
 					return page;


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 12:38     ` Marcelo Tosatti
@ 2001-09-24 11:35       ` Paul Larson
  2001-09-24 15:12         ` Marcelo Tosatti
  2001-09-26 13:48         ` Andrea Arcangeli
  2001-09-24 15:33       ` Christian Bornträger
  2001-09-25 22:09       ` Andrea Arcangeli
  2 siblings, 2 replies; 59+ messages in thread
From: Paul Larson @ 2001-09-24 11:35 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: lkml


The patch helped for me, but there are still problems.  I was able to
run all the way through LTP without it shutting anything down.  When I
used one of the memory tests to chew up all the ram though, I noticed
that VM was killing things it shouldn't have.  First thing to get killed
was cron, then top, then it finally killed mtest01 (the memory test
mentioned before).


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24  2:02 __alloc_pages: 0-order allocation failed Jacek Popławski
@ 2001-09-24 11:12 ` Marcelo Tosatti
  2001-09-24  8:13   ` Paul Larson
                     ` (3 more replies)
  2001-09-24 12:58 ` Christian Bornträger
  1 sibling, 4 replies; 59+ messages in thread
From: Marcelo Tosatti @ 2001-09-24 11:12 UTC (permalink / raw)
  To: Jacek Popławski; +Cc: lkml



On Mon, 24 Sep 2001, Jacek Popławski wrote:

> I just installed 2.4.10, and...
> 
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> VM: killing process donkey_s
> __alloc_pages: 0-order allocation failed (gfp=0x1f0/0) from c0126c2e
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> VM: killing process screen
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> VM: killing process bash
> (...)
> 
> I change kernels often, but I have never seen this kind of message. The last
> kernel I had before 2.4.10 was 2.4.10-pre4.
> 
> PS. donkey_s is an application that eats a lot of memory, but I have 384MB RAM
> and 100MB swap.

Jacek, 

Did you have swap available when the VM started killing processes?


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 11:12 ` Marcelo Tosatti
  2001-09-24  8:13   ` Paul Larson
@ 2001-09-24  9:01   ` Paul Larson
  2001-09-24 15:54   ` Olaf Hering
  2001-09-24 21:03   ` Jacek Popławski
  3 siblings, 0 replies; 59+ messages in thread
From: Paul Larson @ 2001-09-24  9:01 UTC (permalink / raw)
  To: lkml

I just remembered that my sar output file is still around from before
sar got killed, so I thought you might like to see the tail end of -r on
it.  Memory utilization got pretty high before sar got killed, but swap
was still at almost nothing:

07:34:18    kbmemfree kbmemused  %memused kbmemshrd kbbuffers  kbcached kbswpfree kbswpused  %swpused
07:34:19        59620    194728     76.56         0      6052    146688    530136         0      0.00
07:34:20        50572    203776     80.12         0      6052    155736    530136         0      0.00
07:34:21        41488    212860     83.69         0      6052    164820    530136         0      0.00
07:34:22        32484    221864     87.23         0      6056    173820    530136         0      0.00
07:34:23        23372    230976     90.81         0      6064    182924    530136         0      0.00
07:34:24        14012    240336     94.49         0      6072    192276    530136         0      0.00
07:34:25         4980    249368     98.04         0      6080    201300    530136         0      0.00
07:34:26         3876    250472     98.48         0      6088    202468    530136         0      0.00
07:34:27         3664    250684     98.56         0      3324    205444    530136         0      0.00
07:34:28         3560    250788     98.60         0      1248    207624    530136         0      0.00
07:34:29         3764    250584     98.52         0       244    208424    530136         0      0.00
07:34:30         4180    250168     98.36         0       148    209272    529532       604      0.11
07:34:31         4072    250276     98.40         0       100    213304    526196      3940      0.74
07:34:32         3488    250860     98.63         0        96    218204    519968     10168      1.92
Average:       122043    132305     52.02         0      5848     84739    530055        81      0.02

-Paul Larson
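
The sar figures above can be summarised with a short script. This is only a sketch: the column names come from the sar header line in the message, while the helper names (parse_sar_row, cache_share) are made up for illustration, and it assumes kbmemused already includes buffers and page cache (kbmemfree + kbmemused is the same in every row, which is consistent with that). It shows what share of "used" memory is page cache in each sample, next to the amount of swap in use.

```python
# Column names taken from the "sar -r" header in the message above.
SAR_COLUMNS = (
    "kbmemfree", "kbmemused", "pct_memused", "kbmemshrd", "kbbuffers",
    "kbcached", "kbswpfree", "kbswpused", "pct_swpused",
)

# A few rows copied verbatim from the sar output above.
SAMPLES = """\
07:34:19        59620    194728     76.56         0      6052    146688    530136         0      0.00
07:34:25         4980    249368     98.04         0      6080    201300    530136         0      0.00
07:34:29         3764    250584     98.52         0       244    208424    530136         0      0.00
07:34:32         3488    250860     98.63         0        96    218204    519968     10168      1.92
"""


def parse_sar_row(line):
    """Split one sar data row into (timestamp, {column: value})."""
    fields = line.split()
    values = [float(f) for f in fields[1:]]
    return fields[0], dict(zip(SAR_COLUMNS, values))


def cache_share(row):
    """Percentage of 'used' memory that is page cache (hypothetical helper)."""
    return 100.0 * row["kbcached"] / row["kbmemused"]


if __name__ == "__main__":
    for line in SAMPLES.splitlines():
        stamp, row = parse_sar_row(line)
        print(f"{stamp}  cached = {cache_share(row):5.1f}% of used,"
              f"  swap used = {row['kbswpused']:.0f} kB")
```

On these rows the cached share stays well above half of used memory while swap use remains at or near zero, which matches the observation in the message that memory utilization was high but swap was barely touched.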

On 24 Sep 2001 08:13:42 +0000, Paul Larson wrote:
> I'm getting a lot of this with 2.4.10 also.  At the time, I had KDM
> running, but I was coming into the box over telnet and running the
> latest released version of LTP.  The test it appeared to be on at the
> time was a filesystem test called growfiles.  Nothing else was running
> other than these things and standard system services.  The machine has
> 256 MB of RAM and 512 MB of swap.  Things got killed in this order:
> sadc, sar, kdm, X, in.telnetd, xinetd (ouch).
> 
> 
> __alloc_pages: 0-order allocation failed (gfp=0xf0/0) from c012b9b2
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
> VM: killing process xinetd
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
> 
> Thanks,
> Paul Larson



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: __alloc_pages: 0-order allocation failed
  2001-09-24 11:12 ` Marcelo Tosatti
@ 2001-09-24  8:13   ` Paul Larson
  2001-09-24 12:38     ` Marcelo Tosatti
  2001-09-24  9:01   ` Paul Larson
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 59+ messages in thread
From: Paul Larson @ 2001-09-24  8:13 UTC (permalink / raw)
  To: lkml

On 24 Sep 2001 08:12:20 -0300, Marcelo Tosatti wrote:
> 
> 
> On Mon, 24 Sep 2001, Jacek Popławski wrote:
> 
> > I just installed 2.4.10, and...
> > 
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
> > VM: killing process donkey_s
...

I'm getting a lot of this with 2.4.10 also.  At the time, I had KDM
running, but I was coming into the box over telnet and running the
latest released version of LTP.  The test it appeared to be on at the
time was a filesystem test called growfiles.  Nothing else was running
other than these things and standard system services.  The machine has
256 MB of RAM and 512 MB of swap.  Things got killed in this order:
sadc, sar, kdm, X, in.telnetd, xinetd (ouch).


__alloc_pages: 0-order allocation failed (gfp=0xf0/0) from c012b9b2
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
VM: killing process xinetd
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c012b9b2

Thanks,
Paul Larson


^ permalink raw reply	[flat|nested] 59+ messages in thread

* __alloc_pages: 0-order allocation failed
@ 2001-09-24  2:02 Jacek Popławski
  2001-09-24 11:12 ` Marcelo Tosatti
  2001-09-24 12:58 ` Christian Bornträger
  0 siblings, 2 replies; 59+ messages in thread
From: Jacek Popławski @ 2001-09-24  2:02 UTC (permalink / raw)
  To: linux-kernel

I just installed 2.4.10, and...

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
VM: killing process donkey_s
__alloc_pages: 0-order allocation failed (gfp=0x1f0/0) from c0126c2e
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
VM: killing process screen
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) from c0126c2e
VM: killing process bash
(...)

I change kernels often, but I have never seen this kind of message. The last
kernel I had before 2.4.10 was 2.4.10-pre4.

PS. donkey_s is an application that eats a lot of memory, but I have 384MB RAM
and 100MB swap.
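
The gfp= values in the log lines above are bit masks, so it can help to split them into individual flags when comparing reports. Below is a rough sketch of such a decoder. The flag values are an assumption: they are what I recall from include/linux/mm.h in 2.4.10 and should be checked against the tree actually in use. decode_gfp is a made-up helper name, not a kernel or library function.

```python
# ASSUMPTION: bit values as recalled from include/linux/mm.h in 2.4.10.
# Verify against your own kernel tree before relying on them.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}


def decode_gfp(mask):
    """Return the flag names set in mask, lowest bit first.

    Bits that are not in GFP_FLAGS are reported as one trailing hex string
    instead of being silently dropped.
    """
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    known = sum(bit for bit in GFP_FLAGS if mask & bit)
    leftover = mask & ~known
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    # The three masks that appear in the reports in this thread.
    for mask in (0x1d2, 0x1f0, 0xf0):
        print(f"gfp={mask:#x}: " + " | ".join(decode_gfp(mask)))
```

If those values are right, 0x1d2 carries __GFP_HIGHMEM but not __GFP_HIGH, which would fit a user page allocation, and 0x1f0 is the same mask with __GFP_HIGH set and __GFP_HIGHMEM cleared, which would fit an ordinary kernel allocation. Treat that reading as a hint only.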

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2006-06-01 12:19 UTC | newest]

Thread overview: 59+ messages
2001-09-04 13:11 __alloc_pages: 0-order allocation failed Martin MOKREJŠ
2001-09-04 16:12 ` Daniel Phillips
2001-09-07 12:53   ` Martin MOKREJŠ
2001-09-07 13:06   ` Martin MOKREJŠ
2001-09-07 20:43     ` Daniel Phillips
2001-09-07 21:00     ` Daniel Phillips
2001-09-12 13:06       ` Martin MOKREJŠ
2001-09-19 14:21         ` __alloc_pages: 0-order allocation failed still in -pre12 Martin MOKREJŠ
2001-09-19 15:03           ` Martin MOKREJŠ
2001-09-19 15:16           ` Rik van Riel
2001-09-19 15:51             ` Martin MOKREJŠ
2001-09-19 22:34           ` Shane Wegner
2001-09-19 22:45             ` Andrea Arcangeli
2001-09-20  2:31               ` Shane Wegner
2001-09-20  2:36                 ` Andrea Arcangeli
2001-09-20  2:36                 ` Shane Wegner
2001-09-20  2:52                   ` Andrea Arcangeli
2001-09-20 15:02                     ` Randy.Dunlap
2001-09-21  1:54                       ` Keith Owens
2001-09-20  9:57                 ` Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian Martin MOKREJŠ
2001-09-20 10:10                   ` Magnus Naeslund(f)
2001-09-20 10:26                     ` Martin MOKREJŠ
2001-09-20 10:26                     ` Magnus Naeslund(f)
2001-09-20 10:59                     ` Perf improvements in 2.4.10pre12aa1 Martin MOKREJŠ
2001-09-20 15:28                       ` Martin MOKREJŠ
2001-09-20 15:40                         ` Martin MOKREJŠ
2001-09-20 10:24                   ` [PATCH] Make kernel build numbers work again (was: Re: Cannot compile 2.4.10pre12aa1 with 2.95.2 on Debian) Russell King
2001-09-20 12:54                     ` Alan Cox
2001-09-19 22:39           ` __alloc_pages: 0-order allocation failed still in -pre12 Andrea Arcangeli
2001-09-24  2:02 __alloc_pages: 0-order allocation failed Jacek Popławski
2001-09-24 11:12 ` Marcelo Tosatti
2001-09-24  8:13   ` Paul Larson
2001-09-24 12:38     ` Marcelo Tosatti
2001-09-24 11:35       ` Paul Larson
2001-09-24 15:12         ` Marcelo Tosatti
2001-09-24 19:48           ` tpepper
2001-09-26 13:48         ` Andrea Arcangeli
2001-09-24 15:33       ` Christian Bornträger
2001-09-25 22:09       ` Andrea Arcangeli
2001-09-25 21:25         ` Marcelo Tosatti
2001-09-25 23:05           ` Andrea Arcangeli
2001-09-26 18:15             ` tpepper
2001-09-26 18:29               ` Andrea Arcangeli
2001-09-25 22:16         ` Christian Bornträger
2001-09-25 23:01           ` Andrea Arcangeli
2001-09-25 23:10             ` Christian Bornträger
2001-09-25 23:19               ` Andrea Arcangeli
2001-09-25 23:24             ` Christian Bornträger
2001-09-25 23:46               ` Andrea Arcangeli
2001-09-26  7:29                 ` Christian Bornträger
2001-09-24  9:01   ` Paul Larson
2001-09-24 15:54   ` Olaf Hering
2001-09-24 16:06     ` Olaf Hering
2001-09-24 21:03   ` Jacek Popławski
2001-09-24 21:11     ` Jacek Popławski
2001-09-24 12:58 ` Christian Bornträger
2001-09-24 13:05   ` Arjan van de Ven
2006-05-31 12:09 Oliver König
2006-06-01 12:19 ` Jes Sorensen
