* Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
@ 2009-08-11 16:30 ` Juergen Beisert
  0 siblings, 0 replies; 21+ messages in thread
From: Juergen Beisert @ 2009-08-11 16:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arm-kernel, linux-hotplug

Hi,

I get the following Oops message when "udevadm" is running on an ARM S3C2440
CPU based system:

[...]
starting udevd...done
Unable to handle kernel paging request at virtual address e3540000
pgd = c39d4000
[e3540000] *pgd=00000000
Internal error: Oops: 5 [#1]
Modules linked in:
CPU: 0    Not tainted  (2.6.31-rc4-00296-ge084b2d-dirty #10)
PC is at strlen+0xc/0x20
LR is at kobject_get_path+0x24/0xa4
pc : [<c0115338>]    lr : [<c0111f48>]    psr: a0000013
sp : c39bdea0  ip : 00000005  fp : c029645b
r10: c02e0430  r9 : 00000000  r8 : c3802c60
r7 : c001de30  r6 : 000000d0  r5 : 00000001  r4 : c001de30
r3 : e3550001  r2 : e3540000  r1 : 000000d0  r0 : e3540000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: c000717f  Table: 339d4000  DAC: 00000015
Process udevadm (pid: 325, stack limit = 0xc39bc270)
Stack: (0xc39bdea0 to 0xc39be000)
dea0: c001de28 00000003 c393d778 c399d000 c3802c60 c014dce0 c029645b c0112200
dec0: c393d778 00000000 00000003 c393d780 c399d000 c0112400 00000000 00020001
dee0: c0307fdc 00000000 c3950960 c027ddc3 c380cda0 c343b3d4 c3811e60 00000000
df00: 00000000 c393d778 00000003 c3960520 c393d780 c02e0470 c3960538 c39bdf88
df20: 00019cb0 c014dd80 c382db20 00000000 c3960538 c38ede7c 00000003 c014d464
df40: c38ede7c c00bf224 c382db20 00019cb0 c39bdf88 00000004 00000003 c39bc000
df60: 00000000 c0080bd8 c343b3d4 00000020 00000000 00000000 c382db20 00000004
df80: c0022fa8 c0080d38 00000000 00000000 bead3250 00000000 00026d98 00000003
dfa0: bead3250 c0022e00 00026d98 00000003 00000003 00019cb0 00000003 00000000
dfc0: 00026d98 00000003 bead3250 00000004 00022a90 bead3678 00019cb0 00019cb0
dfe0: 00022a94 bead3250 00018114 400d8f1c 40000010 00000003 00000000 00000000
[<c0115338>] (strlen+0xc/0x20) from [<c0111f48>] (kobject_get_path+0x24/0xa4)
[<c0111f48>] (kobject_get_path+0x24/0xa4) from [<c014dce0>] (dev_uevent+0x1dc/0x208)
[<c014dce0>] (dev_uevent+0x1dc/0x208) from [<c0112400>] (kobject_uevent_env+0x18c/0x3a8)
[<c0112400>] (kobject_uevent_env+0x18c/0x3a8) from [<c014dd80>] (store_uevent+0x74/0x88)
[<c014dd80>] (store_uevent+0x74/0x88) from [<c014d464>] (dev_attr_store+0x20/0x28)
[<c014d464>] (dev_attr_store+0x20/0x28) from [<c00bf224>] (sysfs_write_file+0x104/0x13c)
[<c00bf224>] (sysfs_write_file+0x104/0x13c) from [<c0080bd8>] (vfs_write+0xb0/0x15c)
[<c0080bd8>] (vfs_write+0xb0/0x15c) from [<c0080d38>] (sys_write+0x40/0x6c)
[<c0080d38>] (sys_write+0x40/0x6c) from [<c0022e00>] (ret_fast_syscall+0x0/0x2c)
Code: c02dbe78 e1a02000 ea000000 e2800001 (e5d03000)
---[ end trace 4d391449ae70e71a ]---
Segmentation fault
[...]

This Oops does not occur in v2.6.31-rc4 but does occur in v2.6.31-rc5. Bisected
to the commit:

e084b2d95e48b31aa45f9c49ffc6cdae8bdb21d4
"page-allocator: preserve PFN ordering when __GFP_COLD is set"

But it's curious: the same binary kernel still runs without this Oops on an ARM
S3C2410 CPU.

Regards,
Juergen

-- 
Pengutronix e.K.                              | Juergen Beisert             |
Linux Solutions for Science and Industry      | Phone: +49-8766-939 228     |
Vertretung Sued/Muenchen, Germany             | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686              | http://www.pengutronix.de/  |


* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-11 16:30 ` Juergen Beisert
@ 2009-08-12  7:47   ` Robert Schwebel
  -1 siblings, 0 replies; 21+ messages in thread
From: Robert Schwebel @ 2009-08-12  7:47 UTC (permalink / raw)
  To: Juergen Beisert; +Cc: linux-kernel, linux-arm-kernel, linux-hotplug, Mel Gorman

Hi Jürgen,

adding the patch author to Cc: ...

rsc

On Tue, Aug 11, 2009 at 06:30:02PM +0200, Juergen Beisert wrote:
> Hi,
> 
> I get the following Oops message when "udevadm" is running on an ARM S3C2440
> CPU based system:
> 
> [...]
> starting udevd...done
> Unable to handle kernel paging request at virtual address e3540000
> pgd = c39d4000
> [e3540000] *pgd=00000000
> Internal error: Oops: 5 [#1]
> Modules linked in:
> CPU: 0    Not tainted  (2.6.31-rc4-00296-ge084b2d-dirty #10)
> PC is at strlen+0xc/0x20
> LR is at kobject_get_path+0x24/0xa4
> pc : [<c0115338>]    lr : [<c0111f48>]    psr: a0000013
> sp : c39bdea0  ip : 00000005  fp : c029645b
> r10: c02e0430  r9 : 00000000  r8 : c3802c60
> r7 : c001de30  r6 : 000000d0  r5 : 00000001  r4 : c001de30
> r3 : e3550001  r2 : e3540000  r1 : 000000d0  r0 : e3540000
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: c000717f  Table: 339d4000  DAC: 00000015
> Process udevadm (pid: 325, stack limit = 0xc39bc270)
> Stack: (0xc39bdea0 to 0xc39be000)
> dea0: c001de28 00000003 c393d778 c399d000 c3802c60 c014dce0 c029645b c0112200
> dec0: c393d778 00000000 00000003 c393d780 c399d000 c0112400 00000000 00020001
> dee0: c0307fdc 00000000 c3950960 c027ddc3 c380cda0 c343b3d4 c3811e60 00000000
> df00: 00000000 c393d778 00000003 c3960520 c393d780 c02e0470 c3960538 c39bdf88
> df20: 00019cb0 c014dd80 c382db20 00000000 c3960538 c38ede7c 00000003 c014d464
> df40: c38ede7c c00bf224 c382db20 00019cb0 c39bdf88 00000004 00000003 c39bc000
> df60: 00000000 c0080bd8 c343b3d4 00000020 00000000 00000000 c382db20 00000004
> df80: c0022fa8 c0080d38 00000000 00000000 bead3250 00000000 00026d98 00000003
> dfa0: bead3250 c0022e00 00026d98 00000003 00000003 00019cb0 00000003 00000000
> dfc0: 00026d98 00000003 bead3250 00000004 00022a90 bead3678 00019cb0 00019cb0
> dfe0: 00022a94 bead3250 00018114 400d8f1c 40000010 00000003 00000000 00000000
> [<c0115338>] (strlen+0xc/0x20) from [<c0111f48>] (kobject_get_path+0x24/0xa4)
> [<c0111f48>] (kobject_get_path+0x24/0xa4) from [<c014dce0>] (dev_uevent+0x1dc/0x208)
> [<c014dce0>] (dev_uevent+0x1dc/0x208) from [<c0112400>] (kobject_uevent_env+0x18c/0x3a8)
> [<c0112400>] (kobject_uevent_env+0x18c/0x3a8) from [<c014dd80>] (store_uevent+0x74/0x88)
> [<c014dd80>] (store_uevent+0x74/0x88) from [<c014d464>] (dev_attr_store+0x20/0x28)
> [<c014d464>] (dev_attr_store+0x20/0x28) from [<c00bf224>] (sysfs_write_file+0x104/0x13c)
> [<c00bf224>] (sysfs_write_file+0x104/0x13c) from [<c0080bd8>] (vfs_write+0xb0/0x15c)
> [<c0080bd8>] (vfs_write+0xb0/0x15c) from [<c0080d38>] (sys_write+0x40/0x6c)
> [<c0080d38>] (sys_write+0x40/0x6c) from [<c0022e00>] (ret_fast_syscall+0x0/0x2c)
> Code: c02dbe78 e1a02000 ea000000 e2800001 (e5d03000)
> ---[ end trace 4d391449ae70e71a ]---
> Segmentation fault
> [...]
> 
> This Oops does not occur in v2.6.31-rc4 but does occur in v2.6.31-rc5. Bisected
> to the commit:
> 
> e084b2d95e48b31aa45f9c49ffc6cdae8bdb21d4
> "page-allocator: preserve PFN ordering when __GFP_COLD is set"
> 
> But it's curious: the same binary kernel still runs without this Oops on an ARM
> S3C2410 CPU.
> 
> Regards,
> Juergen
> 
> -- 
> Pengutronix e.K.                              | Juergen Beisert             |
> Linux Solutions for Science and Industry      | Phone: +49-8766-939 228     |
> Vertretung Sued/Muenchen, Germany             | Fax:   +49-5121-206917-5555 |
> Amtsgericht Hildesheim, HRA 2686              | http://www.pengutronix.de/  |

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-12  7:47   ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Robert Schwebel
@ 2009-08-12  9:20     ` Mel Gorman
  -1 siblings, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2009-08-12  9:20 UTC (permalink / raw)
  To: Robert Schwebel
  Cc: Juergen Beisert, linux-kernel, linux-arm-kernel, linux-hotplug

On Wed, Aug 12, 2009 at 09:47:53AM +0200, Robert Schwebel wrote:
> Hi Jürgen,
> 
> adding the patch author to Cc: ...
> 
> rsc
> 
> On Tue, Aug 11, 2009 at 06:30:02PM +0200, Juergen Beisert wrote:
> > Hi,
> > 
> > I get the following Oops message when "udevadm" is running on an ARM S3C2440
> > CPU based system:
> > 

This is extremely odd. All that patch is doing is changing what order pages
are returned in to the caller when __GFP_COLD is specified.  valid memory. Does
reverting the patch really make the problem go away?

> > [...]
> > starting udevd...done
> > Unable to handle kernel paging request at virtual address e3540000
> > pgd = c39d4000
> > [e3540000] *pgd=00000000
> > Internal error: Oops: 5 [#1]
> > Modules linked in:
> > CPU: 0    Not tainted  (2.6.31-rc4-00296-ge084b2d-dirty #10)
> > PC is at strlen+0xc/0x20
> > LR is at kobject_get_path+0x24/0xa4

I haven't tackled this sort of bug before but it looks more likely that
there is garbage in the sysfs tree that is being tripped up on.

> > pc : [<c0115338>]    lr : [<c0111f48>]    psr: a0000013
> > sp : c39bdea0  ip : 00000005  fp : c029645b
> > r10: c02e0430  r9 : 00000000  r8 : c3802c60
> > r7 : c001de30  r6 : 000000d0  r5 : 00000001  r4 : c001de30
> > r3 : e3550001  r2 : e3540000  r1 : 000000d0  r0 : e3540000
> > Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> > Control: c000717f  Table: 339d4000  DAC: 00000015
> > Process udevadm (pid: 325, stack limit = 0xc39bc270)
> > Stack: (0xc39bdea0 to 0xc39be000)
> > dea0: c001de28 00000003 c393d778 c399d000 c3802c60 c014dce0 c029645b c0112200
> > dec0: c393d778 00000000 00000003 c393d780 c399d000 c0112400 00000000 00020001
> > dee0: c0307fdc 00000000 c3950960 c027ddc3 c380cda0 c343b3d4 c3811e60 00000000
> > df00: 00000000 c393d778 00000003 c3960520 c393d780 c02e0470 c3960538 c39bdf88
> > df20: 00019cb0 c014dd80 c382db20 00000000 c3960538 c38ede7c 00000003 c014d464
> > df40: c38ede7c c00bf224 c382db20 00019cb0 c39bdf88 00000004 00000003 c39bc000
> > df60: 00000000 c0080bd8 c343b3d4 00000020 00000000 00000000 c382db20 00000004
> > df80: c0022fa8 c0080d38 00000000 00000000 bead3250 00000000 00026d98 00000003
> > dfa0: bead3250 c0022e00 00026d98 00000003 00000003 00019cb0 00000003 00000000
> > dfc0: 00026d98 00000003 bead3250 00000004 00022a90 bead3678 00019cb0 00019cb0
> > dfe0: 00022a94 bead3250 00018114 400d8f1c 40000010 00000003 00000000 00000000
> > [<c0115338>] (strlen+0xc/0x20) from [<c0111f48>] (kobject_get_path+0x24/0xa4)
> > [<c0111f48>] (kobject_get_path+0x24/0xa4) from [<c014dce0>] (dev_uevent+0x1dc/0x208)
> > [<c014dce0>] (dev_uevent+0x1dc/0x208) from [<c0112400>] (kobject_uevent_env+0x18c/0x3a8)
> > [<c0112400>] (kobject_uevent_env+0x18c/0x3a8) from [<c014dd80>] (store_uevent+0x74/0x88)
> > [<c014dd80>] (store_uevent+0x74/0x88) from [<c014d464>] (dev_attr_store+0x20/0x28)
> > [<c014d464>] (dev_attr_store+0x20/0x28) from [<c00bf224>] (sysfs_write_file+0x104/0x13c)
> > [<c00bf224>] (sysfs_write_file+0x104/0x13c) from [<c0080bd8>] (vfs_write+0xb0/0x15c)
> > [<c0080bd8>] (vfs_write+0xb0/0x15c) from [<c0080d38>] (sys_write+0x40/0x6c)
> > [<c0080d38>] (sys_write+0x40/0x6c) from [<c0022e00>] (ret_fast_syscall+0x0/0x2c)
> > Code: c02dbe78 e1a02000 ea000000 e2800001 (e5d03000)
> > ---[ end trace 4d391449ae70e71a ]---
> > Segmentation fault
> > [...]
> > 
> > This Oops does not occur in v2.6.31-rc4 but does occur in v2.6.31-rc5. Bisected
> > to the commit:
> > 
> > e084b2d95e48b31aa45f9c49ffc6cdae8bdb21d4
> > "page-allocator: preserve PFN ordering when __GFP_COLD is set"
> > 
> > But it's curious: the same binary kernel still runs without this Oops on an ARM
> > S3C2410 CPU.
> > 
> > Regards,
> > Juergen
> > 
> > -- 
> > Pengutronix e.K.                              | Juergen Beisert             |
> > Linux Solutions for Science and Industry      | Phone: +49-8766-939 228     |
> > Vertretung Sued/Muenchen, Germany             | Fax:   +49-5121-206917-5555 |
> > Amtsgericht Hildesheim, HRA 2686              | http://www.pengutronix.de/  |
> 
> -- 
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-12  9:20     ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Mel Gorman
@ 2009-08-12 11:11       ` Juergen Beisert
  -1 siblings, 0 replies; 21+ messages in thread
From: Juergen Beisert @ 2009-08-12 11:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mel Gorman, linux-arm-kernel, linux-hotplug

On Wednesday, 12 August 2009, Mel Gorman wrote:
> > > I get the following Oops message when "udevadm" is running on an ARM
> > > S3C2440 CPU based system:
>
> This is extremely odd. All that patch is doing is changing what order pages
> are returned in to the caller when __GFP_COLD is specified.  valid memory.
> Does reverting the patch really make the problem go away?

At least I can work with the system if I remove this patch. There is no oops,
so udev creates all the required devnodes and the system comes up into the 
login prompt.

> > > [...]
> > > starting udevd...done
> > > Unable to handle kernel paging request at virtual address e3540000
> > > pgd = c39d4000
> > > [e3540000] *pgd=00000000
> > > Internal error: Oops: 5 [#1]
> > > Modules linked in:
> > > CPU: 0    Not tainted  (2.6.31-rc4-00296-ge084b2d-dirty #10)
> > > PC is at strlen+0xc/0x20
> > > LR is at kobject_get_path+0x24/0xa4
>
> I haven't tackled this sort of bug before but it looks more likely that
> there is garbage in the sysfs tree that is being tripped up on.

Yes, I think so, too. Because the same binary rc5 image runs on an S3C2410 CPU 
without an oops, but oopses on an S3C2440 (both CPUs are nearly the same, but 
only nearly). But how to track down such a failure?

Regards,
Juergen

-- 
Pengutronix e.K.                              | Juergen Beisert             |
Linux Solutions for Science and Industry      | Phone: +49-8766-939 228     |
Vertretung Sued/Muenchen, Germany             | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686              | http://www.pengutronix.de/  |


* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-12 11:11       ` Juergen Beisert
@ 2009-08-12 13:50         ` Mel Gorman
  -1 siblings, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2009-08-12 13:50 UTC (permalink / raw)
  To: Juergen Beisert; +Cc: linux-kernel, linux-arm-kernel, linux-hotplug

On Wed, Aug 12, 2009 at 01:11:34PM +0200, Juergen Beisert wrote:
> On Wednesday, 12 August 2009, Mel Gorman wrote:
> > > > I get the following Oops message when "udevadm" is running on an ARM
> > > > S3C2440 CPU based system:
> >
> > This is extremely odd. All that patch is doing is changing what order pages
> > are returned in to the caller when __GFP_COLD is specified.  valid memory.
> > Does reverting the patch really make the problem go away?
> 
> At least I can work with the system if I remove this patch. There is no oops,
> so udev creates all the required devnodes and the system comes up into the 
> login prompt.
> 

One reason I can think of that the patch would make a difference to booting
is that there is a buffer overrun somewhere. When the pages are in one order,
the buffer overrun is into pages that are not being used, so it's not spotted.
In the other order, the overrun causes damage. The patch only alters the
order of pages in a linked list and ordinarily that shouldn't make any
functional difference.

Can you enable the config option DEBUG_PAGEALLOC please and tell me if
that blows up in some unexpected fashion? It would also be helpful if
you could enable all slab/slqb/slub debugging (whichever one you are
using).

> > > > [...]
> > > > starting udevd...done
> > > > Unable to handle kernel paging request at virtual address e3540000
> > > > pgd = c39d4000
> > > > [e3540000] *pgd=00000000
> > > > Internal error: Oops: 5 [#1]
> > > > Modules linked in:
> > > > CPU: 0    Not tainted  (2.6.31-rc4-00296-ge084b2d-dirty #10)
> > > > PC is at strlen+0xc/0x20
> > > > LR is at kobject_get_path+0x24/0xa4
> >
> > I haven't tackled this sort of bug before but it looks more likely that
> > there is garbage in the sysfs tree that is being tripped up on.
> 
> Yes, I think so, too, because the same binary rc5 image runs on an S3C2410 CPU
> without an oops, but oopses on an S3C2440 (both CPUs are nearly the same, but
> only nearly). But how do I track down such a failure?
> 

Let's start with a full dmesg with CONFIG_DEBUG_KOBJECT and
CONFIG_DEBUG_OBJECTS set and see if anything springs up that looks
unusual on that platform.

Thanks

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-12 13:50         ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Mel Gorman
@ 2009-08-12 15:35           ` Juergen Beisert
  -1 siblings, 0 replies; 21+ messages in thread
From: Juergen Beisert @ 2009-08-12 15:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mel Gorman, linux-arm-kernel, linux-hotplug

On Mittwoch, 12. August 2009, Mel Gorman wrote:
> On Wed, Aug 12, 2009 at 01:11:34PM +0200, Juergen Beisert wrote:
> > On Mittwoch, 12. August 2009, Mel Gorman wrote:
> > > > > I get the following Ooops message when "udevadm" is running on an
> > > > > ARM S3C2440 CPU based system:
> > >
> > > This is extremely odd. All that patch is doing is changing what order
> > > pages are returned in to the caller when __GFP_COLD is specified. 
> > > valid memory. Does reverting the patch really make the problem go away?
> >
> > > At least I can work with the system if I remove this patch. There is no
> > oops, so udev creates all the required devnodes and the system comes up
> > into the login prompt.
>
> One reason I can think of that the patch would make a difference to booting
> is that there is a buffer overrun somewhere. When the pages are in one order,
> the buffer overrun is into pages that are not being used so it's not
> spotted. In the other order, the overrun causes damage. The patch only
> alters the order of pages in a linked list and ordinarily that shouldn't
> make any functional difference.
>
> Can you enable the config option DEBUG_PAGEALLOC please and tell me if
> that blows up in some unexpected fashion? It would also be helpful if
> you could enable all slab/slqb/slub debugging (whichever one you are
> using).

DEBUG_PAGEALLOC=y and DEBUG_SLAB=y do not change anything. Only
DEBUG_KOBJECT=y shows this:

[...]
kobject: 'default' (c399af48): fill_kobj_path: path = '/class/bdi/default'
kobject: 'gpiochip0' (c3946f48): kobject_uevent_env
kobject: 'gpiochip0' (c3946f48): fill_kobj_path: path = '/class/gpio/gpiochip0'
kobject: 'gpiochip128' (c396af48): kobject_uevent_env
kobject: 'gpiochip128' (c396af48): fill_kobj_path: path = '/class/gpio/gpiochip128'
kobject: 'gpiochip160' (c396cf48): kobject_uevent_env
kobject: 'gpiochip160' (c396cf48): fill_kobj_path: path = '/class/gpio/gpiochip160'
kobject: 'gpiochip192' (c3962f48): kobject_uevent_env
kobject: 'gpiochip192' (c3962f48): fill_kobj_path: path = '/class/gpio/gpiochip192'
kobject: 'gpiochip224' (c3970f48): kobject_uevent_env
kobject: 'gpiochip224' (c3970f48): fill_kobj_path: path = '/class/gpio/gpiochip224'
kobject: 'gpiochip32' (c3964f48): kobject_uevent_env
kobject: 'gpiochip32' (c3964f48): fill_kobj_path: path = '/class/gpio/gpiochip32'
kobject: 'gpiochip64' (c3966f48): kobject_uevent_env
kobject: 'gpiochip64' (c3966f48): fill_kobj_path: path = '/class/gpio/gpiochip64'
kobject: 'gpiochip96' (c3969f48): kobject_uevent_env
kobject: 'gpiochip96' (c3969f48): fill_kobj_path: path = '/class/gpio/gpiochip96'
kobject: 'fb0' (c3bbff48): kobject_uevent_env
kobject: 'fb0' (c3bbff48): fill_kobj_path: path = '/class/graphics/fb0'
kobject: 's3c2410-lcd' (c02dc468): fill_kobj_path: path = '/devices/platform/s3c2410-lcd'
kobject: 'fbcon' (c3b62f48): kobject_uevent_env
kobject: 'fbcon' (c3b62f48): fill_kobj_path: path = '/class/graphics/fbcon'
kobject: 'i2c-0' (c39a3e78): kobject_uevent_env
kobject: 'i2c-0' (c39a3e78): fill_kobj_path: path = '/class/i2c-adapter/i2c-0'
kobject: 's3c2440-i2c' (c02de368): fill_kobj_path: path = '/devices/platform/s3c2440-i2c'
kobject: '0-0068' (c3996f28): kobject_uevent_env
kobject: '0-0068' (c3996f28): fill_kobj_path: path = '/class/i2c-adapter/i2c-0/0-0068'
kobject: 'input0' (c3bba780): kobject_uevent_env
kobject: 'input0' (c3bba780): fill_kobj_path: path = '/class/input/input0'
Unable to handle kernel paging request at virtual address e3540000
pgd = c3ef8000
[e3540000] *pgd=00000000
Internal error: Oops: 5 [#1]
Modules linked in:
CPU: 0    Not tainted  (2.6.31-rc5 #8)
PC is at strlen+0xc/0x20
LR is at kobject_get_path+0x28/0xd0
pc : [<c011a3a4>]    lr : [<c0116ee4>]    psr: a0000013
sp : c3ebde90  ip : 00000005  fp : c029fdce
r10: c02ea490  r9 : 00000000  r8 : c001de30
r7 : c3e58000  r6 : 000000d0  r5 : 00000001  r4 : c001de30
r3 : e3550001  r2 : e3540000  r1 : 000000d0  r0 : e3540000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: c000717f  Table: 33ef8000  DAC: 00000015
Process udevadm (pid: 332, stack limit = 0xc3ebc270)
Stack: (0xc3ebde90 to 0xc3ebe000)
de80:                                     ffffffff ffffffff a0000013 c001de28
dea0: c3bba780 c3bba778 c3e58000 c3802ce0 c02ea490 c0153a68 c029fdce c01171c4
dec0: c3bba780 00000000 c3bba780 c3d07da0 c3e58000 c0117420 00000000 00020001
dee0: 00000000 00000000 c3f2a6c0 c0286d79 c3e67f70 c3797f70 c380bf40 00000000
df00: 00000000 c3bba778 00000003 c3d07da0 c3bba780 c02ea4d0 c3d07db8 c3ebdf88
df20: 00019cb0 c0153b08 c3efcf78 00000000 c3d07db8 c3d00d10 00000003 c01531ec
df40: c3d00d10 c00c3f6c c3efcf78 00019cb0 c3ebdf88 00000004 00000003 c3ebc000
df60: 00000000 c0085734 c3797f70 00000020 00000000 00000000 c3efcf78 00000004
df80: c0025fa8 c0085894 00000000 00000000 be878250 00000000 00026e20 00000003
dfa0: be878250 c0025e00 00026e20 00000003 00000003 00019cb0 00000003 00000000
dfc0: 00026e20 00000003 be878250 00000004 00022a90 be878678 00019cb0 00019cb0
dfe0: 00022a94 be878250 00018114 400d8f1c 40000010 00000003 aaaaaaaa aaaaaaaa
[<c011a3a4>] (strlen+0xc/0x20) from [<c0116ee4>] (kobject_get_path+0x28/0xd0)
[<c0116ee4>] (kobject_get_path+0x28/0xd0) from [<c0153a68>] (dev_uevent+0x1dc/0x208)
[<c0153a68>] (dev_uevent+0x1dc/0x208) from [<c0117420>] (kobject_uevent_env+0x1e8/0x450)
[<c0117420>] (kobject_uevent_env+0x1e8/0x450) from [<c0153b08>] (store_uevent+0x74/0x88)
[<c0153b08>] (store_uevent+0x74/0x88) from [<c01531ec>] (dev_attr_store+0x20/0x28)
[<c01531ec>] (dev_attr_store+0x20/0x28) from [<c00c3f6c>] (sysfs_write_file+0x104/0x13c)
[<c00c3f6c>] (sysfs_write_file+0x104/0x13c) from [<c0085734>] (vfs_write+0xb0/0x15c)
[<c0085734>] (vfs_write+0xb0/0x15c) from [<c0085894>] (sys_write+0x40/0x6c)
[<c0085894>] (sys_write+0x40/0x6c) from [<c0025e00>] (ret_fast_syscall+0x0/0x2c)
Code: c02e5eb8 e1a02000 ea000000 e2800001 (e5d03000)
---[ end trace a8d7bdeb081ef0e0 ]---
Segmentation fault
[...]

After this oops, system startup continues. Then the next oops occurs.

This one is new, because I now try to mount the connected SD card.

[...]
Unable to handle kernel paging request at virtual address fcd00004
pgd = c3fec000
[fcd00004] *pgd=00000000
Internal error: Oops: 5 [#2]
Modules linked in:
CPU: 0    Tainted: G      D     (2.6.31-rc5 #8)
PC is at s3c2410_gpio_getpin+0xc/0x20
LR is at s3cmci_card_present+0x1c/0x38
pc : [<c002fe50>]    lr : [<c01a250c>]    psr: 20000013
sp : c3d8de48  ip : 00000200  fp : 00000000
r10: c3d7dfa4  r9 : c3d8de90  r8 : 00000001
r7 : c3d28e00  r6 : c3d8de90  r5 : c3d42800  r4 : c001ded8
r3 : fcd00000  r2 : 00000000  r1 : c3d8de90  r0 : 03a00002
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: c000717f  Table: 33fec000  DAC: 00000017
Process mmcqd (pid: 305, stack limit = 0xc3d8c270)
Stack: (0xc3d8de48 to 0xc3d8e000)
de40:                   c3d42940 c01a2ba0 c3d8de90 c3d42800 c3f1df28 c0199d04
de60: c3d7dfa4 00000000 c3d8de68 c3d8de68 c3d8de90 c3f1df28 c3d7dfa4 c019f630
de80: 90cf85d1 3d2e1a39 47260d9e c3d8dea4 c3d8dea4 c3d8defc 00000000 c3d8de64
dea0: c0199d8c 00000011 00004000 00000000 00000000 00000000 00000000 000000b5
dec0: 00000000 00000000 c3d8defc c3d8de90 0000000c 00000000 00000000 00000000
dee0: 00000000 00000000 0000049d 00000000 00000000 00000000 00000000 02faf080
df00: 00000000 00000200 00000001 00000000 00000200 00000000 00000000 c3d8de90
df20: 00000001 c3d89800 c02de5a0 c38bcd40 c02de5a0 c0034114 c3d73d08 c02de568
df40: c38bcd38 c39f001c c3d73d08 00000017 c3d73d08 c3d82ba0 00000000 00000000
df60: c3d7dfa4 00000000 c3d7dfa4 c3d82ba0 00000000 c010b4e4 c3d73d08 c3f1df28
df80: 00000000 c3f1df28 c3d7dfa4 c3d8c000 c3d7dfac c3d82ba0 00000000 c3d82cb0
dfa0: 00000001 c019fea8 c39f1dfc c3d8dfd4 c39f1dfc c3d7dfa4 c019fdc4 00000000
dfc0: 00000000 00000000 00000000 c004a7e8 00000000 00000000 c3d8dfd8 c3d8dfd8
dfe0: 00000000 00000000 00000000 00000000 00000000 c0026dd8 fb0f7272 885b61ba
[<c002fe50>] (s3c2410_gpio_getpin+0xc/0x20) from [<c01a250c>] (s3cmci_card_present+0x1c/0x38)
[<c01a250c>] (s3cmci_card_present+0x1c/0x38) from [<c01a2ba0>] (s3cmci_request+0x28/0x60)
[<c01a2ba0>] (s3cmci_request+0x28/0x60) from [<c0199d04>] (mmc_wait_for_req+0x104/0x11c)
[<c0199d04>] (mmc_wait_for_req+0x104/0x11c) from [<c019f630>] (mmc_blk_issue_rq+0x1c4/0x53c)
[<c019f630>] (mmc_blk_issue_rq+0x1c4/0x53c) from [<c019fea8>] (mmc_queue_thread+0xe4/0xe8)
[<c019fea8>] (mmc_queue_thread+0xe4/0xe8) from [<c004a7e8>] (kthread+0x74/0x78)
[<c004a7e8>] (kthread+0x74/0x78) from [<c0026dd8>] (kernel_thread_exit+0x0/0x8)
Code: e8bd8010 e3c0301f e1a030a3 e28334fb (e5933004)
---[ end trace a8d7bdeb081ef0e1 ]---

It seems to be something GPIO related, because '/class/input/input0' is also
a GPIO-based button. I will continue tomorrow.

Regards,
Juergen

-- 
Pengutronix e.K.                              | Juergen Beisert             |
Linux Solutions for Science and Industry      | Phone: +49-8766-939 228     |
Vertretung Sued/Muenchen, Germany             | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686              | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-12 15:35           ` Juergen Beisert
@ 2009-08-12 18:40             ` Arnaud Faucher
  -1 siblings, 0 replies; 21+ messages in thread
From: Arnaud Faucher @ 2009-08-12 18:40 UTC (permalink / raw)
  To: Juergen Beisert; +Cc: linux-kernel, Mel Gorman, linux-arm-kernel, linux-hotplug

I have a rather similar problem on a driver that I try to keep
up-to-date with recent kernel versions
(http://code.ximeta.com/trac-ndas/ticket/1110#comment:30). The NDAS
hardware is an ethernet-enabled disk controller on one chip, kind of a
cheap iSCSI.

In my case there is no oops: the symptoms are that the read blocks seem
to be swapped or full of garbage.

After investigation in the NDAS code, the bug triggers when the driver
tries to merge adjacent requests before sending them to the controller.
I had to disable this merge in order to restore normal behavior, at the
expense of a reduced efficiency.

> On Mittwoch, 12. August 2009, Mel Gorman wrote:
> > On Wed, Aug 12, 2009 at 01:11:34PM +0200, Juergen Beisert wrote:
> > > On Mittwoch, 12. August 2009, Mel Gorman wrote:
> > > > > > I get the following Ooops message when "udevadm" is running on an
> > > > > > ARM S3C2440 CPU based system:
> > > >
> > > > This is extremely odd. All that patch is doing is changing what order
> > > > pages are returned in to the caller when __GFP_COLD is specified. 
> > > > valid memory. Does reverting the patch really make the problem go away?
> > >
> > > > At least I can work with the system if I remove this patch. There is no
> > > oops, so udev creates all the required devnodes and the system comes up
> > > into the login prompt.
> >
> > One reason I can think of that the patch would make a difference to booting
> > is that there is a buffer overrun somewhere. When the pages are in one order,
> > the buffer overrun is into pages that are not being used so it's not
> > spotted. In the other order, the overrun causes damage. The patch only
> > alters the order of pages in a linked list and ordinarily that shouldn't
> > make any functional difference.
> >
[...]
> After this oops, system startup continues. Then the next oops occurs:
> 
> This one is new, since I try to mount the connected SD card.
> 

Mel's buffer overrun theory seems to apply in the NDAS driver case,
where the original request adjacency test seems faulty.

Could it also be the cause of the SD mounting crash?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-12 18:40             ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Arnaud Faucher
@ 2009-08-13  8:39               ` Mel Gorman
  -1 siblings, 0 replies; 21+ messages in thread
From: Mel Gorman @ 2009-08-13  8:39 UTC (permalink / raw)
  To: Arnaud Faucher
  Cc: Juergen Beisert, linux-kernel, linux-arm-kernel, linux-hotplug

On Wed, Aug 12, 2009 at 02:40:30PM -0400, Arnaud Faucher wrote:
> I have a rather similar problem on a driver that I try to keep
> up-to-date with recent kernel versions
> (http://code.ximeta.com/trac-ndas/ticket/1110#comment:30). The NDAS
> hardware is an ethernet-enabled disk controller on one chip, kind of a
> cheap iSCSI.
> 
> In my case there is no oops: the symptoms are that the read blocks seem
> to be swapped or full of garbage.
> 
> After investigation in the NDAS code, the bug triggers when the driver
> tries to merge adjacent requests before sending them to the controller.
> I had to disable this merge in order to restore normal behavior, at the
> expense of a reduced efficiency.
> 

That is a very interesting point and one I hadn't considered. The point
of the patch was to help drivers that merge adjacent requests if they
happen to be physically contiguous. The reported bug that led to the
patch was a regression of memory not being physically contiguous and
requests not being merged.
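
As a rough sketch of the kind of adjacency test such a driver might apply
(the struct req layout and can_merge() are hypothetical, and this is neither
the NDAS code nor the block layer): two requests are merged only if the
second starts where the first ends on the device and its buffer starts at
the page frame directly after the first buffer.

```c
#include <assert.h>

/* Hypothetical request descriptor, not a kernel structure. */
struct req {
    unsigned long sector;  /* first sector on the device */
    unsigned long nsect;   /* length in sectors */
    unsigned long pfn;     /* page frame number of the first buffer page */
    unsigned long npages;  /* number of buffer pages */
};

/* Returns 1 if b may be appended to a, 0 otherwise. */
static int can_merge(const struct req *a, const struct req *b)
{
    int disk_adjacent, mem_contiguous;

    assert(a && b);

    /* b must begin at the sector directly after the end of a. */
    disk_adjacent = (a->sector + a->nsect == b->sector);

    /* b's buffer must begin at the page frame directly after a's buffer. */
    mem_contiguous = (a->pfn + a->npages == b->pfn);

    return disk_adjacent && mem_contiguous;
}
```

A test that checks only one of the two conditions, or that assumes a
particular PFN order, would accept requests it cannot safely merge.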

> > After this oops, system startup continues. Then the next oops occurs:
> > 
> > This one is new, since I try to mount the connected SD card.
> > 
> 
> Mel's buffer overrun theory seems to apply in the NDAS driver case,
> where the original requests adjacency test seems faulty.
> 
> Could it also be the cause of the SD mounting crash?
> 

It's a possibility. If it's not an overrun, it's possible that the automatic
merging code is buggy as well.

Juergen, is the disk controller on your machine capable of merging
requests? If so, can you disable it and see if the bug still occurs
please?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-13  8:39               ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Mel Gorman
@ 2009-08-13  9:22                 ` Juergen Beisert
  -1 siblings, 0 replies; 21+ messages in thread
From: Juergen Beisert @ 2009-08-13  9:22 UTC (permalink / raw)
  To: ben-linux
  Cc: linux-kernel, Mel Gorman, Arnaud Faucher, linux-arm-kernel,
	linux-hotplug

On Thursday, 13 August 2009, Mel Gorman wrote:
> On Wed, Aug 12, 2009 at 02:40:30PM -0400, Arnaud Faucher wrote:
> > I have a rather similar problem on a driver that I try to keep
> > up-to-date with recent kernel versions
> > (http://code.ximeta.com/trac-ndas/ticket/1110#comment:30). The NDAS
> > hardware is an ethernet-enabled disk controller on one chip, kind of a
> > cheap iSCSI.
> >
> > In my case there is no oops: the symptoms are that the read blocks seem
> > to be swapped or full of garbage.
> >
> > Investigation of the NDAS code shows that the bug triggers when the driver
> > tries to merge adjacent requests before sending them to the controller.
> > I had to disable this merge in order to restore normal behavior, at the
> > expense of reduced efficiency.
>
> That is a very interesting point and one I hadn't considered. The point
> of the patch was to help drivers that merge adjacent requests if they
> happen to be physically contiguous. The reported bug that led to the
> patch was a regression of memory not being physically contiguous and
> requests not being merged.
>
> > > After this oops, system startup continues. Then the next oops occurs:
> > >
> > > This one is new, since I try to mount the connected SD card.
> >
> > Mel's buffer overrun theory seems to apply in the NDAS driver case,
> > where the original request adjacency test seems faulty.
> >
> > Could it also be the cause of the SD mounting crash?
>
> It's a possibility. If it's not an overrun, it's possible that the
> automatic merging code is buggy as well.
>
> Juergen, is the disk controller on your machine capable of merging
> requests? If so, can you disable it and see if the bug still occurs
> please?

Hmmm, hard to say. Maybe the author of this driver can say more.

@Ben: MMC/SD/SDHC driver for the s3c2440-CPU. Can you answer Mel's question? 

Regards,
Juergen

-- 
Pengutronix e.K.                              | Juergen Beisert             |
Linux Solutions for Science and Industry      | Phone: +49-8766-939 228     |
Vertretung Sued/Muenchen, Germany             | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686              | http://www.pengutronix.de/  |


* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
  2009-08-13  9:22                 ` Juergen Beisert
@ 2009-10-16  8:10                   ` Juergen Beisert
  -1 siblings, 0 replies; 21+ messages in thread
From: Juergen Beisert @ 2009-10-16  8:10 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Mel Gorman, linux-kernel

On Thursday, 13 August 2009, Juergen Beisert wrote:
> On Thursday, 13 August 2009, Mel Gorman wrote:
> > On Wed, Aug 12, 2009 at 02:40:30PM -0400, Arnaud Faucher wrote:
> > > I have a rather similar problem on a driver that I try to keep
> > > up-to-date with recent kernel versions
> > > (http://code.ximeta.com/trac-ndas/ticket/1110#comment:30). The NDAS
> > > hardware is an ethernet-enabled disk controller on one chip, kind of a
> > > cheap iSCSI.
> > >
> > > In my case there is no oops: the symptoms are that the read blocks seem
> > > to be swapped or full of garbage.
> > >
> > > Investigation of the NDAS code shows that the bug triggers when the driver
> > > tries to merge adjacent requests before sending them to the controller.
> > > I had to disable this merge in order to restore normal behavior, at the
> > > expense of reduced efficiency.
> >
> > That is a very interesting point and one I hadn't considered. The point
> > of the patch was to help drivers that merge adjacent requests if they
> > happen to be physically contiguous. The reported bug that led to the
> > patch was a regression of memory not being physically contiguous and
> > requests not being merged.
> >
> > > > After this oops, system startup continues. Then the next oops occurs:
> > > >
> > > > This one is new, since I try to mount the connected SD card.
> > >
> > > Mel's buffer overrun theory seems to apply in the NDAS driver case,
> > > where the original request adjacency test seems faulty.
> > >
> > > Could it also be the cause of the SD mounting crash?
> >
> > It's a possibility. If it's not an overrun, it's possible that the
> > automatic merging code is buggy as well.
> >
> > Juergen, is the disk controller on your machine capable of merging
> > requests? If so, can you disable it and see if the bug still occurs
> > please?
>
> Hmmm, hard to say. Maybe the author of this driver can say more.
>
> @Ben: MMC/SD/SDHC driver for the s3c2440-CPU. Can you answer Mel's
> question?

For the record: a wrong __initdata annotation on the MMC/SD/SDHC platform
structure caused this failure. I had copied it from a broken implementation
in mach-mini2440.c.
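
Background on why a stray __initdata has this effect: data tagged __initdata
is placed in the kernel's init section, and that memory is released and reused
once boot-time initialisation has finished. A driver that still holds a pointer
to such a structure then reads whatever was written there later. The following
is a minimal userspace sketch of that failure mode, not the actual board code.
`struct mmc_pdata`, `struct fake_device`, `board_init()` and
`release_init_memory()` are hypothetical names, and the reuse of init memory is
modelled by overwriting the structure with a poison byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an MMC platform-data structure. */
struct mmc_pdata {
    unsigned int ocr_avail;
    unsigned int gpio_detect;
};

/* Hypothetical stand-in for a device that keeps a pointer to its platform data. */
struct fake_device {
    const struct mmc_pdata *pdata;
};

/* Models a structure tagged __initdata: its storage is reused after boot. */
static struct mmc_pdata init_copy;

/* Models a structure without the annotation: its storage stays valid. */
static struct mmc_pdata persistent_copy;

/* Boot-time setup: both devices receive the same configuration values. */
static void board_init(struct fake_device *broken, struct fake_device *fixed)
{
    const struct mmc_pdata cfg = { 0x00300000u, 7u };

    assert(broken != NULL && fixed != NULL);
    init_copy = cfg;
    persistent_copy = cfg;
    broken->pdata = &init_copy;       /* pointer into "init" memory */
    fixed->pdata = &persistent_copy;  /* pointer into ordinary memory */
}

/* Models the end of boot: init memory is handed back and overwritten. */
static void release_init_memory(void)
{
    memset(&init_copy, 0xCC, sizeof init_copy);
}
```

After `release_init_memory()` the `fixed` device still sees the values set in
`board_init()`, while the `broken` device reads poison. In kernel terms the
remedy would presumably be to drop the annotation from the platform structure,
or to copy the structure into persistent memory when it is registered.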

Regards,
Juergen



-- 
Pengutronix e.K.                              | Juergen Beisert             |
Linux Solutions for Science and Industry      | Phone: +49-8766-939 228     |
Vertretung Sued/Muenchen, Germany             | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686              | http://www.pengutronix.de/  |


* Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system
@ 2009-08-12 18:29 Arnaud Faucher
  0 siblings, 0 replies; 21+ messages in thread
From: Arnaud Faucher @ 2009-08-12 18:29 UTC (permalink / raw)
  To: Juergen Beisert, Mel Gorman; +Cc: Robert Schwebel, linux-kernel

I have a rather similar problem on a driver that I try to keep
up-to-date with recent kernel versions
(http://code.ximeta.com/trac-ndas/ticket/1110#comment:30). The NDAS
hardware is an ethernet-enabled disk controller on one chip, kind of a
cheap iSCSI.

In my case there is no oops: the symptoms are that the read blocks seem
to be swapped or full of garbage.

Investigation of the NDAS code shows that the bug triggers when the driver
tries to merge adjacent requests before sending them to the controller.
I had to disable this merge in order to restore normal behavior, at the
expense of reduced efficiency.

Mel's buffer overrun theory seems to apply in the NDAS driver case,
where the original request adjacency test seems faulty.
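
For illustration, a forward adjacency test of the kind such a merge relies on
might look like the sketch below. It is not the NDAS code: `struct segment`,
`segments_adjacent()` and `try_merge()` are hypothetical names, and segments
are assumed to be described by a physical start address and a byte length. The
test is directional: the second segment must begin exactly where the first one
ends, so a check that ignores ordering would accept buffers that are not
contiguous in the direction the merged transfer runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor: physical start address and length in bytes. */
struct segment {
    uint64_t phys_addr;
    size_t   len;
};

/* True only if b starts at the first byte after the end of a. */
static bool segments_adjacent(const struct segment *a, const struct segment *b)
{
    assert(a != NULL && b != NULL);
    return a->phys_addr + a->len == b->phys_addr;
}

/* Extends *a to cover *b when b directly follows a; returns true if merged. */
static bool try_merge(struct segment *a, const struct segment *b)
{
    if (!segments_adjacent(a, b))
        return false;
    a->len += b->len;
    return true;
}
```

With a = {0x1000, 4096} and b = {0x2000, 4096}, `try_merge(&a, &b)` succeeds
and leaves `a.len` at 8192, whereas `try_merge(&b, &a)` is refused.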

Could it also be the cause of the SD mounting crash?



end of thread, other threads:[~2009-10-16  8:11 UTC | newest]

Thread overview: 21+ messages
2009-08-11 16:30 Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system Juergen Beisert
2009-08-11 16:30 ` Juergen Beisert
2009-08-12  7:47 ` Robert Schwebel
2009-08-12  7:47   ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Robert Schwebel
2009-08-12  9:20   ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system Mel Gorman
2009-08-12  9:20     ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Mel Gorman
2009-08-12 11:11     ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system Juergen Beisert
2009-08-12 11:11       ` Juergen Beisert
2009-08-12 13:50       ` Mel Gorman
2009-08-12 13:50         ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Mel Gorman
2009-08-12 15:35         ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system Juergen Beisert
2009-08-12 15:35           ` Juergen Beisert
2009-08-12 18:40           ` Arnaud Faucher
2009-08-12 18:40             ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Arnaud Faucher
2009-08-13  8:39             ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system Mel Gorman
2009-08-13  8:39               ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD Mel Gorman
2009-08-13  9:22               ` Patch "page-allocator: preserve PFN ordering when __GFP_COLD is set" fails on my system Juergen Beisert
2009-08-13  9:22                 ` Juergen Beisert
2009-10-16  8:10                 ` Juergen Beisert
2009-10-16  8:10                   ` Juergen Beisert
2009-08-12 18:29 Arnaud Faucher
