linux-kernel.vger.kernel.org archive mirror
* 3.1-rc10 oops in nameidata_to_filp
@ 2011-11-16 11:22 George Spelvin
  2011-11-24 14:51 ` Jan Kara
  0 siblings, 1 reply; 10+ messages in thread
From: George Spelvin @ 2011-11-16 11:22 UTC
  To: linux-fsdevel, linux-kernel; +Cc: linux

This morning, I found the following on my laptop.  I hope the kernel
version is recent enough to be useful; the only change between then and
current 3.2-rc2 I noticed is an NFS lease fix, and the machine has no
NFS exports or mounts active.

The laptop is a core 2 duo, running a 32-bit kernel with 2 GB of RAM.
Uptime is 26 days, although obviously it's been asleep for a lot of that.

Non-ECC RAM; it *could* be just a random bit flip, but I'm sending
this out into the world in case it's illuminating to someone with
a deeper understanding of the relevant data structures.

It's running a copy of John Linville's wireless development tree,
but the changes there should not affect core file system activity like
this.  (They're mostly in drivers/net/wireless and net/wireless,
touching *nothing* in fs/ or other core kernel code.)

The exact kernel I'm running is:

> commit 137d0943ea2cbcdbfc38606944fc0b6494f7c935
> Merge: dfd5c52 899e3ee
> Author: John W. Linville <linville@tuxdriver.com>
> Date:   Tue Oct 18 10:52:19 2011 -0400
> 
>     Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torva

899e3ee is v3.1-rc10.  The commit is available at
http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commit;h=137d0943ea2cbcdbfc38606944fc0b6494f7c935

Local file systems are all ext3 or tmpfs.  Although I have mounted NFS
file systems since reboot, they were all unmounted days before the oops.

The machine is still up.  I plan on upgrading the kernel and
rebooting unless someone would like some specific testing.


BUG: unable to handle kernel NULL pointer dereference at 00000018
IP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed
*pde = 00000000 
Oops: 0000 [#1] SMP 
Modules linked in: nfs lockd sunrpc serpent xcbc b43 mac80211 cfg80211 rfkill bcma

Pid: 15325, comm: find Not tainted 3.1.0-rc10-wl #281 Dell Inc. MXC061                          /0MG532
EIP: 0060:[<c108a788>] EFLAGS: 00010206 CPU: 0
EIP is at __dentry_open.isra.16+0x12c/0x1ed
EAX: 00000000 EBX: c5c80480 ECX: 00000000 EDX: 00000000
ESI: c003fd0c EDI: 00000000 EBP: c3c9be58 ESP: c3c9be40
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process find (pid: 15325, ti=c3c9a000 task=f4a73390 task.ti=c3c9a000)
Stack:
 cef74480 f62a2d80 c3c9beec c3c9beec cef74480 c5c80480 c3c9be70 c108b309
 00000000 c3c9beec 00000000 00000000 c3c9bea4 c10950fe 00000000 00000001
 c003fd0c 00000024 c1093dad cef74400 00000000 00038900 c3c9beec 0000000b
Call Trace:
 [<c108b309>] nameidata_to_filp+0x33/0x3d
 [<c10950fe>] do_last.isra.49+0x3dc/0x4c3
 [<c1093dad>] ? path_init+0x20d/0x249
 [<c10952ab>] path_openat+0xa1/0x254
 [<c11109b7>] ? copy_to_user+0x3f/0x46
 [<c109549e>] do_filp_open+0x26/0x67
 [<c111080b>] ? might_fault+0x8/0xa
 [<c109cfc3>] ? alloc_fd+0x4e/0xba
 [<c1092ee7>] ? getname_flags+0x6d/0xad
 [<c108b36d>] do_sys_open+0x5a/0xe5
 [<c108b43e>] sys_openat+0x1f/0x25
 [<c1314a90>] sysenter_do_call+0x12/0x26
Code: 85 ff 89 43 10 75 0b 85 c0 74 14 8b 78 2c 85 ff 74 0d 89 da 89 f0 ff d7 85 c0 89 45 f0 75 4d 81 63 20 3f fc ff ff 8b 43 7c 8b 00 <8b> 50 18 8d 43 4c e8 5d 10 fe ff f6 43 21 40 0f 84 a2 00 00 00 
EIP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed SS:ESP 0068:c3c9be40
CR2: 0000000000000018
---[ end trace 34290958b6905e19 ]---

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-16 11:22 3.1-rc10 oops in nameidata_to_filp George Spelvin
@ 2011-11-24 14:51 ` Jan Kara
  2011-11-24 16:44   ` George Spelvin
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2011-11-24 14:51 UTC
  To: George Spelvin; +Cc: linux-fsdevel, linux-kernel

On Wed 16-11-11 06:22:46, George Spelvin wrote:
> This morning, I found the following on my laptop.  I hope the kernel
> version is recent enough to be useful; the only change between then and
> current 3.2-rc2 I noticed is an NFS lease fix, and the machine has no
> NFS exports or mounts active.
> 
> The laptop is a core 2 duo, running a 32-bit kernel with 2 GB of RAM.
> Uptime is 26 days, although obviously it's been asleep for a lot of that.
> 
> Non-ECC RAM; it *could* be just a random bit flip, but I'm sending
> this out into the world in case it's illuminating to someone with
> a deeper understanding of the relevant data structures.
> 
> It's running a copy of John Linville's wireless development tree,
> but the changes there should not affect core file system activity like
> this.  (They're mostly in drivers/net/wireless and net/wireless,
> touching *nothing* in fs/ or other core kernel code.)
> 
> The exact kernel I'm running is:
> 
> > commit 137d0943ea2cbcdbfc38606944fc0b6494f7c935
> > Merge: dfd5c52 899e3ee
> > Author: John W. Linville <linville@tuxdriver.com>
> > Date:   Tue Oct 18 10:52:19 2011 -0400
> > 
> >     Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torva
> 
> 899e3ee is v3.1-rc10.  The commit is available at
> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commit;h=137d0943ea2cbcdbfc38606944fc0b6494f7c935
> 
> Local file systems are all ext3 or tmpfs.  Although I have mounted NFS
> file systems since reboot, they were all unmounted days before the oops.
  Well, probably you also have /proc and other virtual filesystems mounted
:)

> The machine is still up.  I plan on upgrading the kernel and
> rebooting unless someone would like some specific testing.
> 
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000018
> IP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed
> *pde = 00000000 
> Oops: 0000 [#1] SMP 
> Modules linked in: nfs lockd sunrpc serpent xcbc b43 mac80211 cfg80211 rfkill bcma
> 
> Pid: 15325, comm: find Not tainted 3.1.0-rc10-wl #281 Dell Inc. MXC061                          /0MG532
> EIP: 0060:[<c108a788>] EFLAGS: 00010206 CPU: 0
> EIP is at __dentry_open.isra.16+0x12c/0x1ed
> EAX: 00000000 EBX: c5c80480 ECX: 00000000 EDX: 00000000
> ESI: c003fd0c EDI: 00000000 EBP: c3c9be58 ESP: c3c9be40
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Process find (pid: 15325, ti=c3c9a000 task=f4a73390 task.ti=c3c9a000)
> Stack:
>  cef74480 f62a2d80 c3c9beec c3c9beec cef74480 c5c80480 c3c9be70 c108b309
>  00000000 c3c9beec 00000000 00000000 c3c9bea4 c10950fe 00000000 00000001
>  c003fd0c 00000024 c1093dad cef74400 00000000 00038900 c3c9beec 0000000b
> Call Trace:
>  [<c108b309>] nameidata_to_filp+0x33/0x3d
>  [<c10950fe>] do_last.isra.49+0x3dc/0x4c3
>  [<c1093dad>] ? path_init+0x20d/0x249
>  [<c10952ab>] path_openat+0xa1/0x254
>  [<c11109b7>] ? copy_to_user+0x3f/0x46
>  [<c109549e>] do_filp_open+0x26/0x67
>  [<c111080b>] ? might_fault+0x8/0xa
>  [<c109cfc3>] ? alloc_fd+0x4e/0xba
>  [<c1092ee7>] ? getname_flags+0x6d/0xad
>  [<c108b36d>] do_sys_open+0x5a/0xe5
>  [<c108b43e>] sys_openat+0x1f/0x25
>  [<c1314a90>] sysenter_do_call+0x12/0x26
> Code: 85 ff 89 43 10 75 0b 85 c0 74 14 8b 78 2c 85 ff 74 0d 89 da 89 f0 ff d7 85 c0 89 45 f0 75 4d 81 63 20 3f fc ff ff 8b 43 7c 8b 00 <8b> 50 18 8d 43 4c e8 5d 10 fe ff f6 43 21 40 0f 84 a2 00 00 00 
> EIP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed SS:ESP 0068:c3c9be40
> CR2: 0000000000000018
> ---[ end trace 34290958b6905e19 ]---
  Interesting. So we failed at doing dereference for
file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
  In particular f->f_mapping->host was NULL. That is curious since
f_mapping is normally initialized to inode->i_mapping (which has ->host
properly set) shortly before and only devices and similar special inodes
override this in their ->open() callback to something else. Furthermore I
see the process doing open() was find(1) which usually opens only
directories which do not commonly have special ->open callback. So that
makes things even more strange.
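
For anyone who wants to check the dereference chain against the oops, the bytes around the `<8b>` marker in the Code: line can be decoded by hand. The following is a minimal user-space sketch, not kernel code; `decode_mov_load()` is a hypothetical helper that only understands the `8b /r` (mov r32, r/m32) forms that appear there. Reading 0x7c as the offset of f_mapping in struct file and 0 as the offset of host in struct address_space is an inference from the source line above, not something the oops states.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* 32-bit general-purpose register names, indexed by their ModRM encoding. */
static const char *const reg32[8] = {
	"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/*
 * Decode one "8b /r" (mov r32, r/m32) instruction into AT&T syntax.
 * Only mod=00 (no displacement) and mod=01 (8-bit displacement) without
 * a SIB byte are handled, which is enough for the bytes near the fault.
 * Returns the instruction length in bytes, or 0 if the form is unsupported.
 */
static size_t decode_mov_load(const uint8_t *code, size_t len,
			      char *out, size_t outlen)
{
	unsigned mod, reg, rm;

	assert(out != NULL && outlen > 0);
	if (len < 2 || code[0] != 0x8b)
		return 0;

	mod = code[1] >> 6;
	reg = (code[1] >> 3) & 7;
	rm = code[1] & 7;

	if (rm == 4)		/* a SIB byte would follow: not handled */
		return 0;

	if (mod == 0 && rm != 5) {
		snprintf(out, outlen, "mov (%%%s),%%%s", reg32[rm], reg32[reg]);
		return 2;
	}
	if (mod == 1 && len >= 3) {
		int disp = (int8_t)code[2];	/* disp8 is sign-extended */

		if (disp < 0)
			snprintf(out, outlen, "mov -0x%x(%%%s),%%%s",
				 (unsigned)-disp, reg32[rm], reg32[reg]);
		else
			snprintf(out, outlen, "mov 0x%x(%%%s),%%%s",
				 (unsigned)disp, reg32[rm], reg32[reg]);
		return 3;
	}
	return 0;
}
```

Fed the bytes `8b 43 7c 8b 00 8b 50 18`, this gives three loads: 0x7c(%ebx) into %eax, (%eax) into %eax, and the faulting 0x18(%eax) into %edx. With EAX = 00000000 in the register dump, the last load targets address 0x18, which matches CR2.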

 So my guess would be that find wandered into some virtual filesystem and
that set f_mapping to something strange (or had inode->i_mapping not
initialized properly). Anyway, unless you can reproduce this and find on
which filesystem this happened, I don't know how to debug this further...

  Thanks for the report!

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 14:51 ` Jan Kara
@ 2011-11-24 16:44   ` George Spelvin
  2011-11-24 17:38     ` Al Viro
  0 siblings, 1 reply; 10+ messages in thread
From: George Spelvin @ 2011-11-24 16:44 UTC
  To: jack, linux; +Cc: linux-fsdevel, linux-kernel

Jan Kara <jack@suse.cz> wrote:
>  Well, probably you also have /proc and other virtual filesystems mounted :)

Yes.  To be precise, the mount list is:
rootfs / rootfs rw 0 0
/dev/root / ext3 rw,relatime,errors=remount-ro,commit=5,barrier=1,data=ordered 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,relatime,size=5120k,mode=755 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=206512k,mode=755 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev tmpfs rw,relatime,size=10240k,mode=755 0 0
tmpfs /run/shm tmpfs rw,nosuid,nodev,relatime,size=413024k 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0

>  Interesting. So we failed at doing dereference for
> file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);

>   In particular f->f_mapping->host was NULL. That is curious since
> f_mapping is normally initialized to inode->i_mapping (which has ->host
> properly set) shortly before and only devices and similar special inodes
> override this in their ->open() callback to something else. Furthermore I
> see the process doing open() was find(1) which usually opens only
> directories which do not commonly have special ->open callback. So that
> makes things even more strange.

>  So my guess would be that find wandered into some virtual filesystem and
> that set f_mapping to something strange (or had inode->i_mapping not
> initialized properly). Anyway, unless you can reproduce this and find on
> which filesystem this happened, I don't know how to debug this further...

It's been happening every day or so.

[Timestamp Nov 20 07:37:54]
BUG: unable to handle kernel NULL pointer dereference at 00000018
IP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed
*pde = 00000000 
Oops: 0000 [#4] SMP 
Modules linked in: nfs lockd sunrpc serpent xcbc b43 mac80211 cfg80211 rfkill bcma

Pid: 15266, comm: find Tainted: G      D     3.1.0-rc10-wl #281 Dell Inc. MXC061                          /0MG532
EIP: 0060:[<c108a788>] EFLAGS: 00010206 CPU: 0
EIP is at __dentry_open.isra.16+0x12c/0x1ed
EAX: 00000000 EBX: ce058580 ECX: 00000000 EDX: 00000000
ESI: c003fd0c EDI: 00000000 EBP: cf03be58 ESP: cf03be40
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process find (pid: 15266, ti=cf03a000 task=d7176cb0 task.ti=cf03a000)
Stack:
 cef74480 f62a2d80 cf03beec cf03beec cef74480 ce058580 cf03be70 c108b309
 00000000 cf03beec 00000000 00000000 cf03bea4 c10950fe 00000000 00000001
 c003fd0c 00000024 c1093dad cef74400 00000000 00038900 cf03beec 0000000a
Call Trace:
 [<c108b309>] nameidata_to_filp+0x33/0x3d
 [<c10950fe>] do_last.isra.49+0x3dc/0x4c3
 [<c1093dad>] ? path_init+0x20d/0x249
 [<c10952ab>] path_openat+0xa1/0x254
 [<c11109b7>] ? copy_to_user+0x3f/0x46
 [<c109549e>] do_filp_open+0x26/0x67
 [<c111080b>] ? might_fault+0x8/0xa
 [<c109cfc3>] ? alloc_fd+0x4e/0xba
 [<c1092ee7>] ? getname_flags+0x6d/0xad
 [<c108b36d>] do_sys_open+0x5a/0xe5
 [<c108b43e>] sys_openat+0x1f/0x25
 [<c1314a90>] sysenter_do_call+0x12/0x26
Code: 85 ff 89 43 10 75 0b 85 c0 74 14 8b 78 2c 85 ff 74 0d 89 da 89 f0 ff d7 85 c0 89 45 f0 75 4d 81 63 20 3f fc ff ff 8b 43 7c 8b 00 <8b> 50 18 8d 43 4c e8 5d 10 fe ff f6 43 21 40 0f 84 a2 00 00 00 
EIP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed SS:ESP 0068:cf03be40
CR2: 0000000000000018
---[ end trace 34290958b6905e1c ]---


[Timestamp Nov 23 00:21:04]
BUG: unable to handle kernel NULL pointer dereference at 00000018
IP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed
*pde = 00000000 
Oops: 0000 [#5] SMP 
Modules linked in: nfs lockd sunrpc serpent xcbc b43 mac80211 cfg80211 rfkill bcma

Pid: 5856, comm: find Tainted: G      D     3.1.0-rc10-wl #281 Dell Inc. MXC061                          /0MG532
EIP: 0060:[<c108a788>] EFLAGS: 00010206 CPU: 0
EIP is at __dentry_open.isra.16+0x12c/0x1ed
EAX: 00000000 EBX: c93c7380 ECX: 00000000 EDX: 00000000
ESI: c003fd0c EDI: 00000000 EBP: c93c1e58 ESP: c93c1e40
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process find (pid: 5856, ti=c93c0000 task=cbac8a50 task.ti=c93c0000)
Stack:
 cef74480 f62a2d80 c93c1eec c93c1eec cef74480 c93c7380 c93c1e70 c108b309
 00000000 c93c1eec 00000000 00000000 c93c1ea4 c10950fe 00000000 00000001
 c003fd0c 00000024 c1093dad cef74400 00000000 00038900 c93c1eec 0000000b
Call Trace:
 [<c108b309>] nameidata_to_filp+0x33/0x3d
 [<c10950fe>] do_last.isra.49+0x3dc/0x4c3
 [<c1093dad>] ? path_init+0x20d/0x249
 [<c10952ab>] path_openat+0xa1/0x254
 [<c11109b7>] ? copy_to_user+0x3f/0x46
 [<c109549e>] do_filp_open+0x26/0x67
 [<c111080b>] ? might_fault+0x8/0xa
 [<c109cfc3>] ? alloc_fd+0x4e/0xba
 [<c1092ee7>] ? getname_flags+0x6d/0xad
 [<c108b36d>] do_sys_open+0x5a/0xe5
 [<c108b43e>] sys_openat+0x1f/0x25
 [<c1314a90>] sysenter_do_call+0x12/0x26
Code: 85 ff 89 43 10 75 0b 85 c0 74 14 8b 78 2c 85 ff 74 0d 89 da 89 f0 ff d7 85 c0 89 45 f0 75 4d 81 63 20 3f fc ff ff 8b 43 7c 8b 00 <8b> 50 18 8d 43 4c e8 5d 10 fe ff f6 43 21 40 0f 84 a2 00 00 00 
EIP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed SS:ESP 0068:c93c1e40
CR2: 0000000000000018
---[ end trace 34290958b6905e1d ]---


[Timestamp Nov 24 08:01:32]
BUG: unable to handle kernel NULL pointer dereference at 00000018
IP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed
*pde = 00000000 
Oops: 0000 [#6] SMP 
Modules linked in: nfs lockd sunrpc serpent xcbc b43 mac80211 cfg80211 rfkill bcma

Pid: 11313, comm: find Tainted: G      D     3.1.0-rc10-wl #281 Dell Inc. MXC061                          /0MG532
EIP: 0060:[<c108a788>] EFLAGS: 00010206 CPU: 1
EIP is at __dentry_open.isra.16+0x12c/0x1ed
EAX: 00000000 EBX: d09da800 ECX: 00000000 EDX: 00000000
ESI: c003fd0c EDI: 00000000 EBP: f45bde58 ESP: f45bde40
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process find (pid: 11313, ti=f45bc000 task=f4ad1130 task.ti=f45bc000)
Stack:
 cef74480 f62a2d80 f45bdeec f45bdeec cef74480 d09da800 f45bde70 c108b309
 00000000 f45bdeec 00000000 00000000 f45bdea4 c10950fe 00000000 00000001
 c003fd0c 00000024 c1093dad cef74400 00000000 00038900 f45bdeec 0000000b
Call Trace:
 [<c108b309>] nameidata_to_filp+0x33/0x3d
 [<c10950fe>] do_last.isra.49+0x3dc/0x4c3
 [<c1093dad>] ? path_init+0x20d/0x249
 [<c10952ab>] path_openat+0xa1/0x254
 [<c11109b7>] ? copy_to_user+0x3f/0x46
 [<c109549e>] do_filp_open+0x26/0x67
 [<c111080b>] ? might_fault+0x8/0xa
 [<c109cfc3>] ? alloc_fd+0x4e/0xba
 [<c1092ee7>] ? getname_flags+0x6d/0xad
 [<c108b36d>] do_sys_open+0x5a/0xe5
 [<c108b43e>] sys_openat+0x1f/0x25
 [<c1314a90>] sysenter_do_call+0x12/0x26
Code: 85 ff 89 43 10 75 0b 85 c0 74 14 8b 78 2c 85 ff 74 0d 89 da 89 f0 ff d7 85 c0 89 45 f0 75 4d 81 63 20 3f fc ff ff 8b 43 7c 8b 00 <8b> 50 18 8d 43 4c e8 5d 10 fe ff f6 43 21 40 0f 84 a2 00 00 00 
EIP: [<c108a788>] __dentry_open.isra.16+0x12c/0x1ed SS:ESP 0068:f45bde40
CR2: 0000000000000018
---[ end trace 34290958b6905e1e ]---

It turned out the machine was quite recoverable and I've been running it without rebooting since then.
This includes several suspends to RAM and one to disk.

So far, it seems pretty reproducible, but I suppose it could be a kernel bit flip.
(F***ing Intel not even *allowing* ECC in "consumer" chipsets...)

I should probably add a debugging patch and reboot.  Is there a debugging helper
for printing a dentry and vfsmount?

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 16:44   ` George Spelvin
@ 2011-11-24 17:38     ` Al Viro
  2011-11-24 17:50       ` Al Viro
                         ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Al Viro @ 2011-11-24 17:38 UTC
  To: George Spelvin; +Cc: jack, linux-fsdevel, linux-kernel

On Thu, Nov 24, 2011 at 11:44:06AM -0500, George Spelvin wrote:

> It turned out the machine was quite recoverable and I've been running it without rebooting since then.
> This includes several suspends to RAM and one to disk.
> 
> So far, it seems pretty reproducible, but I suppose it could be a kernel bit flip.
> (F***ing Intel not even *allowing* ECC in "consumer" chipsets...)
> 
> I should probably add a debugging patch and reboot.  Is there a debugging helper
> for printing a dentry and vfsmount?

d_path(); takes struct path *, pointer to buffer and buffer length, puts
the pathname into the end of buffer and returns a pointer to the beginning
of resulting string.
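
The "fills the end of the buffer, returns a pointer to the start of the string" convention can be illustrated in user space. This is only a sketch of the convention, not of d_path() itself: `build_path()` and `prepend_component()` are hypothetical names, the component list stands in for walking a dentry chain from leaf to root, and failure is reported as NULL where the real function would return an ERR_PTR() value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Prepend "/<name>" in front of *pos, moving *pos backwards.
 * Returns 0 on success, -1 if there is no room left in buf.
 */
static int prepend_component(char *buf, char **pos, const char *name)
{
	size_t n = strlen(name);

	if ((size_t)(*pos - buf) < n + 1)
		return -1;
	*pos -= n;
	memcpy(*pos, name, n);
	*pos -= 1;
	**pos = '/';
	return 0;
}

/*
 * Build a path from component names given in leaf-to-root order.
 * The string is written so that it ends at the last byte of buf, and
 * the return value points at its first character (NULL if too long).
 */
static char *build_path(char *buf, size_t buflen,
			const char *const *leaf_to_root, size_t count)
{
	char *pos;
	size_t i;

	assert(buf != NULL);
	if (buflen == 0)
		return NULL;

	pos = buf + buflen - 1;
	*pos = '\0';

	for (i = 0; i < count; i++)
		if (prepend_component(buf, &pos, leaf_to_root[i]) < 0)
			return NULL;

	if (count == 0) {	/* no components: the path is just "/" */
		if (pos == buf)
			return NULL;
		*--pos = '/';
	}
	return pos;
}
```

The practical consequence for a debugging patch is that the returned pointer, not the buffer itself, is what should be printed; the front of the buffer is generally left untouched.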

I'd add (hell, maybe start with) printing this:
	file->f_path.dentry->d_inode
	inode
	file->f_mapping
	inode->i_mapping
	inode->i_mapping->host
just to see whether it's open() callback resetting ->f_mapping to NULL or
weird inode->i_mapping->host.  All in case file->f_mapping->host == NULL
just before the spot where it oopses.
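
What those five values would distinguish can be modelled in user space. The sketch below uses mock structures (all `mock_*` names and the `classify_mapping()` helper are made up for illustration and carry only the fields under discussion); it is not a patch for __dentry_open().

```c
#include <assert.h>
#include <stddef.h>

struct mock_inode;

struct mock_address_space {
	struct mock_inode *host;
};

struct mock_inode {
	struct mock_address_space *i_mapping;
	struct mock_address_space i_data;
};

struct mock_file {
	struct mock_inode *inode;	/* stands in for f_path.dentry->d_inode */
	struct mock_address_space *f_mapping;
};

enum mapping_verdict {
	MAPPING_OK,		/* ->f_mapping->host is non-NULL */
	F_MAPPING_NULL,		/* ->f_mapping itself is NULL */
	F_MAPPING_REPLACED,	/* NULL host, ->f_mapping != inode->i_mapping */
	I_MAPPING_HOST_NULL,	/* NULL host in the inode's own mapping */
};

/* Set up an inode the usual way: i_mapping -> own i_data, host -> self. */
static void mock_inode_init(struct mock_inode *inode)
{
	inode->i_data.host = inode;
	inode->i_mapping = &inode->i_data;
}

/* Decide which pointer in the chain went wrong, if any. */
static enum mapping_verdict classify_mapping(const struct mock_file *file)
{
	assert(file != NULL && file->inode != NULL);

	if (file->f_mapping == NULL)
		return F_MAPPING_NULL;
	if (file->f_mapping->host != NULL)
		return MAPPING_OK;
	if (file->f_mapping != file->inode->i_mapping)
		return F_MAPPING_REPLACED;
	return I_MAPPING_HOST_NULL;
}
```

In the real kernel the same comparisons would be made on the printed pointer values: if file->f_mapping differs from inode->i_mapping, something after the initial assignment changed it; if they are equal, the inode's own mapping is the one with the NULL ->host.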

Getting pathname would be something like
	static char name[4096];
	struct path path = {.mnt = mnt, .dentry = dentry};
	char *p = d_path(&path, name, sizeof(name));
	if (IS_ERR(p))
		printk("[%ld]", PTR_ERR(p));	/* PTR_ERR() returns long */
	else
		printk("'%s'", p);
conditional on the same test.  

Said that, I'm not buying the theory of open assigning to ->f_mapping and
screwing it up; all such assignments end up with ->i_mapping of *some*
inode, as far as I can see from cursory grep over the tree.  Just in case:
do you have CONFIG_FS_POSIX_ACL set?

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 17:38     ` Al Viro
@ 2011-11-24 17:50       ` Al Viro
  2011-11-24 17:51         ` Al Viro
  2011-11-30 23:57         ` Andrew Morton
  2011-11-24 21:14       ` Jan Kara
  2011-11-24 21:47       ` George Spelvin
  2 siblings, 2 replies; 10+ messages in thread
From: Al Viro @ 2011-11-24 17:50 UTC
  To: George Spelvin; +Cc: jack, linux-fsdevel, linux-kernel, George Spelvin

On Thu, Nov 24, 2011 at 05:38:29PM +0000, Al Viro wrote:
> Said that, I'm not buying the theory of open assigning to ->f_mapping and
> screwing it up; all such assignments end up with ->i_mapping of *some*
> inode, as far as I can see from cursory grep over the tree.  Just in case:
> do you have CONFIG_FS_POSIX_ACL set?

BTW, why are we going through that dance with ->host->i_mapping anyway?
It was introduced by a commit from akpm back in 2004, and from my reading
of the commit message it was overkill even back then.  Basically,
that call got moved to the point past the call of ->open() (good, ->f_mapping
could've been changed by it) *and* converted from ->f_mapping to
->f_mapping->host->i_mapping, which is useless.  Definitely so in the
case mentioned in that commit (blkdev_open() sets ->f_mapping to
bdev->bd_inode->i_mapping and that thing will have ->host pointing
back to bdev->bd_inode).  Commit was in BK, its copy in historical tree
is commit 1c211088833a27daa4512348bcae9890e8cf92d4
Author: Andrew Morton <akpm@osdl.org>
Date:   Wed May 26 17:35:42 2004 -0700

    [PATCH] Fix the setting of file->f_ra on block-special files
    
    We need to set file->f_ra _after_ calling blkdev_open(), when inode->i_mapping
    points at the right thing.  And we need to get it from
    inode->i_mapping->host->i_mapping too, which represents the underlying device.
    
    Also, don't test for null file->f_mapping in the O_DIRECT checks.
    
    Signed-off-by: Andrew Morton <akpm@osdl.org>

and the only difference wrt setting ->f_mapping on bdev open back then
is that it used to be done in do_open() instead of blkdev_open() itself.
So I don't understand what that part of changes had been for...  Andrew?

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 17:50       ` Al Viro
@ 2011-11-24 17:51         ` Al Viro
  2011-11-30 23:57         ` Andrew Morton
  1 sibling, 0 replies; 10+ messages in thread
From: Al Viro @ 2011-11-24 17:51 UTC
  To: Andrew Morton; +Cc: jack, linux-fsdevel, linux-kernel

[grr... wrong To: on that one; instead of going to akpm it went to George,
in one too many copies; my apologies...]

On Thu, Nov 24, 2011 at 05:50:03PM +0000, Al Viro wrote:
> On Thu, Nov 24, 2011 at 05:38:29PM +0000, Al Viro wrote:
> > Said that, I'm not buying the theory of open assigning to ->f_mapping and
> > screwing it up; all such assignments end up with ->i_mapping of *some*
> > inode, as far as I can see from cursory grep over the tree.  Just in case:
> > do you have CONFIG_FS_POSIX_ACL set?
> 
> BTW, why are we going through that dance with ->host->i_mapping anyway?
> It had been introduced by commit by akpm back in 2004 and from my reading
> of the commit message it was an overkill even back then.  Basically,
> that call got moved to the point past the call of ->open() (good, ->f_mapping
> could've been changed by it) *and* converted from ->f_mapping to
> ->f_mapping->host->i_mapping, which is useless.  Definitely so in the
> case mentioned in that commit (blkdev_open() sets ->f_mapping 
> bdev->bd_inode->i_mapping and that thing will have ->host pointing
> back to bdev->bd_inode).  Commit was in BK, its copy in historical tree
> is commit 1c211088833a27daa4512348bcae9890e8cf92d4
> Author: Andrew Morton <akpm@osdl.org>
> Date:   Wed May 26 17:35:42 2004 -0700
> 
>     [PATCH] Fix the setting of file->f_ra on block-special files
>     
>     We need to set file->f_ra _after_ calling blkdev_open(), when inode->i_mapping
>     points at the right thing.  And we need to get it from
>     inode->i_mapping->host->i_mapping too, which represents the underlying device.
>     
>     Also, don't test for null file->f_mapping in the O_DIRECT checks.
>     
>     Signed-off-by: Andrew Morton <akpm@osdl.org>
> 
> and the only difference wrt setting ->f_mapping on bdev open back then
> is that it used to be done in do_open() instead of blkdev_open() itself.
> So I don't understand what that part of changes had been for...  Andrew?

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 17:38     ` Al Viro
  2011-11-24 17:50       ` Al Viro
@ 2011-11-24 21:14       ` Jan Kara
  2011-11-24 21:47       ` George Spelvin
  2 siblings, 0 replies; 10+ messages in thread
From: Jan Kara @ 2011-11-24 21:14 UTC
  To: Al Viro; +Cc: George Spelvin, jack, linux-fsdevel, linux-kernel

On Thu 24-11-11 17:38:29, Al Viro wrote:
> On Thu, Nov 24, 2011 at 11:44:06AM -0500, George Spelvin wrote:
> 
> > It turned out the machine was quite recoverable and I've been running it without rebooting since then.
> > This includes several suspends to RAM and one to disk.
> > 
> > So far, it seems pretty reproducible, but I suppose it could be a kernel bit flip.
> > (F***ing Intel not even *allowing* ECC in "consumer" chipsets...)
> > 
> > I should probably add a debugging patch and reboot.  Is there a debugging helper
> > for printing a dentry and vfsmount?
> 
> d_path(); takes struct path *, pointer to buffer and buffer length, puts
> the pathname into the end of buffer and returns a pointer to the beginning
> of resulting string.
> 
> I'd add (hell, maybe start with) printing this:
> 	file->f_path.dentry->d_inode
> 	inode
> 	file->f_mapping
> 	inode->i_mapping
> 	inode->i_mapping->host
> just to see whether it's open() callback resetting ->f_mapping to NULL or
> weird inode->i_mapping->host.  All in case file->f_mapping->host == NULL
> just before the spot where it oopses.
> 
> Getting pathname would be something like
> 	static char name[4096];
> 	struct path path = {.mnt = mnt, .dentry = dentry};
> 	char *p = d_path(&path, name, sizeof(name));
> 	if (IS_ERR(p))
> 		printk("[%ld]", PTR_ERR(p));
> 	else
> 		printk("'%s'", p);
> conditional on the same test.  
> 
> Said that, I'm not buying the theory of open assigning to ->f_mapping and
> screwing it up; all such assignments end up with ->i_mapping of *some*
> inode, as far as I can see from cursory grep over the tree.
  Yeah, after some thought and grepping, setting ->f_mapping to something
bogus does not seem likely. More likely is that i_mapping got somehow
corrupted (use after free?) or something like that.

> Just in case: do you have CONFIG_FS_POSIX_ACL set?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 17:38     ` Al Viro
  2011-11-24 17:50       ` Al Viro
  2011-11-24 21:14       ` Jan Kara
@ 2011-11-24 21:47       ` George Spelvin
  2011-11-24 22:13         ` Al Viro
  2 siblings, 1 reply; 10+ messages in thread
From: George Spelvin @ 2011-11-24 21:47 UTC
  To: linux, viro; +Cc: jack, linux-fsdevel, linux-kernel

Thanks for the pointer.  I'll write some debug code.

> Said that, I'm not buying the theory of open assigning to ->f_mapping and
> screwing it up; all such assignments end up with ->i_mapping of *some*
> inode, as far as I can see from cursory grep over the tree.  Just in case:
> do you have CONFIG_FS_POSIX_ACL set?

No, I always turn that off.  (And TMPFS_POSIX_ACL, too.)

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 21:47       ` George Spelvin
@ 2011-11-24 22:13         ` Al Viro
  0 siblings, 0 replies; 10+ messages in thread
From: Al Viro @ 2011-11-24 22:13 UTC
  To: George Spelvin; +Cc: jack, linux-fsdevel, linux-kernel

On Thu, Nov 24, 2011 at 04:47:10PM -0500, George Spelvin wrote:
> Thanks for the pointer.  I'll write some debug code.
> 
> > Said that, I'm not buying the theory of open assigning to ->f_mapping and
> > screwing it up; all such assignments end up with ->i_mapping of *some*
> > inode, as far as I can see from cursory grep over the tree.  Just in case:
> > do you have CONFIG_FS_POSIX_ACL set?
> 
> No, I always turn that off.  (And TMPFS_POSIX_ACL, too.)

OK, then this 0x18 is very likely to be the offset of i_mapping in struct inode
and we have file->f_mapping->host == NULL.  Very odd...  The only struct
address_space on your config besides inode->i_data (which _all_ get non-NULL
->host, pointing straight back to the containing struct inode) is
swapper_space.  And there's no way in hell for it to legitimately end up
in file->f_mapping or inode->i_mapping.
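
Why the ACL option matters for that offset can be shown with a mock layout. The field order below is an assumption, meant to mirror the leading members of struct inode in a tree of roughly this vintage on a 32-bit build; it should be checked against include/linux/fs.h in the kernel actually running. Pointers are modelled as uint32_t so the numbers come out the same on any host, and both struct names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading members of a mock 32-bit inode WITHOUT the two ACL pointers. */
struct mock_inode_noacl {
	uint16_t i_mode;
	uint16_t i_opflags;
	uint32_t i_uid;
	uint32_t i_gid;
	uint32_t i_flags;
	uint32_t i_op;		/* pointer */
	uint32_t i_sb;		/* pointer */
	uint32_t i_mapping;	/* pointer */
};

/* The same mock layout WITH the two ACL pointers in front of i_op. */
struct mock_inode_acl {
	uint16_t i_mode;
	uint16_t i_opflags;
	uint32_t i_uid;
	uint32_t i_gid;
	uint32_t i_flags;
	uint32_t i_acl;		/* pointer */
	uint32_t i_default_acl;	/* pointer */
	uint32_t i_op;		/* pointer */
	uint32_t i_sb;		/* pointer */
	uint32_t i_mapping;	/* pointer */
};

/* Address touched when reading a member at 'offset' through 'base'. */
static uint32_t fault_address(uint32_t base, size_t offset)
{
	return base + (uint32_t)offset;
}

_Static_assert(offsetof(struct mock_inode_noacl, i_mapping) == 0x18,
	       "without ACL pointers i_mapping sits at 0x18 in this mock");
_Static_assert(offsetof(struct mock_inode_acl, i_mapping) == 0x20,
	       "two extra 32-bit pointers move i_mapping to 0x20");
```

Under these assumptions a NULL host gives a fault at 0x18 only when the ACL pointers are absent, which is consistent with the reported config and the CR2 value.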

So the possibilities are:
	legitimate ->f_mapping, fucked contents of address_space it points to.
	fucked ->i_mapping, ->f_mapping copied from it.
	legitimate ->i_mapping, open() callback fucking ->f_mapping up.
Try to enable memory poisoning; if it turns those into oopsen on attempt
to access poison + 0x18, we'll know it's some inode getting freed while we
still have references to it...  OTOH, that would make your debugging printks
not fire at all, so it might be better to get the pathname first before
enabling that and seeing how behaviour changes...
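
What "poison + 0x18" would look like can be worked out ahead of time. The sketch below is a user-space model under stated assumptions: freed objects are filled with the byte 0x6b (the value of the kernel's POISON_FREE), pointers are 32 bits wide, and ->host is the first member of the mapping. The `mock_*` names and `fault_address_for()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOCK_POISON_FREE	0x6b	/* same byte value as POISON_FREE */
#define I_MAPPING_OFFSET	0x18	/* offset seen in the oops (CR2) */

/* Mock mapping with a 32-bit "pointer" so the arithmetic matches i386. */
struct mock_address_space {
	uint32_t host;
};

/* Model of what poisoning does to a freed object's memory. */
static void mock_poison_free(void *obj, size_t size)
{
	memset(obj, MOCK_POISON_FREE, size);
}

/* Address that reading mapping->host->i_mapping would touch. */
static uint32_t fault_address_for(const struct mock_address_space *mapping)
{
	return mapping->host + I_MAPPING_OFFSET;
}
```

A zeroed ->host reproduces the 0x18 seen so far, while a poisoned one would fault at 0x6b6b6b83. An oops at an address of that shape would point towards use after free; a continued fault at exactly 0x18 would point away from it.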

* Re: 3.1-rc10 oops in nameidata_to_filp
  2011-11-24 17:50       ` Al Viro
  2011-11-24 17:51         ` Al Viro
@ 2011-11-30 23:57         ` Andrew Morton
  1 sibling, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2011-11-30 23:57 UTC
  To: Al Viro; +Cc: George Spelvin, jack, linux-fsdevel, linux-kernel

(sorry for the delay)

On Thu, 24 Nov 2011 17:50:03 +0000
Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Thu, Nov 24, 2011 at 05:38:29PM +0000, Al Viro wrote:
> > Said that, I'm not buying the theory of open assigning to ->f_mapping and
> > screwing it up; all such assignments end up with ->i_mapping of *some*
> > inode, as far as I can see from cursory grep over the tree.  Just in case:
> > do you have CONFIG_FS_POSIX_ACL set?
> 
> BTW, why are we going through that dance with ->host->i_mapping anyway?
> It had been introduced by commit by akpm back in 2004 and from my reading
> of the commit message it was an overkill even back then.  Basically,
> that call got moved to the point past the call of ->open() (good, ->f_mapping
> could've been changed by it) *and* converted from ->f_mapping to
> ->f_mapping->host->i_mapping, which is useless.  Definitely so in the
> case mentioned in that commit (blkdev_open() sets ->f_mapping 
> bdev->bd_inode->i_mapping and that thing will have ->host pointing
> back to bdev->bd_inode).  Commit was in BK, its copy in historical tree
> is commit 1c211088833a27daa4512348bcae9890e8cf92d4
> Author: Andrew Morton <akpm@osdl.org>
> Date:   Wed May 26 17:35:42 2004 -0700
> 
>     [PATCH] Fix the setting of file->f_ra on block-special files
>     
>     We need to set file->f_ra _after_ calling blkdev_open(), when inode->i_mapping
>     points at the right thing.  And we need to get it from
>     inode->i_mapping->host->i_mapping too, which represents the underlying device.
>     
>     Also, don't test for null file->f_mapping in the O_DIRECT checks.
>     
>     Signed-off-by: Andrew Morton <akpm@osdl.org>
> 
> and the only difference wrt setting ->f_mapping on bdev open back then
> is that it used to be done in do_open() instead of blkdev_open() itself.
> So I don't understand what that part of changes had been for...  Andrew?

Beats me, sorry.  Here's the thread:
http://www.gossamer-threads.com/lists/linux/kernel/443511?do=post_view_flat#443511

Thread overview: 10+ messages
2011-11-16 11:22 3.1-rc10 oops in nameidata_to_filp George Spelvin
2011-11-24 14:51 ` Jan Kara
2011-11-24 16:44   ` George Spelvin
2011-11-24 17:38     ` Al Viro
2011-11-24 17:50       ` Al Viro
2011-11-24 17:51         ` Al Viro
2011-11-30 23:57         ` Andrew Morton
2011-11-24 21:14       ` Jan Kara
2011-11-24 21:47       ` George Spelvin
2011-11-24 22:13         ` Al Viro