* Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory
@ 2011-01-26 14:55 Dr. Bernd Feige
  2011-01-26 16:32 ` Al Viro
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. Bernd Feige @ 2011-01-26 14:55 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3095 bytes --]

Hi,

On 2.6.37 I get the following when listing one of our Novell directories
containing 9499 files (no subdirs; note that this works fine on
2.6.36.x):

kernel: kernel BUG at include/linux/dcache.h:340!
kernel: invalid opcode: 0000 [#1] SMP 
kernel: last sysfs file: /sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq
kernel: Modules linked in: nls_cp437 nls_iso8859_1 ncpfs coretemp cpufreq_ondemand nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 autofs4 snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ext3 jbd mbcache ext2 dm_crypt dm_mod crypto_blkcipher crypto_algapi fuse vboxnetflt vboxdrv fbcon font bitblit softcursor usbhid usb_storage uas snd_hda_codec_analog radeon ttm drm_kms_helper drm sr_mod snd_hda_intel psmouse cdrom snd_hda_codec snd_pcm snd_timer sg uhci_hcd i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect parport_pc parport ehci_hcd dcdbas i2c_i801 snd soundcore snd_page_alloc usbcore
kernel: 
kernel: Pid: 4226, comm: ls Not tainted 2.6.37-gentoo #3 0GM819/OptiPlex 755                 
kernel: EIP: 0060:[<c108e2b6>] EFLAGS: 00010246 CPU: 1
kernel: EIP is at d_validate+0x6c/0x99
kernel: EAX: 00000000 EBX: f14952a8 ECX: 00000011 EDX: f14952a8
kernel: ESI: f5d675c0 EDI: f21a2aa0 EBP: 0272e622 ESP: f1a45ef0
kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
kernel: Process ls (pid: 4226, ti=f1a44000 task=f3793ba0 task.ti=f1a44000)
kernel: Stack:
kernel: 00000011 0001ffff f14952a8 00000000 00000000 f1e37300 f813f6a6 b9339067
kernel: f21a2aa0 f75228c0 00000555 ff8a0000 f21ea7b8 f1a45f90 c108b804 f21ea870
kernel: f21a2adc 4d4028b4 0000c2cd 00000556 00000001 f7515620 ff9d7000 00000557
kernel: Call Trace:
kernel: [<f813f6a6>] ? ncp_readdir+0x246/0x544 [ncpfs]
kernel: [<c108b804>] ? filldir64+0x0/0xcb
kernel: [<c108b804>] ? filldir64+0x0/0xcb
kernel: [<c108ba9b>] ? vfs_readdir+0x5c/0x80
kernel: [<c108bc11>] ? sys_getdents64+0x66/0xa5
kernel: [<c100270c>] ? sysenter_do_call+0x12/0x22
kernel: Code: 4f 81 f2 01 00 37 9e c1 ea 06 8d 2c 2a 89 e8 35 01 00 37 9e d3 e8 31 e8 23 44 24 04 8d 04 86 eb 11 85 db 74 22 8b 03 85 c0 75 02 <0f> 0b f0 ff 03 eb 15 8b 00 85 c0 74 16 8b 10 0f 18 02 90 8d 50 
kernel: EIP: [<c108e2b6>] d_validate+0x6c/0x99 SS:ESP 0068:f1a45ef0
kernel: ---[ end trace 4a1258c426b4363e ]---

I then created empty files in an empty directory on the server using the
attached script. Up to 1363 files could be listed without a crash, while
adding one more file triggered the crash at the next ls. I.e., the
directory could hold no more than 1366 entries, counting the script,
'.' and '..'.

Steps to reproduce:
cd /path/to/mounted/ncp/dir
mkdir tst; cd tst
cp ~/Mail/create_files .
bash create_files # Will create 2000 empty files 0001-2000 to be on the safe side ;-)
ls

I assumed that the changes to ncpfs in 2.6.37 caused this, but reverting
them did not solve the problem. Turning off preemption and group
scheduling did not help either. So I'm lost and my spare time is running
out, but I thought I'd report it nonetheless.

Thanks for your time,
Bernd

[-- Attachment #2: create_files --]
[-- Type: application/x-shellscript, Size: 104 bytes --]
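
The create_files attachment is not reproduced inline. As a rough sketch of
what such a script might look like (an assumption, not the original 104-byte
attachment; the function name and the default count are illustrative), the
following creates empty files named 0001, 0002, ... in the current directory:

```shell
#!/bin/sh
# Hypothetical stand-in for the attached create_files script; the original
# attachment is not shown in the archive. Creates empty files named
# 0001, 0002, ... up to the requested count in the current directory.
create_files() (
    count="${1:-2000}"   # default of 2000 matches the count in the report
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$(printf '%04d' "$i")"   # create (or truncate) an empty file
        i=$((i + 1))
    done
)

# Usage, from inside the test directory on the ncpfs mount:
#   create_files 2000
```

The body runs in a subshell, so the loop variables do not leak into the
calling shell. Passing a smaller count makes it easy to probe the threshold
described above.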


* Re: Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory
  2011-01-26 14:55 Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory Dr. Bernd Feige
@ 2011-01-26 16:32 ` Al Viro
  2011-01-26 17:03   ` Dr. Bernd Feige
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2011-01-26 16:32 UTC (permalink / raw)
  To: Dr. Bernd Feige; +Cc: linux-kernel

On Wed, Jan 26, 2011 at 03:55:00PM +0100, Dr. Bernd Feige wrote:
> Hi,
> 
> On 2.6.37 I get the following when listing one of our Novell directories
> containing 9499 files (no subdirs; note that this works fine on
> 2.6.36.x):

Plain .37?  Not .38-rc1?

> I assumed that the changes to ncpfs in 2.6.37 caused this, but reverting
> them did not solve the problem. Turning off preemption and group
> scheduling did not help either. So I'm lost and my spare time is running
> out, thought I'd report it nonetheless.

3825bdb7ed920845961f32f364454bee5f469abb may be a suspect...


* Re: Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory
  2011-01-26 16:32 ` Al Viro
@ 2011-01-26 17:03   ` Dr. Bernd Feige
  2011-01-26 17:26     ` Dr. Bernd Feige
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. Bernd Feige @ 2011-01-26 17:03 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel

Hi Al,

thanks for the response!

> Plain .37?  Not .38-rc1?

Yes. 2.6.37-gentoo to be exact. Gentoo adds the fbcondecor patch.

> 3825bdb7ed920845961f32f364454bee5f469abb may be a suspect...

I checked that dcache.c change but 2.6.37 doesn't include that one.

Best regards,
Bernd



* Re: Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory
  2011-01-26 17:03   ` Dr. Bernd Feige
@ 2011-01-26 17:26     ` Dr. Bernd Feige
  2011-01-26 23:22       ` Christian Kujau
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. Bernd Feige @ 2011-01-26 17:26 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel

Dear Al,

my apologies. You hit the nail on the head! Of course plain 2.6.37
contains 3825bdb7ed920845961f32f364454bee5f469abb - downloaded it again,
reverted it, and no "kernel BUG" any more!

Thanks a lot!
Bernd
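
For anyone retracing this step, here is a minimal sketch of how one might
check whether a release tag contains a given commit before reverting it.
has_commit is a hypothetical helper name and the clone path in the usage
comment is an assumption; git merge-base --is-ancestor is a standard git
subcommand.

```shell
#!/bin/sh
# Sketch: report whether a commit is reachable from a given tag or branch.
# has_commit is a hypothetical helper; $1 is the path to a git clone.
has_commit() (
    repo="$1"
    commit="$2"
    ref="$3"
    # Exits 0 when $commit is an ancestor of (i.e. contained in) $ref.
    git -C "$repo" merge-base --is-ancestor "$commit" "$ref"
)

# Usage against a kernel clone (the path is an assumption):
#   has_commit ~/src/linux 3825bdb7ed920845961f32f364454bee5f469abb v2.6.37 \
#       && echo "present in v2.6.37"
#   git -C ~/src/linux revert --no-commit 3825bdb7ed920845961f32f364454bee5f469abb
```

Checking against the tag in a git tree avoids having to inspect a possibly
modified source tarball by hand.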



* Re: Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory
  2011-01-26 17:26     ` Dr. Bernd Feige
@ 2011-01-26 23:22       ` Christian Kujau
  2011-01-27  8:22         ` Dr. Bernd Feige
  0 siblings, 1 reply; 7+ messages in thread
From: Christian Kujau @ 2011-01-26 23:22 UTC (permalink / raw)
  To: Dr. Bernd Feige; +Cc: Al Viro, LKML, david, npiggin

On Wed, 26 Jan 2011 at 18:26, Dr. Bernd Feige wrote:
> my apologies. You hit the nail on the head! Of course plain 2.6.37
> contains 3825bdb7ed920845961f32f364454bee5f469abb - downloaded it again,
> reverted it, and no "kernel BUG" any more!

Hasn't there been an earlier attempt[0] to remove
3825bdb7ed920845961f32f364454bee5f469abb back in Nov 2010?

Christian.

[0] http://www.gossamer-threads.com/lists/linux/kernel/1308598
-- 
BOFH excuse #243:

The computer fleetly, mouse and all.


* Re: Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory
  2011-01-26 23:22       ` Christian Kujau
@ 2011-01-27  8:22         ` Dr. Bernd Feige
  2011-01-27 10:21           ` Dr. Bernd Feige
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. Bernd Feige @ 2011-01-27  8:22 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Al Viro, LKML, david, npiggin


> Hasn't there been an earlier 
> attempt[0] to remove 3825bdb7ed920845961f32f364454bee5f469abb back in Nov 
> 2010?
> 
> Christian.
> 
> [0] http://www.gossamer-threads.com/lists/linux/kernel/1308598

It obviously didn't make it into 2.6.37. All I can say is that the bug
is 100% reproducible here with the procedure I outlined, and that reading
large directories on ncpfs is as solid as it was in 2.6.36 once this
dcache.c change is reverted. So I'm of course glad Nick's patch made it
into the tree now and would vote for its inclusion in 2.6.37.1.
I posted this because I didn't find a similar bug report. Or did I
overlook it?

Thanks and best regards,
Bernd


* Re: Linux 2.6.37 x86 ncpfs regression: kernel BUG at include/linux/dcache.h:340 with >1366 files in directory
  2011-01-27  8:22         ` Dr. Bernd Feige
@ 2011-01-27 10:21           ` Dr. Bernd Feige
  0 siblings, 0 replies; 7+ messages in thread
From: Dr. Bernd Feige @ 2011-01-27 10:21 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Al Viro, LKML, david, npiggin

Earlier I wrote...
> So I'm of course glad Nick's patch made it
> into the tree now and would vote for its inclusion in 2.6.37.1.

I then thought it would be a good idea to check this with 2.6.38-rc2. The
result is a similar crash, but one that is no longer confined to large
ncpfs directories. It also does not coincide with the ls command: ls
completes, and the following appears seconds later:

kernel: kernel BUG at fs/dcache.c:2102!
kernel: invalid opcode: 0000 [#1] PREEMPT SMP 
kernel: last sysfs file: /sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq
kernel: Modules linked in: nls_cp437 nls_iso8859_1 ncpfs coretemp cpufreq_ondemand nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 autofs4 snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ext3 jbd mbcache ext2 dm_crypt dm_mod crypto_hash crypto_blkcipher crypto_algapi fuse usb_storage uas usbhid snd_hda_codec_analog fbcon font bitblit softcursor uhci_hcd dcdbas ehci_hcd sr_mod cdrom radeon snd_hda_intel snd_hda_codec ttm snd_pcm drm_kms_helper snd_timer drm snd i2c_algo_bit sg usbcore cfbcopyarea cfbimgblt parport_pc parport psmouse soundcore cfbfillrect snd_page_alloc i2c_i801
kernel: 
kernel: Pid: 2572, comm: bash Not tainted 2.6.38-rc2 #1 0GM819/OptiPlex 755                 
kernel: EIP: 0060:[<c109f13c>] EFLAGS: 00010246 CPU: 1
kernel: EIP is at dentry_update_name_case+0x12/0x46
kernel: EAX: 00000000 EBX: f20f1f00 ECX: ffffffff EDX: f1c59d28
kernel: ESI: f1c59d28 EDI: f1c59d28 EBP: f1c59f34 ESP: f1c59bb8
kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
kernel: Process bash (pid: 2572, ti=f1c58000 task=f283c390 task.ti=f1c58000)
kernel: Stack:
kernel: f20f1f00 f1c59d28 00000001 f1c59f34 f814a7b1 00000000 f3eb2400 f3fae4b8
kernel: 00000000 f1e0fd00 00000004 ff89b000 00000000 00000001 00000002 f1c59f90
kernel: c109d030 625f3231 6f765f66 72657672 65627261 6e757469 70702e67 f5440074
kernel: Call Trace:
kernel: [<f814a7b1>] ? ncp_fill_cache+0x19c/0x3c2 [ncpfs]
kernel: [<c109d030>] ? filldir64+0x0/0xcb
kernel: [<c1001a1c>] ? __switch_to+0x160/0x199
kernel: [<c1021101>] ? finish_task_switch+0x36/0x94
kernel: [<c1254267>] ? schedule+0x67b/0x70b
kernel: [<c1031295>] ? lock_timer_base.clone.25+0x18/0x32
kernel: [<c10316ab>] ? mod_timer+0x12d/0x13e
kernel: [<c1031cf4>] ? recalc_sigpending+0xf/0x2e
kernel: [<f8150746>] ? ncp_do_request+0x341/0x34b [ncpfs]
kernel: [<c103b208>] ? autoremove_wake_function+0x0/0x29
kernel: [<f814ab8b>] ? ncp_do_readdir+0x108/0x152 [ncpfs]
kernel: [<c109d030>] ? filldir64+0x0/0xcb
kernel: [<c106bae7>] ? get_page_from_freelist+0x309/0x373
kernel: [<c106bd58>] ? __alloc_pages_nodemask+0xd0/0x554
kernel: [<c1078e53>] ? do_wp_page+0x278/0x57c
kernel: [<c1077708>] ? page_address+0x8c/0xaa
kernel: [<f814b8ba>] ? ncp_readdir+0x557/0x55c [ncpfs]
kernel: [<c109d030>] ? filldir64+0x0/0xcb
kernel: [<c109d030>] ? filldir64+0x0/0xcb
kernel: [<c109d2c1>] ? vfs_readdir+0x56/0x7a
kernel: [<c109d437>] ? sys_getdents64+0x66/0xa7
kernel: [<c100274c>] ? sysenter_do_call+0x12/0x22
kernel: Code: 89 50 18 eb 0c 8b 48 18 8b 5a 18 89 58 18 89 4a 18 83 c4 04 5b 5e 5f 5d c3 55 57 56 53 fc 89 c3 89 d7 8b 40 20 8b 40 1c 48 75 02 <0f> 0b 8b 42 04 39 43 18 74 02 0f 0b 8d 6b 4c 89 e8 e8 66 64 1b 
kernel: EIP: [<c109f13c>] dentry_update_name_case+0x12/0x46 SS:ESP 0068:f1c59bb8
kernel: ---[ end trace 24c91ac4506eb552 ]---

As far as I can see, 2.6.38-rc1 has the same dcache.c, so I didn't try
that.

Best regards,
Bernd
