* Kernel Bug: unable to handle kernel paging request
@ 2013-07-12 5:24 Jérôme Poulin
  [not found] ` <CALJXSJquK6YxGKuH97Ec2CTMyJaZrJjOfePSKtgPDm8_9YXzzw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Jérôme Poulin @ 2013-07-12 5:24 UTC (permalink / raw)
To: linux-nilfs

In response to Vyacheslav Dubeyko, here is the problem I encounter,
I'm not sure how to reproduce it now, but before deleting
/var/cache/apt, I was able to reproduce it by issuing apt-get update.
Now, it triggers when Ubuntu launches the apt daemon. Afterward, the
whole system is frozen, not even SysRq+I would return me to the
console.

After setting the log in another partition, here is what I have in syslog:

Jul 12 01:08:43 bluetoothd[635]: Stopping discovery
Jul 12 01:10:43 dbus[622]: [system] Activating service name='org.freedesktop.PackageKit' (using servicehelper)
Jul 12 01:10:43 AptDaemon: INFO: Initializing daemon
Jul 12 01:10:43 AptDaemon.PackageKit: INFO: Initializing PackageKit compat layer
Jul 12 01:10:43 dbus[622]: [system] Successfully activated service 'org.freedesktop.PackageKit'
[ 1677.310656] BUG: unable to handle kernel paging request at 0000000000004c83
[ 1677.310683] IP: [<ffffffffa024d0f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]
[ 1677.310708] PGD 0
[ 1677.310715] Oops: 0000 [#1] SMP
[ 1677.310726] Modules linked in: pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF) vboxdrv(OF) nfsd(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F) dm_crypt(F) bbswitch(OF) zram(C) intel_powerclamp kvm_intel kvm parport_pc(F) ppdev(F) lp(F) uvcvideo parport(F) crc32_pclmul(F) ghash_clmulni_intel(F) aesni_intel(F) snd_hda_codec_realtek snd_hda_intel snd_hda_codec aes_x86_64(F) asus_wmi lrw(F) sparse_keymap gf128mul(F) snd_hwdep(F) glue_helper(F) ablk_helper(F) arc4(F) snd_pcm(F) cryptd(F) joydev(F) videobuf2_vmalloc videobuf2_memops videobuf2_core mxm_wmi iwldvm snd_page_alloc(F) bnep snd_seq_midi(F) videodev snd_seq_midi_event(F) snd_rawmidi(F) mac80211 snd_seq(F) snd_seq_device(F) btusb snd_timer(F) iwlwifi snd(F) soundcore(F) microcode(F) psmouse(F) rfcomm bluetooth serio_raw(F) cfg80211 lpc_ich mei_me wmi mei mac_hid coretemp binfmt_misc(F) nilfs2 btrfs(F) xor(F) zlib_deflate(F) raid6_pq(F) libcrc32c(F) nbd(F) i915 i2c_algo_bit drm_kms_helper drm alx mdio ahci(F) libahci(F) video(F) [last unloaded: ipmi_msghandler]
[ 1677.311066] CPU: 7 PID: 414 Comm: segctord Tainted: GF C O 3.10.0-2-generic #10-Ubuntu
[ 1677.311096] Hardware name: ASUSTeK COMPUTER INC. N56VZ/N56VZ, BIOS N56VZ.216 12/06/2012
[ 1677.311124] task: ffff88021c484650 ti: ffff88021eaa2000 task.ti: ffff88021eaa2000
[ 1677.311155] RIP: 0010:[<ffffffffa024d0f2>] [<ffffffffa024d0f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]
[ 1677.311199] RSP: 0000:ffff88021eaa3d00 EFLAGS: 00010202
[ 1677.311218] RAX: ffff880167625180 RBX: 0000000000004c83 RCX: 0000000000000034
[ 1677.311248] RDX: 000000000000000d RSI: 0000000000000000 RDI: 0000000000004c83
[ 1677.311277] RBP: ffff88021eaa3d08 R08: 7800000000000000 R09: a8001fa0bc000000
[ 1677.311305] R10: 57ffca5f4be82f00 R11: 0000000000000019 R12: ffff880213f46288
[ 1677.311328] R13: 0000000000000000 R14: ffffea0007321f80 R15: ffff880167625138
[ 1677.311353] FS: 0000000000000000(0000) GS:ffff88022efc0000(0000) knlGS:0000000000000000
[ 1677.311383] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1677.311403] CR2: 0000000000004c83 CR3: 0000000001c0e000 CR4: 00000000001407e0
[ 1677.311428] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1677.311455] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1677.311478] Stack:
[ 1677.311487] ffff880213f461e0 ffff88021eaa3e00 ffffffffa024e4c5 ffffffff81019d09
[ 1677.311520] ffff88021c484650 ffff88021c484650 ffff88021c484650 ffff880221282270
[ 1677.311541] ffff88021e691f58 ffff88021e691e00 ffff880221282260 000000031c484650
[ 1677.311562] Call Trace:
[ 1677.311575] [<ffffffffa024e4c5>] nilfs_segctor_do_construct+0xf25/0x1b20 [nilfs2]
[ 1677.311596] [<ffffffff81019d09>] ? sched_clock+0x9/0x10
[ 1677.311614] [<ffffffffa024f3ab>] nilfs_segctor_construct+0x17b/0x290 [nilfs2]
[ 1677.311636] [<ffffffffa024f5e2>] nilfs_segctor_thread+0x122/0x3b0 [nilfs2]
[ 1677.311657] [<ffffffffa024f4c0>] ? nilfs_segctor_construct+0x290/0x290 [nilfs2]
[ 1677.311677] [<ffffffff8107cae0>] kthread+0xc0/0xd0
[ 1677.311690] [<ffffffff8107ca20>] ? kthread_create_on_node+0x120/0x120
[ 1677.311709] [<ffffffff816dd16c>] ret_from_fork+0x7c/0xb0
[ 1677.311724] [<ffffffff8107ca20>] ? kthread_create_on_node+0x120/0x120
[ 1677.311740] Code: 2d ee e0 5b 5d c3 48 89 df e8 fb 25 ee e0 eb db 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 85 ff 48 89 e5 53 48 89 fb 74 29 <48> 8b 07 f6 c4 08 0f 84 9c 00 00 00 48 8b 47 30 48 8b 00 a9 00
[ 1677.311821] RIP [<ffffffffa024d0f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]
[ 1677.311841] RSP <ffff88021eaa3d00>
[ 1677.311850] CR2: 0000000000004c83
[ 1677.320046] ---[ end trace 0e7c8d51bd66cbe6 ]---
Jul 12 01:11:50 kernel: [ 1741.418989] SysRq : Emergency Sync
Jul 12 01:11:53 kernel: [ 1744.788020] SysRq : Terminate All Tasks
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply [flat|nested] 15+ messages in thread
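Editorial sketch, not part of the original message: one way to read the register dump above is to check which general-purpose registers hold the faulting address reported in CR2, and to pull out the opcode bytes at the `<..>` marker in the `Code:` line. The helper names `parse_registers` and `faulting_bytes` are hypothetical, and the excerpt is copied from the oops with the timestamps dropped. In x86-64 encoding, `48 8b 07` is `mov (%rdi),%rax`, and the preceding `48 85 ff` / `74 29` pair is a `test %rdi,%rdi` followed by `je`, which suggests (as an interpretation, not a confirmed diagnosis) that the function was handed a small non-NULL bogus pointer.

```python
import re

# Excerpt copied from the oops above (timestamps removed).
OOPS = """\
RAX: ffff880167625180 RBX: 0000000000004c83 RCX: 0000000000000034
RDX: 000000000000000d RSI: 0000000000000000 RDI: 0000000000004c83
CR2: 0000000000004c83 CR3: 0000000001c0e000 CR4: 00000000001407e0
Code: 2d ee e0 5b 5d c3 48 89 df e8 fb 25 ee e0 eb db 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 85 ff 48 89 e5 53 48 89 fb 74 29 <48> 8b 07 f6 c4 08 0f 84 9c 00 00 00 48 8b 47 30 48 8b 00 a9 00
"""


def parse_registers(text):
    """Return {name: value} for every 'NAME: <16 hex digits>' pair in text."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text, count=3):
    """Return the first `count` opcode bytes starting at the <xx> marker."""
    code = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, byte in enumerate(code) if byte.startswith("<"))
    return [byte.strip("<>") for byte in code[start:start + count]]


regs = parse_registers(OOPS)
# Registers (other than CR2 itself) that contain the faulting address.
suspects = sorted(n for n, v in regs.items() if v == regs["CR2"] and n != "CR2")

print("fault address:", hex(regs["CR2"]))
print("registers holding it:", suspects)
print("bytes at faulting instruction:", " ".join(faulting_bytes(OOPS)))
```

The same two helpers can be pointed at any other oops pasted into `OOPS`, as long as the register lines keep the `NAME: value` layout shown here.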
* Re: Kernel Bug: unable to handle kernel paging request
  [not found] ` <CALJXSJquK6YxGKuH97Ec2CTMyJaZrJjOfePSKtgPDm8_9YXzzw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-07-12 18:58 ` Jérôme Poulin
  [not found] ` <CALJXSJoW9Qpp9t42u_k4cW3gO6qzSPoeCjtQDU3tDKq6TJ=K8Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Jérôme Poulin @ 2013-07-12 18:58 UTC (permalink / raw)
To: linux-nilfs

The problem happened again today after resume, it seems to be more
frequent since last week. Here is a pastebin of the traceback +
sysrq+W.

http://pastebin.ca/2426059

On Fri, Jul 12, 2013 at 1:24 AM, Jérôme Poulin <jeromepoulin@gmail.com> wrote:
> In response to Vyacheslav Dubeyko, here is the problem I encounter,
> I'm not sure how to reproduce it now, but before deleting
> /var/cache/apt, I was able to reproduce it by issuing apt-get update.
> Now, it triggers when Ubuntu launches the apt daemon. Afterward, the
> whole system is frozen, not even SysRq+I would return me to the
> console.
* Re: Kernel Bug: unable to handle kernel paging request
  [not found] ` <CALJXSJoW9Qpp9t42u_k4cW3gO6qzSPoeCjtQDU3tDKq6TJ=K8Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-07-18 17:30 ` Vyacheslav Dubeyko
  [not found] ` <F4156394-8A25-4F81-81C3-9921CB00BD92-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Vyacheslav Dubeyko @ 2013-07-18 17:30 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: linux-nilfs

Hi Jérôme,

On Jul 12, 2013, at 10:58 PM, Jérôme Poulin wrote:

Thank you for the details. Sorry for the delay in answering; I was on vacation.

> The problem happened again today after resume, it seems to be more
> frequent since last week. Here is a pastebin of the traceback +
> sysrq+W.
>
> http://pastebin.ca/2426059
>

Unfortunately, I currently don't have access to this share.

> On Fri, Jul 12, 2013 at 1:24 AM, Jérôme Poulin <jeromepoulin@gmail.com> wrote:
>> In response to Vyacheslav Dubeyko, here is the problem I encounter,
>> I'm not sure how to reproduce it now, but before deleting
>> /var/cache/apt, I was able to reproduce it by issuing apt-get update.
>> Now, it triggers when Ubuntu launches the apt daemon. Afterward, the
>> whole system is frozen, not even SysRq+I would return me to the
>> console.
>>

So, as I see, the reproducing path is: (1) delete /var/cache/apt; (2) issue apt-get update.

Could you share additional details about the issue on your side? I mean such details:
(1) The strace output for the case of issuing the apt-get update (in the case of issue reproducing).
(2) I need more details about your NILFS2 partition. Could you share output of "nilfs-tune -l"?

Thanks,
Vyacheslav Dubeyko.
* Re: Kernel Bug: unable to handle kernel paging request
  [not found] ` <F4156394-8A25-4F81-81C3-9921CB00BD92-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
@ 2013-07-22 19:11 ` Jérôme Poulin
  [not found] ` <CALJXSJrj0J_-ZUCOurJXaYhx_wEJwxb2_5OOJjQSSmmP-PQDgg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Jérôme Poulin @ 2013-07-22 19:11 UTC (permalink / raw)
To: Vyacheslav Dubeyko; +Cc: linux-nilfs

On Thu, Jul 18, 2013 at 1:30 PM, Vyacheslav Dubeyko <slava-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> wrote:
>
> > The problem happened again today after resume, it seems to be more
> > frequent since last week. Here is a pastebin of the traceback +
> > sysrq+W.
> >
> > http://pastebin.ca/2426059
> >
>
> Unfortunately, I currently don't have access to this share.

I'm back using my laptop but wasn't able to reproduce it again, it is
not as easy as it was before to reproduce, sometimes apt-get update
works, sometimes it freezes. Here is a different paste bin as
pastebin.ca seems to be down for some time now:
http://pastebin.com/ALmuHdfh

> So, as I see, the reproducing path is: (1) delete /var/cache/apt; (2) issue apt-get update.

In fact it is the other way around;
(1) Issue apt-get update: Crash.
(2) Reboot, try again: Crash.
(3) Delete /var/lib/apt/lists/* and try again: Works for some time.

> (1) The strace output for the case of issuing the apt-get update (in the case of issue reproducing).

That will be hard to obtain except maybe if I alias my apt-get update
to strace -f -o /boot/somefile.txt apt-get update and hope the problem
occurs again. It sometimes happens when launched from the Ubuntu Store
which doesn't use bash to launch the update.

> (2) I need more details about your NILFS2 partition. Could you share output of "nilfs-tune -l"?

$ sudo nilfs-tune -l /dev/vgUbuntu/root
nilfs-tune 2.1.4
Filesystem volume name:   root
Filesystem UUID:          336f247d-c8d1-4e91-887a-258121c4face
Filesystem magic number:  0x3434
Filesystem revision #:    2.0
Filesystem features:      (none)
Filesystem state:         invalid or mounted
Filesystem OS type:       Linux
Block size:               4096
Filesystem created:       Thu Apr 11 22:34:29 2013
Last mount time:          Mon Jul 22 13:35:46 2013
Last write time:          Mon Jul 22 15:06:02 2013
Mount count:              86
Maximum mount count:      50
Reserve blocks uid:       0 (user root)
Reserve blocks gid:       0 (group root)
First inode:              11
Inode size:               128
DAT entry size:           32
Checkpoint size:          192
Segment usage size:       16
Number of segments:       25599
Device size:              214748364800
First data block:         1
# of blocks per segment:  2048
Reserved segments %:      5
Last checkpoint #:        3229012
Last block address:       17242181
Last sequence #:          136170
Free blocks count:        17827840
Commit interval:          0
# of blks to create seg:  0
CRC seed:                 0x7ab1d7ed
CRC check sum:            0xfab54710
CRC check data size:      0x00000118
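Editorial sketch, not part of the original message: the figures in the nilfs-tune listing can be cross-checked with a little arithmetic. The constants below are copied from that output; the variable names are made up for illustration, and no conclusion about the crash is drawn from the result. It only shows how segment size, device size and free space relate to each other.

```python
# Figures copied from the nilfs-tune -l output above.
BLOCK_SIZE = 4096
BLOCKS_PER_SEGMENT = 2048
NUMBER_OF_SEGMENTS = 25599
DEVICE_SIZE = 214748364800
FREE_BLOCKS = 17827840
MOUNT_COUNT = 86
MAX_MOUNT_COUNT = 50

GIB = 2 ** 30
MIB = 2 ** 20

# One segment is blocks-per-segment times the block size.
segment_bytes = BLOCK_SIZE * BLOCKS_PER_SEGMENT
# How many whole segments of that size fit on the device.
segments_that_fit = DEVICE_SIZE // segment_bytes
# Free space expressed in whole segments plus leftover blocks.
free_segments, leftover_blocks = divmod(FREE_BLOCKS, BLOCKS_PER_SEGMENT)
free_gib = FREE_BLOCKS * BLOCK_SIZE / GIB

print(f"segment size:          {segment_bytes // MIB} MiB")
print(f"device size:           {DEVICE_SIZE / GIB:.0f} GiB")
print(f"segments that fit:     {segments_that_fit} (reported: {NUMBER_OF_SEGMENTS})")
print(f"free space:            {free_gib:.2f} GiB in {free_segments} segments")
print(f"mount count over max:  {MOUNT_COUNT > MAX_MOUNT_COUNT}")
```

The reported segment count is one lower than the number of 8 MiB segments that fit on the device, and the mount count exceeds the maximum mount count; both are plain observations from the listing.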
* Re: Kernel Bug: unable to handle kernel paging request
  [not found] ` <CALJXSJrj0J_-ZUCOurJXaYhx_wEJwxb2_5OOJjQSSmmP-PQDgg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-07-23 11:15 ` Vyacheslav Dubeyko
  2013-07-26 18:13 ` Jérôme Poulin
  0 siblings, 1 reply; 15+ messages in thread
From: Vyacheslav Dubeyko @ 2013-07-23 11:15 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: linux-nilfs

On Mon, 2013-07-22 at 15:11 -0400, Jérôme Poulin wrote:

[snip]
>
> I'm back using my laptop but wasn't able to reproduce it again, it is
> not as easy as it was before to reproduce, sometimes apt-get update
> works, sometimes it freezes. Here is a different paste bin as
> pastebin.ca seems to be down for some time now:
> http://pastebin.com/ALmuHdfh
>

Thank you. I have now downloaded this output.

> > So, as I see, the reproducing path is: (1) delete /var/cache/apt; (2) issue apt-get update.
>
> In fact it is the other way around;
> (1) Issue apt-get update: Crash.
> (2) Reboot, try again: Crash.
> (3) Delete /var/lib/apt/lists/* and try again: Works for some time.
>

Ok. Thank you for additional details.

>
> > (1) The strace output for the case of issuing the apt-get update (in the case of issue reproducing).
>
> That will be hard to obtain except maybe if I alias my apt-get update
> to strace -f -o /boot/somefile.txt apt-get update and hope the problem
> occurs again. It sometimes happens when launched from the Ubuntu Store
> which doesn't use bash to launch the update.
>

Ok. I see. I'll try to analyze the issue on the basis of available
information. But it would be great to have the requested strace output.

Thanks,
Vyacheslav Dubeyko.

> > (2) I need more details about your NILFS2 partition. Could you share output of "nilfs-tune -l"?
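Editorial sketch, not part of the original message: the idea of wrapping apt-get update in strace and writing the trace to a partition other than the NILFS2 root could look roughly like the following. The function name `strace_command` and the timestamped file name are hypothetical; `/boot` is only the location suggested earlier in the thread. The `-f` and `-o` options are the ones named in the thread, and `-t` (an addition assumed here) prefixes each traced line with the time of day. Using a fresh file per run means a later run cannot overwrite the trace from a run that crashed.

```python
import shlex
import time


def strace_command(target, log_dir="/boot", now=None):
    """Build an strace argv that traces `target` into a new, timestamped file.

    log_dir should live on a filesystem other than the one being debugged,
    so the trace is still readable after the machine freezes.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    log_path = f"{log_dir}/strace-{stamp}.txt"
    # -f: follow child processes, -t: time-of-day prefix, -o: write to file
    return ["strace", "-f", "-t", "-o", log_path, *target]


cmd = strace_command(["apt-get", "update"])
print(shlex.join(cmd))
```

To actually run it, the list could be passed to `subprocess.run(cmd)` with root privileges; as noted in the thread, this would not cover updates started by the desktop software store rather than from a shell.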
* Re: Kernel Bug: unable to handle kernel paging request
  2013-07-23 11:15 ` Vyacheslav Dubeyko
@ 2013-07-26 18:13 ` Jérôme Poulin
  [not found] ` <CALJXSJrY22eGkYA76wwL4moAdsjV+_PUtvVO6tt5K16hzMh8xQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Jérôme Poulin @ 2013-07-26 18:13 UTC (permalink / raw)
To: Vyacheslav Dubeyko; +Cc: linux-nilfs

Good afternoon,

I have more information to add to the bug.

On Tue, Jul 23, 2013 at 7:15 AM, Vyacheslav Dubeyko <slava-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> wrote:
>> > So, as I see, the reproducing path is: (1) delete /var/cache/apt; (2) issue apt-get update.
>>
>> In fact it is the other way around;
>> (1) Issue apt-get update: Crash.
>> (2) Reboot, try again: Crash.
>> (3) Delete /var/lib/apt/lists/* and try again: Works for some time.
>>
>
> Ok. Thank you for additional details.

I was able to reproduce the problem consistently.
1. Make a snapshot of a checkpoint and mount it.
2. Read from the checkpoint (make a backup).
3. Issue apt-get update.

>
>>
>> > (1) The strace output for the case of issuing the apt-get update (in the case of issue reproducing).
>>
>> That will be hard to obtain except maybe if I alias my apt-get update
>> to strace -f -o /boot/somefile.txt apt-get update and hope the problem
>> occurs again. It sometimes happens when launched from the Ubuntu Store
>> which doesn't use bash to launch the update.
>>
>
> Ok. I see. I'll try to analyze the issue on the basis of available
> information. But it would be great to have the requested strace output.

Here are the log files.
1. strace from the backup was still reading, it did not stop.
2. apt-get-crash.log
4362 13:32:04 read(6, " Debugging symbols\nHomepage: htt"..., 32052) = 32052
4362 13:32:04 read(6, "tcp-wrappers/libwrap0-dev_7.6.q-"..., 32460) = 32460
4362 13:32:04 read(6, "bdevel\nInstalled-Size: 542\nMaint"..., 32710) = 32710
4362 13:32:04 read(6, "ian.org>\nArchitecture: i386\nSour"..., 32558) = 32558
4362 13:32:04 read(6, "l-Maintainer: XCB Developers <xc"..., 32613) = 32613
4362 13:32:04 read(6, "ed: 9m\n\nPackage: libxcb-xvmc0-db"..., 31979) = 31979
4362 13:32:04 read(6, "ibz-dev\nFilename: pool/main/x/xf"..., 32335) = 32335
4362 13:32:04 read(6, "n: Ubuntu\nSupported: 9m\nTask: my"..., 31903) = 31903
4362 13:32:04 read(6, "\nFilename: pool/main/libx/libxpm"..., 32467) = 32467
4362 13:32:04 read(6, "buntu/+filebug\nOrigin: Ubuntu\nSu"..., 31982) = 31982
4362 13:32:04 read(6, "chpad.net/ubuntu/+filebug\nOrigin"..., 31976) = 31976
4362 13:32:04 read(6, "buntu Developers <ubuntu-devel-d"..., 32681) = 32681
4362 13:32:04 read(6, "-utils\nDepends: debconf (>= 0.5)"..., 32468) = 32468
4362 13:32:04 read(6, "top, mythbuntu-backend-slave, my"..., 31678) = 31678
4362 13:32:04 read(6, "d5: 1b0992eebd45ca5ceadc775532a4"..., 32126) = 32126
4362 13:32:04 read(6, " all\nSource: munin\nVersion: 2.0."..., 32535) = 32535
4362 13:32:04 read(6, "fice-core | openoffice.org-hunsp"..., 32373) = 32373
4362 13:32:04 read(6, "ffice.org-dictionaries (1:3.3.0~"..., 32513) = 32513
4362 13:32:04 read(6, "hbuntu-backend-master\n\nPackage: "..., 31756) = 31756
4362 13:32:04 read(6, "2614b9\nSHA256: bcc70d6577dd06565"..., 32118) = 32118
4362 13:32:04 read(6, "17\nSHA256: d903f798a8c38fef2b992"..., 32339) = 32339
-- End of file --
3. syslog:
Jul 26 13:32:04 kernel: [ 317.525021] BUG: unable to handle kernel paging request at 00000000000033e5
Jul 26 13:32:04 kernel: [ 317.525371] IP: [<ffffffffa02930f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]
Jul 26 13:32:04 kernel: [ 317.525715] PGD 0
Jul 26 13:32:04 kernel: [ 317.526044] Oops: 0000 [#1] SMP
Jul 26 13:32:04 kernel: [ 317.526372] Modules linked in: pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF) vboxdrv(OF) nfsd(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F) dm_crypt(F) bbswitch(OF) intel_powerclamp kvm uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core crc32_pclmul(F) ghash_clmulni_intel(F) aesni_intel(F) videodev aes_x86_64(F) lrw(F) gf128mul(F) btusb glue_helper(F) ablk_helper(F) cryptd(F) snd_hda_codec_realtek snd_hda_intel snd_hda_codec arc4(F) asus_wmi snd_hwdep(F) sparse_keymap mxm_wmi snd_pcm(F) joydev(F) snd_page_alloc(F) iwldvm snd_seq_midi(F) snd_seq_midi_event(F) snd_rawmidi(F) snd_seq(F) mac80211 snd_seq_device(F) snd_timer(F) iwlwifi snd(F) soundcore(F) cfg80211 mei_me mei lpc_ich psmouse(F) wmi microcode(F) serio_raw(F) mac_hid parport_pc(F) ppdev(F) lp(F) parport(F) bnep rfcomm bluetooth binfmt_misc(F) coretemp nilfs2 btrfs(F) xor(F) zlib_deflate(F) raid6_pq(F) libcrc32c(F) nbd(F) hid_generic usbhid hid usb_storage(F) i915 i2c_algo_bit drm_kms_helper drm alx mdio a
Jul 26 13:32:04 kernel: hci(F) libahci(F) video(F) [last unloaded: ipmi_msghandler]
Jul 26 13:32:04 kernel: [ 317.528925] CPU: 4 PID: 388 Comm: segctord Tainted: GF C O 3.10.0-4-generic #13-Ubuntu
Jul 26 13:32:04 kernel: [ 317.529487] Hardware name: ASUSTeK COMPUTER INC. N56VZ/N56VZ, BIOS N56VZ.216 12/06/2012
Jul 26 13:32:04 kernel: [ 317.530046] task: ffff88021d159770 ti: ffff88022067a000 task.ti: ffff88022067a000
Jul 26 13:32:04 kernel: [ 317.530606] RIP: 0010:[<ffffffffa02930f2>] [<ffffffffa02930f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]
Jul 26 13:32:04 kernel: [ 317.531184] RSP: 0018:ffff88022067bd00 EFLAGS: 00010202
Jul 26 13:32:04 kernel: [ 317.531756] RAX: ffff8801fd7f16c8 RBX: 00000000000033e5 RCX: 0000000000000034
Jul 26 13:32:04 kernel: [ 317.532371] RDX: 000000000000000d RSI: 0000000000000000 RDI: 00000000000033e5
Jul 26 13:32:04 kernel: [ 317.532952] RBP: ffff88022067bd08 R08: 7800000000000000 R09: a80022ed3c000000
Jul 26 13:32:04 kernel: [ 317.533541] R10: 57ffc712ccbb4f00 R11: 0000000000000019 R12: ffff88021db8eeb8
Jul 26 13:32:04 kernel: [ 317.534129] R13: 0000000000000000 R14: ffffea0007a481c0 R15: ffff8801fd7f1680
Jul 26 13:32:04 kernel: [ 317.534717] FS: 0000000000000000(0000) GS:ffff88022ef00000(0000) knlGS:0000000000000000
Jul 26 13:32:04 kernel: [ 317.535318] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 26 13:32:04 kernel: [ 317.535919] CR2: 00000000000033e5 CR3: 0000000001c0e000 CR4: 00000000001407e0
Jul 26 13:32:04 kernel: [ 317.536528] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 26 13:32:04 kernel: [ 317.537176] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 26 13:32:04 kernel: [ 317.537791] Stack:
Jul 26 13:32:04 kernel: [ 317.538406] ffff88021db8ee10 ffff88022067be00 ffffffffa02944c5 ffffffff81019d09
Jul 26 13:32:04 kernel: [ 317.539057] ffff88021d159770 ffff88021d159770 ffff88021d159770 ffff88021f1bec70
Jul 26 13:32:04 kernel: [ 317.539715] ffff88021dcc1158 ffff88021dcc1000 ffff88021f1bec60 000000031d159770
Jul 26 13:32:04 kernel: [ 317.540384] Call Trace:
Jul 26 13:32:04 kernel: [ 317.541056] [<ffffffffa02944c5>] nilfs_segctor_do_construct+0xf25/0x1b20 [nilfs2]
Jul 26 13:32:04 kernel: [ 317.541744] [<ffffffff81019d09>] ? sched_clock+0x9/0x10
Jul 26 13:32:04 kernel: [ 317.542440] [<ffffffffa02953ab>] nilfs_segctor_construct+0x17b/0x290 [nilfs2]
Jul 26 13:32:04 kernel: [ 317.543145] [<ffffffffa02955e2>] nilfs_segctor_thread+0x122/0x3b0 [nilfs2]
Jul 26 13:32:04 kernel: [ 317.543840] [<ffffffffa02954c0>] ? nilfs_segctor_construct+0x290/0x290 [nilfs2]
Jul 26 13:32:04 kernel: [ 317.544534] [<ffffffff8107cca0>] kthread+0xc0/0xd0
Jul 26 13:32:04 kernel: [ 317.545225] [<ffffffff8107cbe0>] ? kthread_create_on_node+0x120/0x120
Jul 26 13:32:04 kernel: [ 317.545924] [<ffffffff816f026c>] ret_from_fork+0x7c/0xb0
Jul 26 13:32:04 kernel: [ 317.546583] [<ffffffff8107cbe0>] ? kthread_create_on_node+0x120/0x120
Jul 26 13:32:04 kernel: [ 317.547208] Code: d9 e9 e0 5b 5d c3 48 89 df e8 db d1 e9 e0 eb db 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 85 ff 48 89 e5 53 48 89 fb 74 29 <48> 8b 07 f6 c4 08 0f 84 9c 00 00 00 48 8b 47 30 48 8b 00 a9 00
Jul 26 13:32:04 kernel: [ 317.548013] RIP [<ffffffffa02930f2>] nilfs_end_page_io+0x12/0xd0 [nilfs2]
Jul 26 13:32:04 kernel: [ 317.548674] RSP <ffff88022067bd00>
Jul 26 13:32:04 kernel: [ 317.549286] CR2: 00000000000033e5
Jul 26 13:32:04 kernel: [ 317.549897] ---[ end trace ffe6496742ccfbe8 ]---
Jul 26 13:32:06 AptDaemon: INFO: Initializing daemon
Jul 26 13:32:07 AptDaemon.PackageKit: INFO: Initializing PackageKit compat layer
Jul 26 13:32:07 dbus[569]: [system] Successfully activated service 'org.freedesktop.PackageKit'
-- Reboot --
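Editorial sketch, not part of the original message: the two oopses in this thread (July 12 on 3.10.0-2 and July 26 on 3.10.0-4) can be compared with simple address arithmetic to check whether they stop at the same place. The addresses and symbol offsets below are copied from the two reports; the dictionary layout and the function name `layout` are hypothetical. This is a consistency check under the assumption that both kernels carry the same nilfs2 code layout, not a diagnosis.

```python
# Addresses copied from the two oops reports in this thread.
OOPSES = {
    "Jul 12": {"rip": 0xffffffffa024d0f2, "caller": 0xffffffffa024e4c5, "cr2": 0x4c83},
    "Jul 26": {"rip": 0xffffffffa02930f2, "caller": 0xffffffffa02944c5, "cr2": 0x33e5},
}
RIP_OFFSET = 0x12      # nilfs_end_page_io+0x12 in both reports
CALLER_OFFSET = 0xf25  # nilfs_segctor_do_construct+0xf25 in both reports


def layout(oops):
    """Return (start of faulting function, its distance from the caller's start)."""
    func_start = oops["rip"] - RIP_OFFSET
    caller_start = oops["caller"] - CALLER_OFFSET
    return func_start, func_start - caller_start


first_start, first_distance = layout(OOPSES["Jul 12"])
second_start, second_distance = layout(OOPSES["Jul 26"])
# Difference between the addresses at which the module code sits in each boot.
load_shift = second_start - first_start

print("distance between functions:", hex(first_distance), hex(second_distance))
print("module load shift:", hex(load_shift))
print("faulting addresses:", [hex(o["cr2"]) for o in OOPSES.values()])
```

If the two distances match and the load shift is a multiple of the page size, both reports point at the same instruction reached from the same call site, with only the bogus pointer value differing between the two crashes.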
[parent not found: <CALJXSJrY22eGkYA76wwL4moAdsjV+_PUtvVO6tt5K16hzMh8xQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Kernel Bug: unable to handle kernel paging request [not found] ` <CALJXSJrY22eGkYA76wwL4moAdsjV+_PUtvVO6tt5K16hzMh8xQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-07-27 17:06 ` Vyacheslav Dubeyko [not found] ` <CALJXSJqV5nYb_t6GMS0FpWyf1aRehAgpvebwgbJzMJfctf1b2A@mail.gmail.com> 0 siblings, 1 reply; 15+ messages in thread
From: Vyacheslav Dubeyko @ 2013-07-27 17:06 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: linux-nilfs

On Jul 26, 2013, at 10:13 PM, Jérôme Poulin wrote:

> Good afternoon,
>
> I have more informations to add to the bug.
>

Thank you for the additional info.

[snip]
> I was able to reproduce consistently the problem.
> 1. Make a snapshot of a checkpoint and mount it.
> 2. Read from the checkpoint (make a backup).
> 3. Issue apt-get update.
>

I tried to reproduce the issue, but without success. Could you describe the reproduction steps in more detail? Could you also share the lscp output for your NILFS2 partition?

[snip]
>
> Here are the log files.
> 1. strace from the backup was still reading, it did not stop.
> 2. apt-get-crash.log
> 4362 13:32:04 read(6, " Debugging symbols\nHomepage: htt"..., 32052) = 32052
> 4362 13:32:04 read(6, "tcp-wrappers/libwrap0-dev_7.6.q-"..., 32460) = 32460
> 4362 13:32:04 read(6, "bdevel\nInstalled-Size: 542\nMaint"..., 32710) = 32710
> 4362 13:32:04 read(6, "ian.org>\nArchitecture: i386\nSour"..., 32558) = 32558
> 4362 13:32:04 read(6, "l-Maintainer: XCB Developers <xc"..., 32613) = 32613
> 4362 13:32:04 read(6, "ed: 9m\n\nPackage: libxcb-xvmc0-db"..., 31979) = 31979
> 4362 13:32:04 read(6, "ibz-dev\nFilename: pool/main/x/xf"..., 32335) = 32335
> 4362 13:32:04 read(6, "n: Ubuntu\nSupported: 9m\nTask: my"..., 31903) = 31903
> 4362 13:32:04 read(6, "\nFilename: pool/main/libx/libxpm"..., 32467) = 32467
> 4362 13:32:04 read(6, "buntu/+filebug\nOrigin: Ubuntu\nSu"..., 31982) = 31982
> 4362 13:32:04 read(6, "chpad.net/ubuntu/+filebug\nOrigin"..., 31976) = 31976
> 4362 13:32:04 read(6, "buntu Developers <ubuntu-devel-d"..., 32681) = 32681
> 4362 13:32:04 read(6, "-utils\nDepends: debconf (>= 0.5)"..., 32468) = 32468
> 4362 13:32:04 read(6, "top, mythbuntu-backend-slave, my"..., 31678) = 31678
> 4362 13:32:04 read(6, "d5: 1b0992eebd45ca5ceadc775532a4"..., 32126) = 32126
> 4362 13:32:04 read(6, " all\nSource: munin\nVersion: 2.0."..., 32535) = 32535
> 4362 13:32:04 read(6, "fice-core | openoffice.org-hunsp"..., 32373) = 32373
> 4362 13:32:04 read(6, "ffice.org-dictionaries (1:3.3.0~"..., 32513) = 32513
> 4362 13:32:04 read(6, "hbuntu-backend-master\n\nPackage: "..., 31756) = 31756
> 4362 13:32:04 read(6, "2614b9\nSHA256: bcc70d6577dd06565"..., 32118) = 32118
> 4362 13:32:04 read(6, "17\nSHA256: d903f798a8c38fef2b992"..., 32339) = 32339
> -- End of file --
>

Could you share the full strace output? I can't see any issue or failure in the part you shared. Moreover, the reported issue occurs in the segctor thread, and segctor activity usually takes place after a write or flush operation. So I think the cause of the issue is hidden in another part of the strace output or, maybe, in the strace output of the backup operation. Could you share the strace output for the backup operation too?

Thanks,
Vyacheslav Dubeyko.
[parent not found: <CALJXSJqV5nYb_t6GMS0FpWyf1aRehAgpvebwgbJzMJfctf1b2A@mail.gmail.com>]
[parent not found: <CALJXSJqV5nYb_t6GMS0FpWyf1aRehAgpvebwgbJzMJfctf1b2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Kernel Bug: unable to handle kernel paging request [not found] ` <CALJXSJqV5nYb_t6GMS0FpWyf1aRehAgpvebwgbJzMJfctf1b2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-07-29 17:30 ` Vyacheslav Dubeyko 2013-08-09 13:15 ` Vyacheslav Dubeyko 1 sibling, 0 replies; 15+ messages in thread
From: Vyacheslav Dubeyko @ 2013-07-29 17:30 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: linux-nilfs

On Jul 27, 2013, at 10:05 PM, Jérôme Poulin wrote:

> On Sat, Jul 27, 2013 at 1:06 PM, Vyacheslav Dubeyko <slava-yeENwD64cLyIwRZHo2/mJg@public.gmane.org> wrote:
>> I tried to reproduce the issue but I haven't success in it.
>> Could you describe the reproducing path in more details?
>>
>
> Since I have a reproducible path, here is how I did:
> 1. init S to get to single user mode.
> 2. sysrq+E to make sure only my shell is running
> 3. start network-manager to get my wifi connection up
> 4. login as root and launch "screen"
> 5. cd /boot/log/nilfs which is a ext3 mount point and can log when NILFS dies.
> 6. lscp | xz -9e > lscp.txt.xz
> 7. mount my snapshot using mount -o cp=3360839,ro /dev/vgUbuntu/root /mnt/nilfs
> 8. start a screen to dump /proc/kmsg to text file since rsyslog is killed
> 9. start a screen and launch strace -f -o find-cat.log -t find
> /mnt/nilfs -type f -exec cat {} > /dev/null \;
> 10. start a screen and launch strace -f -o apt-get.log -t apt-get update
> 11. launch the last command again as it did not crash the first time
> 12. apt-get crashes
> 13. ps aux > ps-aux-crashed.log
> 13. sysrq+W
> 14. sysrq+E wait for everything to terminate
> 15. sysrq+SUSB
>

I have reproduced the issue successfully, so I can begin to investigate it. Thank you for your efforts in reproducing the issue and for the detailed information.

With the best regards,
Vyacheslav Dubeyko.
* Re: Kernel Bug: unable to handle kernel paging request [not found] ` <CALJXSJqV5nYb_t6GMS0FpWyf1aRehAgpvebwgbJzMJfctf1b2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2013-07-29 17:30 ` Vyacheslav Dubeyko @ 2013-08-09 13:15 ` Vyacheslav Dubeyko 2013-08-14 22:38 ` Ryusuke Konishi 1 sibling, 1 reply; 15+ messages in thread
From: Vyacheslav Dubeyko @ 2013-08-09 13:15 UTC (permalink / raw)
To: Ryusuke Konishi; +Cc: linux-nilfs, Jérôme Poulin

Hi Ryusuke,

I have been investigating the issue for the last two weeks, and I think it is time to share the current results and my thoughts. I would like to discuss possible causes of the issue. Maybe I am missing something and you can advise me on a better way to investigate it.

I can reproduce the issue by running a Linux kernel compilation on the rootfs and apt-get update in parallel. The issue results in the following crash:

[ 220.130662] BUG: unable to handle kernel paging request at 0000000000004612
[ 220.130666] IP: [<ffffffff812b55ae>] nilfs_end_page_io+0x3e/0x180
[ 220.130574] Call Trace:
[ 220.130587] [<ffffffff816c6b57>] dump_stack+0x19/0x1b
[ 220.130593] [<ffffffff812b5667>] nilfs_end_page_io+0xf7/0x180
[ 220.130598] [<ffffffff812ba2c4>] nilfs_segctor_do_construct+0x1984/0x2410
[ 220.130603] [<ffffffff812bb1f3>] nilfs_segctor_construct+0x1c3/0x450
[ 220.130608] [<ffffffff812bb5da>] nilfs_segctor_thread+0x15a/0x4c0
[ 220.130612] [<ffffffff816cad1f>] ? __schedule+0x3cf/0x810
[ 220.130617] [<ffffffff812bb480>] ? nilfs_segctor_construct+0x450/0x450
[ 220.130622] [<ffffffff81069760>] kthread+0xc0/0xd0
[ 220.130626] [<ffffffff810696a0>] ? flush_kthread_worker+0xb0/0xb0
[ 220.130631] [<ffffffff816d519c>] ret_from_fork+0x7c/0xb0
[ 220.130635] [<ffffffff810696a0>] ? flush_kthread_worker+0xb0/0xb0

I don't have a clear picture of the issue yet, but I do have some consistently reproducible results from the investigation.
As I can see, the issue is reproduced in the case of writing on volume many blocks of a big file (for example, 1518 blocks) with mixture in the buffer heads chain some count of another small files' blocks. Usually, the issue takes place for a buffer heads chain that contains about 1500 - 2000 blocks. I have such picture on the phase of adding of payload buffers: [ 959.803987] NILFS [nilfs_segbuf_add_payload_buffer]:167 page->index 22579166, i_ino 3, i_size 0, nblocks 1762 [ 959.803990] NILFS [nilfs_segbuf_add_payload_buffer]:168 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209838a08 [ 959.803993] NILFS [nilfs_segctor_apply_buffers]:1158 listp ffff880220345ba8, listp->prev ffff880209836a70, listp->next ffff880209839ad8 [ 959.803997] NILFS [nilfs_segctor_apply_buffers]:1159 bh->b_blocknr 22579166, bh->b_size 4096, bh->b_page ffffea000895db40 [ 959.804000] NILFS [nilfs_segctor_apply_buffers]:1160 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209838a08 [ 959.804006] NILFS [nilfs_segbuf_add_payload_buffer]:167 page->index 22579167, i_ino 3, i_size 0, nblocks 1763 [ 959.804009] NILFS [nilfs_segbuf_add_payload_buffer]:168 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff8802267aac78 [ 959.804013] NILFS [nilfs_segctor_apply_buffers]:1158 listp ffff880220345ba8, listp->prev ffff880209836a70, listp->next ffff880209836ad8 [ 959.804016] NILFS [nilfs_segctor_apply_buffers]:1159 bh->b_blocknr 22579167, bh->b_size 4096, bh->b_page ffffea00082b73c0 [ 959.804025] NILFS [nilfs_segbuf_add_payload_buffer]:167 page->index 22579168, i_ino 3, i_size 0, nblocks 1764 [ 959.804028] NILFS [nilfs_segbuf_add_payload_buffer]:168 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209839ad8 [ 959.804032] NILFS [nilfs_segctor_apply_buffers]:1158 listp ffff880220345ba8, listp->prev ffff880209836a70, listp->next ffff880209836a70 [ 959.804035] NILFS [nilfs_segctor_apply_buffers]:1159 
bh->b_blocknr 22579168, bh->b_size 4096, bh->b_page ffffea00082afc00 [ 959.804044] NILFS [nilfs_segbuf_add_payload_buffer]:167 page->index 22579169, i_ino 3, i_size 0, nblocks 1765 [ 959.804047] NILFS [nilfs_segbuf_add_payload_buffer]:168 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209836ad8 [ 959.804051] NILFS [nilfs_segctor_apply_buffers]:1158 listp ffff880220345ba8, listp->prev ffff880220345ba8, listp->next ffff880220345ba8 [ 959.804054] NILFS [nilfs_segctor_apply_buffers]:1159 bh->b_blocknr 22579169, bh->b_size 4096, bh->b_page ffffea00082a9b40 [ 959.804058] NILFS [nilfs_segctor_apply_buffers]:1160 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209836ad8 [ 959.804092] NILFS [nilfs_segbuf_add_payload_buffer]:167 page->index 22583013, i_ino 0, i_size 242770509824, nblocks 1766 [ 959.804096] NILFS [nilfs_segbuf_add_payload_buffer]:168 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209836a70 It is possible to see that: (1) It was added 1766 blocks in list. (2) The last blocks are blocks of inode (ino = 3): #1762, #1763, #1764, #1765. (3) The last buffer head has next pointer ffff8802247e3af8 that is pointed on first buffer head in list (as I understand). 
But on the stage of complete write we have such picture: [ 959.848722] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1 [ 959.848735] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 21394345, bh->b_size 4096, bh->b_page ffffea00076ffd80 [ 959.848739] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff88021de434c0, bh->b_assoc_buffers.prev ffff8802247e3828 [ 959.848744] NILFS [nilfs_segctor_complete_write]:2227 page->index 12, i_ino 1005398, i_size 77824 [ 959.848752] NILFS [nilfs_segctor_complete_write]:2224 bh_count 2 [ 959.848756] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 21394887, bh->b_size 4096, bh->b_page ffffea00078db900 [ 959.848759] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff88021de10048, bh->b_assoc_buffers.prev ffff88021de42048 [ 959.848763] NILFS [nilfs_segctor_complete_write]:2227 page->index 13, i_ino 1005398, i_size 77824 [ 959.848771] NILFS [nilfs_segctor_complete_write]:2224 bh_count 3 [ 959.848774] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 50231152, bh->b_size 4096, bh->b_page ffffea000889ae80 [ 959.848778] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff880182abab40, bh->b_assoc_buffers.prev ffff88021de434c0 [ 959.848782] NILFS [nilfs_segctor_complete_write]:2227 page->index 50231152, i_ino 1005398, i_size 77824 [............................................................................................................................................] 
[ 959.874242] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1761
[ 959.874245] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22583012, bh->b_size 4096, bh->b_page ffffea00082a9b40
[ 959.874249] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff880182b97fb8, bh->b_assoc_buffers.prev ffff880209836ad8
[ 959.874252] NILFS [nilfs_segctor_complete_write]:2227 page->index 22583012, i_ino 3, i_size 0
[ 959.874255] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1762
[ 959.874259] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22583013, bh->b_size 4096, bh->b_page ffffea0005fe3080
[ 959.874262] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209836a70
[ 959.874266] NILFS [nilfs_segctor_complete_write]:2227 page->index 22583013, i_ino 0, i_size 242770509824
[ 959.874270] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1763
[ 959.874274] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22581248, bh->b_size 22583295, bh->b_page 0000000000002b13
[ 959.874277] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff880182abab40, bh->b_assoc_buffers.prev ffff880182b97fb8

We can see that the buffer head {page->index 22583013, i_ino 0, i_size 242770509824, nblocks 1766} has index #1762 in the complete write phase, and that it is the next item in the list (bh_count 1763) that raises the crash because of an illegal page address {bh->b_page 0000000000002b13}. The whole content of that next item is very strange, so I think it is not the list's memory. Even stranger, bh->b_assoc_buffers.prev ffff880182b97fb8 of this corrupted item holds an address that points to the previous good item (which was the last item in the list). As far as I can see, item #1762 {page->index 22583013, i_ino 0, i_size 242770509824} has unchanged next and prev pointers {bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209836a70}. So I suspect that the cause of the issue lies somewhere between the add payload buffer phase and the complete write phase. At the moment, though, I don't have a clear understanding of the whole picture or of the cause.

I think it makes sense to simplify the reproduction environment in order to investigate the issue more deeply. But maybe you can suggest something else.

Do you have any ideas about the cause of the issue? Could you share your view of its possible causes? In any case, I will continue the investigation, although I haven't found the cause yet.

With the best regards,
Vyacheslav Dubeyko.
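The debug output in this message compares the next and prev pointers of each buffer head between the two phases. One way to automate that kind of comparison is to check the linkage invariant of a circular doubly linked list directly: for every node, node->next->prev must point back to the node. The following is a minimal userspace sketch in C and is not NILFS2 code; struct list_head here is a local stand-in for the kernel type, and demo_check_list() and its limit parameter are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's struct list_head (assumption: userspace demo). */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Walk a circular list starting at its head and verify that every link is
 * mirrored: pos->next->prev must equal pos. Returns the number of entries
 * (the head itself is not counted), or -1 if a link is broken or more than
 * `limit` entries are seen (which guards against a cycle that skips the head).
 * Note: a wild non-NULL pointer would still fault here, as it would in the kernel.
 */
static long demo_check_list(const struct list_head *head, long limit)
{
    const struct list_head *pos = head;
    long entries = 0;

    assert(head != NULL);
    do {
        if (pos->next == NULL || pos->next->prev != pos)
            return -1;
        pos = pos->next;
        if (pos != head && ++entries > limit)
            return -1;
    } while (pos != head);

    return entries;
}
```

Calling such a check once after the buffers are added and once before the completion walk would show in which phase the linkage stops being consistent, under the assumption that the corruption is visible in the list pointers at all.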
* Re: Kernel Bug: unable to handle kernel paging request 2013-08-09 13:15 ` Vyacheslav Dubeyko @ 2013-08-14 22:38 ` Ryusuke Konishi [not found] ` <20130815.073806.260411879.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 0 siblings, 1 reply; 15+ messages in thread From: Ryusuke Konishi @ 2013-08-14 22:38 UTC (permalink / raw) To: Vyacheslav Dubeyko Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA, jeromepoulin-Re5JQEeQqe8AvxtiuMwx3w Hi Vyacheslav, On Fri, 09 Aug 2013 17:15:25 +0400, Vyacheslav Dubeyko wrote: > Hi Ryusuke, > > I am investigating the issue during last two weeks and I think that it > is time to share current results and my considerations. I feel necessity > to discuss possible reasons of the issue. Maybe, I miss something and it > needs to advise me a proper way of the issue investigation. > > Actually, I can reproduce the issue by means of way of starting on > rootfs compilation task of Linux kernel and apt-get update task in > parallel. The issue results in such crash: <snip> > [ 959.874242] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1761 > [ 959.874245] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22583012, bh->b_size 4096, bh->b_page ffffea00082a9b40 > [ 959.874249] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff880182b97fb8, bh->b_assoc_buffers.prev ffff880209836ad8 > [ 959.874252] NILFS [nilfs_segctor_complete_write]:2227 page->index 22583012, i_ino 3, i_size 0 > [ 959.874255] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1762 > [ 959.874259] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22583013, bh->b_size 4096, bh->b_page ffffea0005fe3080 > [ 959.874262] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209836a70 > [ 959.874266] NILFS [nilfs_segctor_complete_write]:2227 page->index 22583013, i_ino 0, i_size 242770509824 This block (physical block number = #22583013) looks to be a super root block, so the strange i_ino 
and i_size are, maybe, correct. > [ 959.874270] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1763 > [ 959.874274] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22581248, bh->b_size 22583295, bh->b_page 0000000000002b13 > [ 959.874277] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff880182abab40, bh->b_assoc_buffers.prev ffff880182b97fb8 This looks a list head structure on the head at &segbuf->sb_payload_buffers. So, maybe the strange b_blocknr, b_size, b_page, are correct. How did you judge the end condition of this loop? Is this buffer head actually causing the oops at nilfs_end_page_io() ? Regards, Ryusuke Konishi > It is possible to see that buffer head {page->index 22583013, i_ino 0, > i_size 242770509824, nblocks 1766} has #1762 index on complete write > phase and namely next item in the list to raise crash because of illegal > page address {bh->b_page 0000000000002b13}. But all content of next item > is very strange. So, I think that it is not list's memory. But it is > more strange that bh->b_assoc_buffers.prev ffff880182b97fb8 of this > corrupted item has address that points on previous good item (this item > was last in the list). As I can see, item #1762 {page->index 22583013, > i_ino 0, i_size 242770509824} has unchanged next and prev pointers > {bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev > ffff880209836a70}. So, I suspect that we have the reason of the issue > somewhere between add payload buffer and complete write phase. But, > currently, I haven't clear understanding of the whole picture and the > reason of the issue. > > I think that it makes sense to try to simplify the issue environment > with the purpose to investigate the issue more deeply. But, maybe, you > can advise something yet. > > Do you have any ideas about the reason of the issue? Could you share > your vision of possible reason of the issue? Anyway, I continue > investigation of the issue. 
But, unfortunately, I don't catch the issue > reason yet. > > With the best regards, > Vyacheslav Dubeyko.
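Ryusuke's question about the end condition of the loop can be illustrated in userspace. In a kernel-style circular list the head is a bare struct list_head with no enclosing object, so a walk has to stop when the cursor returns to the head; converting the head itself to an entry would read whatever memory happens to surround it, which would be consistent with the odd b_blocknr, b_size and b_page values in the last log entry. A minimal sketch follows, assuming simplified local stand-ins: struct demo_bh, demo_entry() and the other demo_* helpers are hypothetical and are not the NILFS2 implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's struct list_head (assumption: userspace demo). */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical miniature of a buffer head: a payload field plus list linkage. */
struct demo_bh {
    unsigned long blocknr;
    struct list_head assoc;   /* plays the role of b_assoc_buffers */
};

/* Recover the enclosing demo_bh from a pointer to its embedded list node. */
#define demo_entry(ptr) \
    ((struct demo_bh *)((char *)(ptr) - offsetof(struct demo_bh, assoc)))

static void demo_list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Nonzero if pos is the sentinel head, which must never be passed to demo_entry(). */
static int demo_is_head(const struct list_head *pos, const struct list_head *head)
{
    return pos == head;
}

/* Correct walk: stop as soon as the cursor comes back to the sentinel head. */
static size_t demo_count_entries(const struct list_head *head)
{
    const struct list_head *pos;
    size_t n = 0;

    assert(head != NULL);
    for (pos = head->next; !demo_is_head(pos, head); pos = pos->next)
        n++;
    return n;
}
```

In this sketch the last real entry's next pointer equals the head, so a debug loop that prints "the next item" after the last entry without checking demo_is_head() would be printing the head's surroundings rather than a buffer head.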
[parent not found: <20130815.073806.260411879.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: Kernel Bug: unable to handle kernel paging request [not found] ` <20130815.073806.260411879.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2013-08-16 4:49 ` Ryusuke Konishi [not found] ` <20130816.134934.27810145.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 0 siblings, 1 reply; 15+ messages in thread
From: Ryusuke Konishi @ 2013-08-16 4:49 UTC (permalink / raw)
To: Vyacheslav Dubeyko
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA, jeromepoulin-Re5JQEeQqe8AvxtiuMwx3w

Hi Vyacheslav,

I haven't yet succeeded in reproducing this issue, even with the apt-get update operation.

How long did it take to reproduce this issue in your environment?

According to the reported logs, the crash seems to occur at the following BUG_ON(), which is inlined in the nilfs_end_page_io() function:

    #define page_buffers(page)                              \
        ({                                                  \
            BUG_ON(!PagePrivate(page));                     \
            ((struct buffer_head *)page_private(page));     \
        })

However, it's hard to narrow down the cause without reproducing the issue. The page private flag is used to indicate that the given page has buffer heads. So this issue seems to be caused either by an invalid page being passed to nilfs_end_page_io() or by try_to_free_buffers() freeing the buffer head for some reason.

The latter situation can occur if the following buffer_busy() function unexpectedly failed (returned zero) for the buffer head:

    static inline int buffer_busy(struct buffer_head *bh)
    {
        return atomic_read(&bh->b_count) |
            (bh->b_state & ((1 << BH_Dirty) | (1 << BH_Lock)));
    }

Since BH_Dirty is dropped in the nilfs_segctor_complete_write() function, I suspect that bh->b_count mistakenly reached zero.

Anyhow, further debugging seems hard without reproducing the issue.

Regards,
Ryusuke Konishi

On Thu, 15 Aug 2013 07:38:06 +0900 (JST), Ryusuke Konishi wrote:
> Hi Vyacheslav,
> On Fri, 09 Aug 2013 17:15:25 +0400, Vyacheslav Dubeyko wrote:
>> Hi Ryusuke,
>>
>> I am investigating the issue during last two weeks and I think that it
>> is time to share current results and my considerations.
I feel necessity >> to discuss possible reasons of the issue. Maybe, I miss something and it >> needs to advise me a proper way of the issue investigation. >> >> Actually, I can reproduce the issue by means of way of starting on >> rootfs compilation task of Linux kernel and apt-get update task in >> parallel. The issue results in such crash: > <snip> > >> [ 959.874242] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1761 >> [ 959.874245] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22583012, bh->b_size 4096, bh->b_page ffffea00082a9b40 >> [ 959.874249] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff880182b97fb8, bh->b_assoc_buffers.prev ffff880209836ad8 >> [ 959.874252] NILFS [nilfs_segctor_complete_write]:2227 page->index 22583012, i_ino 3, i_size 0 > >> [ 959.874255] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1762 >> [ 959.874259] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22583013, bh->b_size 4096, bh->b_page ffffea0005fe3080 >> [ 959.874262] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev ffff880209836a70 >> [ 959.874266] NILFS [nilfs_segctor_complete_write]:2227 page->index 22583013, i_ino 0, i_size 242770509824 > > This block (physical block number = #22583013) looks to be a super > root block, so the strange i_ino and i_size are, maybe, correct. > >> [ 959.874270] NILFS [nilfs_segctor_complete_write]:2224 bh_count 1763 >> [ 959.874274] NILFS [nilfs_segctor_complete_write]:2225 bh->b_blocknr 22581248, bh->b_size 22583295, bh->b_page 0000000000002b13 >> [ 959.874277] NILFS [nilfs_segctor_complete_write]:2226 bh->b_assoc_buffers.next ffff880182abab40, bh->b_assoc_buffers.prev ffff880182b97fb8 > > This looks a list head structure on the head at &segbuf->sb_payload_buffers. > So, maybe the strange b_blocknr, b_size, b_page, are correct. > > How did you judge the end condition of this loop? 
> > Is this buffer head actually causing the oops at nilfs_end_page_io() ? > > > Regards, > Ryusuke Konishi > > >> It is possible to see that buffer head {page->index 22583013, i_ino 0, >> i_size 242770509824, nblocks 1766} has #1762 index on complete write >> phase and namely next item in the list to raise crash because of illegal >> page address {bh->b_page 0000000000002b13}. But all content of next item >> is very strange. So, I think that it is not list's memory. But it is >> more strange that bh->b_assoc_buffers.prev ffff880182b97fb8 of this >> corrupted item has address that points on previous good item (this item >> was last in the list). As I can see, item #1762 {page->index 22583013, >> i_ino 0, i_size 242770509824} has unchanged next and prev pointers >> {bh->b_assoc_buffers.next ffff8802247e3af8, bh->b_assoc_buffers.prev >> ffff880209836a70}. So, I suspect that we have the reason of the issue >> somewhere between add payload buffer and complete write phase. But, >> currently, I haven't clear understanding of the whole picture and the >> reason of the issue. >> >> I think that it makes sense to try to simplify the issue environment >> with the purpose to investigate the issue more deeply. But, maybe, you >> can advise something yet. >> >> Do you have any ideas about the reason of the issue? Could you share >> your vision of possible reason of the issue? Anyway, I continue >> investigation of the issue. But, unfortunately, I don't catch the issue >> reason yet. >> >> With the best regards, >> Vyacheslav Dubeyko. 
[parent not found: <20130816.134934.27810145.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: Kernel Bug: unable to handle kernel paging request [not found] ` <20130816.134934.27810145.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2013-08-16 7:03 ` Vyacheslav Dubeyko 2013-08-29 19:10 ` Vyacheslav Dubeyko 0 siblings, 1 reply; 15+ messages in thread
From: Vyacheslav Dubeyko @ 2013-08-16 7:03 UTC (permalink / raw)
To: Ryusuke Konishi
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA, jeromepoulin-Re5JQEeQqe8AvxtiuMwx3w

Hi Ryusuke,

On Fri, 2013-08-16 at 13:49 +0900, Ryusuke Konishi wrote:
> Hi Vyachelav,
>
> I haven't yet succeeded to reproduce this issue even with apt-get
> update operation.
>
> How long did it take to reproduce this issue in your environment ?
>

I can reproduce the issue reliably in my environment, but sometimes I need to repeat the reproduction steps several times before hitting it. Usually the issue occurs during the "Reading package lists..." phase, but it is hard to predict at what percentage of the operation's progress it will occur.

My kernel version is: Linux 3.10.0-rc5+ #45 SMP Thu Aug 8 17:20:43 MSK 2013 x86_64 x86_64 x86_64 GNU/Linux. The distro is Ubuntu 12.04.2 LTS (GNU/Linux 3.10.0-rc5+ x86_64).

I simply open four terminal windows in parallel with root permissions:
(1) "tail -n 30 -f /var/log/syslog" output;
(2) "top" output;
(3) kernel compilation;
(4) apt-get update.

> According to reported logs, the crash seems to occur at the following
> BUG_ON() which is inlined in nilfs_end_page_io() function:
>
> #define page_buffers(page) \
> ({ \
> BUG_ON(!PagePrivate(page)); \
> ((struct buffer_head *)page_private(page)); \
> })
>
> However, it's hard to narrow down the cause without reproducing the
> issue. The page private flag is used to indicate that the given page
> has buffer heads. So, this issue seems to be caused by that an
> invalid page was passed to nilfs_end_page_io() or
> try_to_free_buffers() freed the buffer head by some reason.
>
> The latter situation can occur if the following buffer_busy() function
> unexpectedly failed for the buffer head:
>
> static inline int buffer_busy(struct buffer_head *bh)
> {
> return atomic_read(&bh->b_count) |
> (bh->b_state & ((1 << BH_Dirty) | (1 << BH_Lock)));
> }
>
> Since BH_Dirty is dropped in nilfs_segctor_complete_write() function,
> I suspect the situation that bh->b_count mistakenly reached zero.
>
> Anyhow, further debug seems hard without reproducing the issue.
>

Yes, I see. I will take your thoughts about the possible cause of the issue into account. Thank you.

Unfortunately, I have had no opportunity to investigate the issue this week. I think I can check your suspicion today. In any case, I will continue the investigation next week.

Sorry that I didn't answer your previous e-mail. I was busy.

With the best regards,
Vyacheslav Dubeyko.
* Re: Kernel Bug: unable to handle kernel paging request 2013-08-16 7:03 ` Vyacheslav Dubeyko @ 2013-08-29 19:10 ` Vyacheslav Dubeyko [not found] ` <72C60256-983E-43D0-9DA1-D4A446B578BB-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 0 siblings, 1 reply; 15+ messages in thread
From: Vyacheslav Dubeyko @ 2013-08-29 19:10 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: Ryusuke Konishi, linux-nilfs

Hi Jérôme,

I need to check some of my suspicions about the issue independently, so I need some additional details.

Did you have any ext4/ext3 partitions mounted in the background when you reproduced the issue?

Could you check whether you can reproduce the issue without any ext4/ext3 partitions mounted?

Could you also check whether you can reproduce the issue with kernel version 3.2 or earlier?

Thanks,
Vyacheslav Dubeyko.
[parent not found: <72C60256-983E-43D0-9DA1-D4A446B578BB-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: Kernel Bug: unable to handle kernel paging request [not found] ` <72C60256-983E-43D0-9DA1-D4A446B578BB-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> @ 2013-08-29 23:37 ` Jérôme Poulin [not found] ` <CALJXSJpbHN2SQWz0e2gC_hrRKG8EcnV2bWf068GWsuoa8AX5Dw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 15+ messages in thread
From: Jérôme Poulin @ 2013-08-29 23:37 UTC (permalink / raw)
To: Vyacheslav Dubeyko; +Cc: Ryusuke Konishi, linux-nilfs

On Thu, Aug 29, 2013 at 3:10 PM, Vyacheslav Dubeyko <slava-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> wrote:
> I need to check independently some my suspicions about the issue. So, I need in additional details.
>
> Did you have any mounted ext4/ext3 partitions in the background of the reproduced issue?

I had /boot mounted as ext3 all that time, but it was completely unused until we started diagnosing the logs.

>
> Could you check that you can reproduce the issue in the case of absence any mounted
> ext4/ext3 partitions?
>

Would you like me to try with /boot unmounted?

> Could you check also that you can reproduce the issue for 3.2 or earlier kernel version?

That would be harder to test, but it is possible.

I have a bigger issue though. Right now I'm running into the problem that the cleaner won't work anymore and the partition is full. I migrated to ext4 until I decide on making a new nilfs2 partition. I'm not sure I'll be able to reproduce the problem on a full FS, though I could resize it a bit.

Link to this problem:
http://permalink.gmane.org/gmane.comp.file-systems.nilfs.user/3072
* Re: Kernel Bug: unable to handle kernel paging request
From: Vyacheslav Dubeyko @ 2013-08-30 5:43 UTC
To: Jérôme Poulin; +Cc: Ryusuke Konishi, linux-nilfs

On Thu, 2013-08-29 at 19:37 -0400, Jérôme Poulin wrote:
> On Thu, Aug 29, 2013 at 3:10 PM, Vyacheslav Dubeyko <slava-yeENwD64cLyIwRZHo2/mJg@public.gmane.org> wrote:
> > I need to check independently some my suspicions about the issue. So, I need in additional details.
> >
> > Did you have any mounted ext4/ext3 partitions in the background of the reproduced issue?
>
> I had /boot as ext3 mounted all that time but completely unused until
> we started diagnosing the logs.
>

Yes, I also had an ext4 partition mounted in the background when I hit the issue.

> >
> > Could you check that you can reproduce the issue in the case of absence any mounted
> > ext4/ext3 partitions?
> >
>
> Would you like me to try with /boot umounted?
>

Yes, it needs to be checked without any ext3/ext4 partitions mounted in the background. I suspect there is some strange interaction between jbd (the ext3/ext4 journaling daemon) and segctor at the block layer. I may be wrong, but I can't reproduce the issue on an earlier kernel version (3.2, for example). That kernel version lacks some of the jbd/ext4/ext3 commits. So I need to check my assumption independently, because I may be misunderstanding something. I have already checked many assumptions about the issue, but I haven't found the cause yet. I hope I now have some real hints about it.

> > Could you check also that you can reproduce the issue for 3.2 or earlier kernel version?
>
> That would be harder to test but possible.
>
>
> I have a bigger issue though. Right now I'm running in the problem
> that the cleaner won't work anymore and partition is full. I migrated
> to ext4 until I decide making a new nilfs2 partition, I'm not sure
> I'll be able to reproduce the problem on a full FS, I could resize it
> a bit though.
>
> Link to this problem:
> http://permalink.gmane.org/gmane.comp.file-systems.nilfs.user/3072

Yes, that is bad. It may be a separate issue. But if you can try to reproduce the problem and confirm (or refute) my assumption, that would be great.

Thanks,
Vyacheslav Dubeyko.
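For anyone repeating the test requested above, here is a minimal sketch of how one might confirm that no ext3/ext4 filesystem is mounted before trying to reproduce the oops. It is not part of the original thread. It assumes a Linux system where /proc/mounts uses the usual "device mount-point fstype options ..." layout; the helper names `ext_mounts` and `report` are hypothetical and chosen only for illustration.

```python
import os
import platform


def ext_mounts(mounts_text):
    """Return (device, mount_point, fstype) for each ext3/ext4 entry.

    mounts_text is the content of /proc/mounts: one mount per line,
    whitespace-separated, with the filesystem type in the third field.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fstype = fields[:3]
        if fstype in ("ext3", "ext4"):
            found.append((device, mount_point, fstype))
    return found


def report(path="/proc/mounts"):
    """Print the running kernel release and any mounted ext3/ext4 filesystems."""
    print("kernel release:", platform.release())
    if not os.path.exists(path):
        print(path, "not available on this system")
        return
    with open(path) as f:
        mounts = ext_mounts(f.read())
    if not mounts:
        print("no ext3/ext4 filesystems mounted")
    for device, mount_point, fstype in mounts:
        print("still mounted: %s on %s (%s)" % (device, mount_point, fstype))


if __name__ == "__main__":
    report()
```

Running the script before triggering the workload shows whether something like /boot is still mounted as ext3, and the printed kernel release helps when comparing a 3.10 run against a 3.2 run.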